Open Eeems opened 3 years ago
This will not only speed up backups, it will also make them much smaller, as the tar files we currently back up don't seem to deduplicate very well. If I back up the same containers twice, the second backup takes up a similar amount of space instead of just the changes to the files on the VM. Since I'm paying for backup storage space, this is less than ideal. Taking half the day to run through all my backups isn't great either.
Upon looking into vzdump, it might require some helper perl scripts to replace the vzdump calls that vzborg is already doing, and instead use them to prepare the container filesystem for backup: snapshot, mount, copy in extra configuration files used by vzrestore. Then run the borg create call against the mounted filesystem, before finally doing cleanup.
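The steps described above could look roughly like the following bash sketch. It is not what vzborg or vzdump actually does. It assumes `pct mount` exposes the container's current filesystem at /var/lib/lxc/<vmid>/rootfs and that vzrestore looks for the configuration at etc/vzdump/pct.conf inside the backup. The snapshot step is left out, because how a snapshot gets mounted depends on the storage backend. The function names `run` and `fs_backup_container` are made up for illustration. By default the sketch only prints the commands (DRY_RUN=1).

```shell
#!/usr/bin/env bash
# Sketch only: mount -> copy config -> borg create -> cleanup.
# With DRY_RUN=1 (the default) each command is printed instead of executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# fs_backup_container VMID REPO ARCHIVE (hypothetical helper)
fs_backup_container() {
    local vmid="$1" repo="$2" archive="$3"
    local rootfs="/var/lib/lxc/${vmid}/rootfs"   # assumed pct mount target
    local conf="/etc/pve/lxc/${vmid}.conf"       # container configuration
    local status=0

    run pct mount "$vmid" || return 1

    # Copy the config into the mounted tree (assumed location for vzrestore),
    # then back up the tree. Stored paths keep the absolute rootfs prefix.
    run mkdir -p "${rootfs}/etc/vzdump" &&
    run cp "$conf" "${rootfs}/etc/vzdump/pct.conf" &&
    run borg create --stats "${repo}::${archive}" "$rootfs" || status=$?

    # Cleanup runs whether or not the backup succeeded.
    run rm -f "${rootfs}/etc/vzdump/pct.conf"
    run pct unmount "$vmid"
    return "$status"
}
```

Calling `fs_backup_container 101 /backup/repo ct-101` in dry-run mode prints the five commands in order, which makes it easy to review the plan before setting DRY_RUN=0.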
Hi Eeems. Thanks for your interest and contributions. I really like the idea of backing up the container filesystem instead of the tar generated by vzdump. That would surely be better in many ways and open up new possibilities. I don't know any Perl, but let's keep this issue open for now. Maybe somebody can contribute this enhancement.
Since I'm very interested in how much this will improve the performance of LXC backups, I'll try to take some time to put together some helper scripts that can be used for this. My one concern is that these scripts may need to be updated as Proxmox updates, since the interface they rely on isn't as stable. I think the performance is likely worth it though.
Maybe this can initially be implemented as an optional way of backing up containers, for those who really need it. A new optional parameter would be necessary for this, and VzBorg should be able to identify this kind of backup for restore and listing, maybe by using a different extension (neither .tar nor .vma). Tell me when you are ready, and I will create a special branch for this feature.
Ideally the resulting backup would be indiscernible from the existing method so a normal restore would just work.
Currently, for containers, vzborg generates a borg archive that contains only one file: the tar file generated by vzdump. If we feed borg the container's full filesystem, the borg archive will contain that full filesystem and not just one file. We need to differentiate these two kinds of backups to know what to do when restoring. If the container has only one mount point (the root) and no ACLs, the only change needed in the restore code could be to use borg export-tar instead of borg extract. Maybe we can begin by supporting this use case, which for me would cover almost all of my containers. What do you think?
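To illustrate the restore-side difference, here is a small bash sketch that picks the borg command based on the backup type. The function name `restore_command` and its arguments are hypothetical, and `rfs` is the extension the thread settles on further down. It is not vzborg's actual restore logic. It only prints the command it would run.

```shell
#!/usr/bin/env bash
# Sketch only: choose how to get a restorable file out of a borg archive.
# restore_command REPO ARCHIVE EXT OUTFILE (hypothetical helper)
restore_command() {
    local repo="$1" archive="$2" ext="$3" out="$4"
    case "$ext" in
        rfs)
            # Archive holds a whole filesystem: let borg build the tar.
            printf 'borg export-tar %s::%s %s\n' "$repo" "$archive" "$out"
            ;;
        tar|vma)
            # Archive holds the single vzdump file: extract it as-is.
            printf 'borg extract %s::%s\n' "$repo" "$archive"
            ;;
        *)
            echo "unknown backup type: $ext" >&2
            return 1
            ;;
    esac
}
```

For example, `restore_command /backup/repo ct-101 rfs /tmp/ct-101.tar` prints a `borg export-tar` call, while the same call with `tar` prints a `borg extract` call.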
I think that's a good idea.
Ok. In the next few days I will create a new branch for this and make the first changes. I'll let you know.
Hi Nathaniel. There is a new branch "filesystem-based-containers-backup" for working on this enhancement. I have made some changes on it to detect whether a container fits our use case (no ACL and no extra mount points). If it does, I set the variable vm_ext to 'rfs' (short for root file system). Around line 226, there is now some logic to handle this new backup type. I have also made a small change so that this kind of backup restores properly. I do not know if I can help you with your scripts, but I can certainly test. Let me know whatever you need.
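The thread does not show how the branch performs this check, so the following bash sketch is only one possible way to do it. It assumes that extra mount points appear as `mpN:` lines in the container config (/etc/pve/lxc/<vmid>.conf) and that ACLs are signalled by `acl=1` on the `rootfs:` line. The function name `backup_ext_for_config` is hypothetical. Snapshot sections (starting with `[`) are ignored.

```shell
#!/usr/bin/env bash
# Sketch only: print "rfs" if the config describes a container with a single
# root mount point and no acl=1 option, otherwise print "tar".
# backup_ext_for_config CONFIG_FILE (hypothetical helper)
backup_ext_for_config() {
    awk '
        /^\[/ { exit }                                  # stop at the first snapshot section
        /^mp[0-9]+:/ { fallback = 1 }                   # extra mount point
        /^rootfs:/ && /,acl=1(,|[ \t]|$)/ { fallback = 1 }  # ACLs enabled (assumed marker)
        END { print (fallback ? "tar" : "rfs") }
    ' "$1"
}
```

The result could then be assigned to vm_ext, for example `vm_ext="$(backup_ext_for_config /etc/pve/lxc/101.conf)"`.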
Interested in this, too, but cannot contribute much. I would be happy to donate for this feature, though.
Thanks Alexander for your interest. Just stay in touch for now.
The functionality is now available in the branch filesystem-based-containers-backup.
Please help with testing and bug reporting.
To test, back up your existing /usr/local/bin/vzborg and replace it with the one in the branch. Do not forget to make it executable if needed.
You will have to use vzborg backup with the --fs-backup yes option. Example:
vzborg backup -i 101 --fs-backup yes
vzborg generates this kind of backup with the .rfs extension.
Please also test vzborg restore and vzborg getdump with them.
I am attaching some personal statistics comparing vzdump, standard vzborg, and vzborg with the --fs-backup yes option. They show better deduplication but longer backup times. Please share your results if possible. vzborgStats01.pdf
Longer backup times? It should be much faster if done right. When running borg directly against a filesystem, I'm able to get a backup down to seconds if not much has changed. It looks like you are first dumping the files to a folder and then backing them up, instead of just mounting the container filesystem and working against it directly. This is what vzdump itself does:
PVE::Storage::activate_volumes($storage_cfg, $volids, 'vzdump');
foreach my $disk (@$disks) {
    $disk->{dir} = "${rootdir}$disk->{mp}";
    PVE::LXC::mountpoint_mount($disk, $rootdir, $storage_cfg, 'vzdump', $task->{rootuid}, $task->{rootgid});
}
I haven't had time to work on getting those perl scripts to you due to the heatwave that we've had in western Canada. My spare time has been spent trying to stay out of the heat.
Yes, you are right. That is what I am doing. Something very simple and not efficient for sure. Consider this a first approach, just to test the concept.
It would be great if this somehow handled backing up LXC containers based on the actual file contents, similar to https://github.com/michabbs/proxborg#how-to-backup-lxc-container-the-better-way. This would speed up those backups, as borg can inspect the individual files instead of backing up a tar file that has to be processed in its entirety.