Avoid comparing whole volume when receiving with --sparse - Githubissues

tasket / wyng-backup

Fast backups for logical volumes & disk images

GNU General Public License v3.0


Avoid comparing whole volume when receiving with --sparse #150

Closed AnounIssues closed 1 year ago

AnounIssues commented 1 year ago

When using "send --sparse", wyng finds the changed blocks very quickly and sends them without reading the entire source volume. That works perfectly. But when receiving sessions from dest back to source, even with --sparse or --sparse-write (I tried both), wyng has to read the entire volume and compare it (byte by byte, I guess) before pulling the changed blocks, which makes it a lot slower.

I think it should be technically possible to receive quickly. I used to use VMware Workstation on Debian, and it restored to a snapshot in seconds. I think all users would be happy if you make it like send --sparse.

tasket commented 1 year ago

--sparse was intended to save bandwidth on slow network connections (i.e. if you have a large volume you want to revert, and most of the volume will be unchanged), and consumes local CPU/disk bandwidth to compensate.
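To illustrate the trade-off, here is a minimal sketch of a hash-based sparse receive. It assumes the comparison works per chunk against a list of archived hashes. The `chunks_to_fetch` helper, the manifest layout, and the tiny chunk size are all hypothetical and are not Wyng's actual code. Note that every local chunk still has to be read and hashed, which is where the local CPU/disk cost comes from.

```python
import hashlib
import io

CHUNK = 4  # tiny chunk size, for illustration only


def chunks_to_fetch(local, manifest, chunk_size=CHUNK):
    """Return indexes of chunks whose local hash differs from the archived hash."""
    needed = []
    for index, expected in enumerate(manifest):
        # The whole local volume is read, one chunk at a time.
        local.seek(index * chunk_size)
        data = local.read(chunk_size)
        if hashlib.sha256(data).hexdigest() != expected:
            needed.append(index)  # only these chunks cross the network
    return needed


# Hypothetical archive contents and the manifest derived from them.
archived = b"AAAABBBBCCCC"
manifest = [
    hashlib.sha256(archived[i:i + CHUNK]).hexdigest()
    for i in range(0, len(archived), CHUNK)
]

local = io.BytesIO(b"AAAAXXXXCCCC")  # the middle chunk was modified locally
print(chunks_to_fetch(local, manifest))  # [1]
```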

--sparse-write is much simpler... it's like dd conv=sparse and the intent is only to save disk space by writing only what is necessary.
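A rough sketch of that idea, assuming "only what is necessary" means chunks that differ from what is already on disk. The `sparse_write` function is a hypothetical illustration, not Wyng's implementation. All data is still fetched here; only the writes are skipped.

```python
import io


def sparse_write(dest, chunks, chunk_size):
    """Write each incoming chunk only if it differs from the data already in place."""
    written = 0
    for index, data in enumerate(chunks):
        offset = index * chunk_size
        dest.seek(offset)
        if dest.read(len(data)) == data:
            continue  # identical: skip the write so no new blocks are allocated
        dest.seek(offset)
        dest.write(data)
        written += 1
    return written


volume = io.BytesIO(b"AAAAXXXXCCCC")
count = sparse_write(volume, [b"AAAA", b"BBBB", b"CCCC"], 4)
print(count, volume.getvalue())  # 1 b'AAAABBBBCCCC'
```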

Without these options, Wyng will fetch all the volume's data from the archive, and also write all of it to disk.

A "smart sparse" mode could be implemented where the state of the local volume would be compared with snapshots and then compared with the session history in order to avoid reading the volume data during receive. However, this would not work in some cases, for example when restoring to a system that doesn't have the Wyng volume snapshots in place (because you would need those snapshots to indicate if any changes were made to the local volume since the last send).

tasket commented 1 year ago

@AnounIssues I'm thinking this may be do-able in v0.4 without a lot of work, accessed via an option like --use-snapshot. It would only function under the condition of viable local snapshots being present.

BTW...

"if you make it like send --sparse"

send is always sparse... the --sparse option is for receive.

tasket commented 1 year ago

Implementation details:

Receive will now clone a local snapshot associated with the latest archive session, if available, and use that as its baseline. If the selected session is the latest one, receive will simply finish at that point; if the selected session is earlier, _mergemanifests() will collect the differences between selected and latest and use that list to retrieve & save only the chunks changed between those two sessions. @AnounIssues
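The session-difference step can be sketched roughly as follows. This is not the actual _mergemanifests() code. It assumes each session stores a delta manifest mapping chunk addresses to hashes, and `changed_since` and `chunks_to_restore` are hypothetical names.

```python
def changed_since(sessions, selected):
    """Addresses touched by any session newer than `selected` (oldest first)."""
    names = [name for name, _ in sessions]
    later = sessions[names.index(selected) + 1:]
    changed = set()
    for _, delta in later:
        changed.update(delta)
    return changed


def chunks_to_restore(sessions, selected):
    """Map each changed address to the hash it had as of `selected`."""
    names = [name for name, _ in sessions]
    upto = sessions[:names.index(selected) + 1]
    restore = {}
    for addr in sorted(changed_since(sessions, selected)):
        restore[addr] = None  # None: the chunk did not exist yet at `selected`
        for _, delta in reversed(upto):
            if addr in delta:
                restore[addr] = delta[addr]
                break
    return restore


# Hypothetical history: S1 is a full session, S2 and S3 are deltas.
sessions = [
    ("S1", {0: "a1", 1: "b1", 2: "c1"}),
    ("S2", {1: "b2"}),
    ("S3", {2: "c3", 3: "d3"}),
]
print(chunks_to_restore(sessions, "S3"))  # {} : the snapshot clone is already correct
print(chunks_to_restore(sessions, "S2"))  # {2: 'c1', 3: None}
```

With the latest session selected, nothing needs to be fetched. For an earlier session, only the chunks touched by the later sessions are retrieved, so the local volume is never read in full.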

tasket commented 1 year ago

In future, --use-snapshot and --sparse may become a single option but for now they are separate.

AnounIssues commented 1 year ago

Thanks for paying attention to the request and working on it. Is the last commit (577fb17) all it needs, or does it need more work and isn't complete yet? I want to merge the commit and try it in my VMs.

tasket commented 1 year ago

@AnounIssues I've tested it successfully about five times, so it seems to be working. Specify --use-snapshot to enable it with receive. It currently does not fall back automatically to sparse mode if there is no snapshot, but you can add --sparse to the command line if you want that to happen.

Keep in mind '04wip' is an unstable branch with lots of things happening, although commit https://github.com/tasket/wyng-backup/commit/577fb17e9590c718b5b14db1000b4342b538ad75 looks pretty good (I'm even using that one for my own backups atm). One more major feature will land in 04wip probably next week and it may become unusable for a while before I test again and create the final alpha branch.