mpalmer / lvmsync

Synchronise LVM LVs across a network by sending only snapshotted changes
http://theshed.hezmatt.org/lvmsync
GNU General Public License v3.0
380 stars 60 forks

Memory consumption #59

Closed richardland closed 1 year ago

richardland commented 5 years ago

When syncing an LV that has changed a lot (300GB+ of changes, for example), memory usage explodes during the sending of blocks, to the point of OOM. Sending over a large number of changed blocks would take huge amounts of memory.

example command:

lvmsync /dev/CACHED_VG_3bMWMZ/SNAP_VM_STORE_vznodecat_2018_12_16T14_49_59 -r /dev/CACHED_VG_3bMWMZ/SNAP_VM_STORE_vznodecat_2018_12_16T22_02_42 root@10.200.2.2:/dev/CACHED_VG_y4wPTV/FLOAT_VM_STORE_vznodecat

In this command I am sending the differences between two local snapshots to an external LV.

It looks like a straightforward change to the memory management could resolve this, but I lack the knowledge to do it myself.

When the snapshots do not differ too much, everything works like a charm!

Note: For this command to work you need to apply

https://github.com/mpalmer/lvmsync/pull/54/commits/e3089e39800d284aa33b53ba78cc8f422e969ab4

and

https://github.com/mpalmer/lvmsync/files/2066592/lvmsync_patch.txt

mpalmer commented 5 years ago

I have to say, I've never really pondered the memory usage of lvmsync -- whenever I had to run it, it was typically on super-beefy VM servers with more memory than you could shake a stick at, and the deltas were kept under control.

I can't imagine it'd be particularly difficult to identify the leak; the article which spawned this gem should be enough to point you in the right direction. I can't really imagine I'll get the spare time to look at this myself at any point in the near future.

tasket commented 5 years ago

@richardland You might be interested in a tool I'm writing that has similarities to lvmsync. Although sparsebak (working title) is intended for incremental backups, it uses a streaming tar format as its transport to the destination; it would be pretty easy to modify the destination tar command with --to-command so the chunks are written to a block device instead of the archive.
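To make the --to-command idea concrete, here is a minimal, self-contained sketch of how GNU tar can route each extracted member's data into a target at an offset instead of creating files. A plain file stands in for the block device, and the "x&lt;hex offset&gt;" member-naming convention is a hypothetical stand-in, not sparsebak's actual chunk format:

```shell
# Sketch: tar --to-command writes each member's data to an offset in a
# target, rather than extracting files.  target.img stands in for a block
# device; the "x<hex offset>" naming scheme is hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Two 8-byte "chunks" destined for byte offsets 0 and 16.
printf 'AAAAAAAA' > x0000000000000000
printf 'BBBBBBBB' > x0000000000000010
tar -cf chunks.tar x0000000000000000 x0000000000000010

# Stand-in for the destination LV: a 32-byte zeroed file.
dd if=/dev/zero of=target.img bs=32 count=1 status=none

# Instead of extracting files, dd each chunk into place at the offset
# encoded in its name (GNU tar exports TAR_FILENAME to the command).
tar -xf chunks.tar \
  --to-command='dd of=target.img bs=1 conv=notrunc status=none \
    seek=$(printf "%d" "0x${TAR_FILENAME#x}")'
```

For a real device you would point `of=` at something like `/dev/vg/target_lv` and use a sane block size with a block-aligned `seek`, rather than `bs=1`.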

Success looks feasible, as sparsebak is written in a different language (Python 3) and I've taken some care to reduce memory overhead, for example by paring down the input XML from thin_dump. My guess is that if it fails at all due to volume/diff size, it would be at the start, during the XML parse stage.
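As a rough illustration of the "paring down" idea, the sketch below strips a thin_dump-style XML dump to just its mapping elements before any parser sees it, so nothing downstream has to hold the wrapper structure. The sample XML is handcrafted to imitate thin_dump's output format; a real invocation would read pool metadata with something like `thin_dump --format xml /dev/mapper/pool_tmeta` (the device path being whatever your thin pool's metadata volume is):

```shell
# Sketch: pre-filter a thin_dump-style XML dump down to the mapping
# elements only.  The sample file imitates thin_dump output; it is not
# captured from a real pool.
cat > dump.xml <<'EOF'
<superblock uuid="" time="1" transaction="2" data_block_size="128" nr_data_blocks="1024">
  <device dev_id="1" mapped_blocks="5" transaction="0" creation_time="0" snap_time="1">
    <single_mapping origin_block="0" data_block="5" time="1"/>
    <range_mapping origin_begin="8" data_begin="100" length="4" time="1"/>
  </device>
</superblock>
EOF

# Keep only the mapping lines; the superblock/device wrappers are
# structural and can be reconstructed or ignored by the diff logic.
grep -E '<(single|range)_mapping ' dump.xml > mappings.xml
```

A streaming parser (or even line-oriented processing, as here) keeps memory proportional to one mapping entry rather than the whole metadata document.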

OTOH, if your intended use for lvmsync is regular backups, then you could try using sparsebak as-is.

davidbartonau commented 5 years ago

@mpalmer I have noticed that you haven't updated lvmsync in a while. I checked it out several times before deciding to build something aimed at continuously hot-replicating LVM volumes between servers in near real time. I was surprised to see this comment from @richardland, as I might have been better off working directly with him.

Anyway, just hoping for a shout-out on the main page to my tool, and also to the tool of @richardland, to direct people who don't bother to read the issues (like me). Your tool looks way more sophisticated than mine, but the fact that you do a hot sync followed by a cold sync made it unusable for me :-(