junneyang / zumastor

Automatically exported from code.google.com/p/zumastor

Zumastor as an rsync replacement requires wasting space #112


GoogleCodeExporter commented 9 years ago
Here is the chat that Jiaying, Dan Kegel, and I just had:

[20:42] <pgquiles_> mmm using zumastor as a rsync replacement is not working for me: initial replication works fine, the next one is never mounted
[20:42] <pgquiles_> root@dubna:~# zumastor status --usage
[20:42] <pgquiles_> VOLUME zumabooks:
[20:42] <pgquiles_> Status: running
[20:42] <pgquiles_> Snapshot store block size = 16384; 219 of 256 chunks free
[20:42] <pgquiles_> Origin size: 42,949,672,960 bytes
[20:42] <pgquiles_> Write density: 0
[20:42] <pgquiles_> Creation time: Wed Apr 16 17:34:27 2008
[20:42] <pgquiles_>   Snap            Creation time Usecnt Prio   Chunks Unshared   Shared
[20:42] <pgquiles_>      2 Wed Apr 16 19:08:32 2008      0    0        !        !        !
[20:42] <pgquiles_>      3 Wed Apr 16 19:08:32 2008      0    0        !        !        !
[20:42] <pgquiles_> totals                                             0        0        0
[20:43] <pgquiles_> that's with r1523
[20:43] <pgquiles_> and no 'define schedule' on the original machine, as I don't need old snapshots on that volume, just replication
[21:00] <pgquiles> oh, and a snapshot store of exactly 1 LVM extent (4MB in my case), which was the smallest I can make it
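
(Note: for reference, a snapshot store that small can be carved out as a single-extent logical volume. This is only a sketch of that step, assuming the store lives in a volume group called vg0; the VG and LV names are illustrative, not taken from the setup above.)

  # Create the smallest possible snapshot store LV: one extent, which is 4 MB
  # with LVM's default extent size (matching the 4MB case mentioned above).
  lvcreate --extents 1 --name zumabooks-snapstore vg0
  lvs vg0/zumabooks-snapstore   # confirm it really is a single 4 MB extent
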
[21:01] <willn> Yea
[21:10] <jiayingz> pgquiles_, how large is ur snapshot store
[21:11] <jiayingz> looks like the snapshots got squashed
[21:11] <jiayingz> that is why they were not mounted
[21:17] <pgquiles> jiayingz: indeed, the logs said zumastor was trying to mount an already squashed snapshot
[21:17] <pgquiles> jiayingz: so, what should be the snapshot store size if I only want replication?
[21:19] <jiayingz> pgquiles, on downstream you still need a snapshot store that is large enough to hold all the replicated data in each cycle
[21:19] <jiayingz> this depends on the workload
[21:20] <jiayingz> and replication cycle
[21:20] <pgquiles> how can I make an estimation?
[21:21] <jiayingz> e.g., if you use 1 hour replication cycle, the snapshot store can be as small as write_bandwidth * 1 hour
[21:21] <jiayingz> write bandwidth is the write throughput on upstream
[21:22] <pgquiles> mmm so the longer the replication cycle, the bigger the snapshot store
[21:22] <jiayingz> because that is maximum delta size sent to the downstream
[21:22] <pgquiles> I was planning on a 1 day replication cycle :-/
[21:23] <jiayingz> then depends on if you are going to make a lot of writes in a day
[21:23] <jiayingz> if the workload is always read dominant, a small snapshot store size can also work
[21:24] <pgquiles> it's read dominant but sometimes we may write a huge amount of data (like 20 GB) in a day in one of the volumes
[21:24] <jiayingz> but if you may overwrite the whole volume some time, it is safe to set snapshot store as large as the origin volume
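
(Note: as a back-of-the-envelope illustration of the sizing rule jiayingz describes above, the minimum downstream snapshot store is roughly the upstream write rate times the replication cycle, with the origin volume size as the safe upper bound. The write rate below is an assumed figure, not a measurement from my volumes.)

  # Rough lower bound for the downstream snapshot store.
  WRITE_BANDWIDTH=$((256 * 1024))   # assumed average upstream write rate, bytes/second
  CYCLE=$((24 * 60 * 60))           # replication cycle length in seconds (1 day)
  MIN_STORE=$((WRITE_BANDWIDTH * CYCLE))
  echo "snapshot store lower bound: $MIN_STORE bytes (~$((MIN_STORE / 1024 / 1024 / 1024)) GiB)"
  # ~21 GiB here, i.e. on the order of the 20 GB/day worst case mentioned above.
  # Safe upper bound: the size of the origin volume itself.
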
[21:25] <pgquiles> that constraint effectively renders zumastor useless as a rsync replacement :-(
[21:26] <dkegel> that's the cost of having atomic update - the data has to go somewhere until you flip the switch
[21:26] <jiayingz> to be exactly a rsync replacement, we need to export origin volume instead of snapshot volume on downstream
[21:26] <dkegel> The question is: are you ok with your filesystem being inconsistent during update?  If so, rsync is ok.
[21:26] <jiayingz> dkegel, exactly
[21:29] <jiayingz> pgquiles, i got an idea. now we have an option to limit replication bandwidth.
[21:30] <jiayingz> if you limit the replication bandwidth to 1K. the maximum data that can be replicated a day is 86400K
[21:30] <jiayingz> well, that is 86.4M
[21:31] <pgquiles> problem is that may mean data is not replicated
[21:31] <jiayingz> hmm, wait. that doesn't work. a replication cycle still needs to finish before the next one starts
[21:32] <pgquiles> I want all my new data replicated but with a zero-sized (or at least really tiny) snapshot store
[21:33] <jiayingz> but as dkegel said, the extra space is the cost you pay for holding old data before switching to the new version
[21:33] <pgquiles> maybe the right way to do it is to introduce a new parameter to 'zumastor replicate', something like 'zumastor replicate --no-snapshots' which exports the origin volume instead of the snapshot volume to downstream
[21:34] <jiayingz> you can use rsync for that purpose, but you may get inconsistent data during the replication
[21:35] <pgquiles> I was trying to avoid having to set up both zumastor and rsync on the same server, zumastor for some volumes, rsync for others
[21:35] <pgquiles> but if I have to choose between wasting that much space and setting up rsync, I'm going with rsync for those volumes
[21:36] <dkegel> pgquiles, are you ok with the volume being inconsistent during replication?  That's the price of using rsync (or this hack Jiaying is proposing)
[21:36] <jiayingz> what is rsync used for? just for backup?
[21:36] <pgquiles> jiayingz: yes, just for having data replicated to several locations. We do not need snapshots for frozen software, or for third party software.
[21:37] <dkegel> Do you have some sort of policy in place, e.g. write-once, that keeps inconsistency from being a problem?
[21:37] <pgquiles> dkegel: I'd like it better if data was always consistent on downstream but given that those servers are there just for backup, inconsistency does not worry me
[21:38] <dkegel> OK, then the option Jiaying proposes is worth looking at.  Thanks for the nudge.
[21:39] <pgquiles> dkegel: yes, we have something like that: we have several servers and we always write and read to one of them. If that server goes down, another server takes its place and users do not realize they are accessing a different server (they access it with the very same name the now-broken server had)
[21:39] <jiayingz> pgquiles, for record, could you file an issue for this?
[21:39] <pgquiles> jiayingz: sure

Original issue reported on code.google.com by pgqui...@gmail.com on 16 Apr 2008 at 7:41