datto / dattobd

kernel module for taking block-level snapshots and incremental backups of Linux block devices
GNU General Public License v2.0

COW datastore tracked by CBT when transition-to-snapshot #364

Open m-garrido opened 3 months ago

m-garrido commented 3 months ago

Hello,

First of all, thanks for your work. Your solution runs very well. My goal is to track all block changes with the CBT feature. For that, I use "transition-to-incremental". Without making any changes, when I "transition-to-snapshot", I get 234340 changed blocks. That is huge considering the change rate is close to 0. After some research, I guess it comes from the COW datastore, whose size is 10% of the device by default; in my case, 957349888 bytes. 234340 * 4096 = 959856640, which is close to my COW file size. For this reason, I would like to know if there is a way to ignore those block changes. The COW datastore is ephemeral and, in my opinion, should be ignored. Otherwise, each time we use update-img, we will synchronize at least the COW datastore.

Thanks in advance.
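
The arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch that uses the figures quoted in the comment and the 4096-byte block size the commenter used; it says nothing about how the driver itself counts blocks.

```python
BLOCK_SIZE = 4096  # bytes per tracked block, as used in the comment's calculation

reported_blocks = 234_340     # changed blocks reported after transition-to-snapshot
cow_file_bytes = 957_349_888  # COW file size quoted in the comment

# Bytes represented by the reported changed blocks.
reported_bytes = reported_blocks * BLOCK_SIZE

# Number of 4096-byte blocks the COW file itself would occupy on the device.
cow_file_blocks = cow_file_bytes // BLOCK_SIZE

# Reported blocks that the COW file size alone does not account for.
extra_blocks = reported_blocks - cow_file_blocks

print(f"reported as changed:  {reported_bytes} bytes")
print(f"COW file occupies:    {cow_file_blocks} blocks")
print(f"blocks beyond the COW file: {extra_blocks}")
```

With these numbers the COW file accounts for all but a few hundred of the reported blocks, which is consistent with the commenter's guess but does not prove it.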

Swistusmen commented 3 months ago

Hi, thanks for the kind words. Sorry, this is the first time I'm hearing about CBT; do you mean VMware CBT (Changed Block Tracking)? Can you please write down the exact steps to reproduce everything? Generally, snapshot mode, as you mentioned, needs a COW datastore, which is by default 10% of the device, and it is necessary. Do I understand correctly that you don't want these blocks to be counted? I am afraid this question relates more to CBT itself than to us, but maybe I'm wrong, so please help me understand.

m-garrido commented 3 months ago

You're right, CBT is known as a VMware feature. I think your solution does the same thing as CBT in "incremental mode", because you track all changed blocks, as your documentation mentions:

In incremental mode, by contrast, only the COW index is kept on disk. This frees up disk space for the rest of the filesystem to use. Writes are tracked in-memory and periodically synced to the index file as needed. During this mode, the driver does not present a readable snapshot device and simply tracks which blocks have changed in the COW index

As I understand it, and as I have already tested, all changed blocks are tracked, and a flag is set in the COW index file for each disk sector where a block is altered.

What I did is this:

  1. create a snapshot for /dev/sda1 => snapshot enabled for my device
  2. transition to incremental into COW file /.datto0 => changed block tracking enabled
  3. dd if=/dev/random of=/root/test bs=1M count=100 oflag=direct => create a 100 MB file
  4. transition to snapshot into COW file /.datto1 => changed block tracking disabled and COW datastore created in /.datto1
  5. read the changed blocks from /.datto0
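
Step 5 could be sketched roughly as below. The file layout here is an assumption, not something confirmed in this thread: a 4096-byte header followed by one little-endian unsigned 64-bit entry per 4096-byte block, where a non-zero entry marks a changed block. The names `count_changed_blocks`, `HEADER_SIZE`, `ENTRY_SIZE` and `CHUNK_ENTRIES` are hypothetical; check the layout against the dattobd sources before relying on it.

```python
import struct

HEADER_SIZE = 4096    # ASSUMPTION: size of the COW header that precedes the index
ENTRY_SIZE = 8        # ASSUMPTION: one unsigned 64-bit index entry per block
CHUNK_ENTRIES = 8192  # entries read per chunk, only to bound memory use


def count_changed_blocks(cow_path, total_blocks):
    """Count non-zero index entries among the first total_blocks entries."""
    changed = 0
    remaining = total_blocks
    with open(cow_path, "rb") as f:
        f.seek(HEADER_SIZE)
        while remaining > 0:
            wanted = min(remaining, CHUNK_ENTRIES)
            buf = f.read(wanted * ENTRY_SIZE)
            usable = len(buf) // ENTRY_SIZE  # ignore a trailing partial entry
            for (entry,) in struct.iter_unpack("<Q", buf[: usable * ENTRY_SIZE]):
                if entry != 0:  # ASSUMPTION: non-zero means "block changed"
                    changed += 1
            if usable < wanted:  # file ended before total_blocks entries
                break
            remaining -= usable
    return changed
```

Under these assumptions, `total_blocks` would be the size of the snapshotted device in bytes divided by 4096, rounded up, and a call might look like `count_changed_blocks("/.datto0", total_blocks)`.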

At this stage, when I read the /.datto0 COW index, I find more than 200,000 changed sectors. My new 100 MB file should only have changed 25,600 (100 * 1024 * 1024 / 4096) sectors on my device. So I would like to understand why so many sectors are reported as changed when I did not modify that much data. I suppose it comes from /.datto1, which is in "snapshot" mode and contains a COW datastore consuming 10% of the entire snapshotted device. This file is created, and its blocks are tracked as changed, before the switch from "incremental" to "snapshot" mode completes. Those changed blocks should not be reported by /.datto0. I expect to see only the blocks changed by the new file created by "dd", so only 25,600 sectors and not more than 200,000. I hope these explanations are clearer.
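
The expected count in the comment above can be reproduced as follows. This is a sketch under the comment's own assumptions (4096-byte tracking blocks, a 100 MiB file written by `dd` with `bs=1M count=100`); `blocks_for` is a hypothetical helper name. Filesystem metadata updates would add some blocks on top of this figure, so it is a lower bound for the file itself rather than an exact prediction.

```python
BLOCK_SIZE = 4096  # bytes per tracked block, as assumed in the comment


def blocks_for(bytes_written, block_size=BLOCK_SIZE):
    """Number of blocks needed to hold bytes_written (ceiling division)."""
    return -(-bytes_written // block_size)


file_bytes = 100 * 1024 * 1024  # dd bs=1M count=100
expected_blocks = blocks_for(file_bytes)

observed_lower_bound = 200_000  # "more than 200,000" reported in the comment
surplus = observed_lower_bound - expected_blocks

print(f"expected from the dd file: {expected_blocks} blocks")
print(f"surplus to explain:        at least {surplus} blocks")
```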