hpc / mpifileutils

File utilities designed for scalability and performance.
https://hpc.github.io/mpifileutils
BSD 3-Clause "New" or "Revised" License

[Request] dsync --write-verify option? #581

Open markmoe19 opened 1 month ago

markmoe19 commented 1 month ago

Hi,

We are looking at ways to increase the performance of the dsync -c (content) compare. As you would imagine, this can be very time-consuming, but users want it "to be sure" that the final sync is an exact copy of their data.

We can run many syncs before the final sync and cut-over to a new file system. However, dsync -c is currently all or nothing, so the final sync before cut-over takes a long time, and that final sync is when users are "down" so that they don't make any changes.

If dsync had an option somewhere between -c (content) and -lite (like dcmp), it could be ideal when doing a series of dsync runs before the final sync.

This -lite-content option would compare size, mtime, and atime (assuming --open-noatime is used as well), so that if all 3 match, content is not compared. But if any one differs (even if just atime), the full content is compared.
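To make the proposed rule concrete, here is a minimal Python sketch of the decision only. The function name `needs_content_compare` is hypothetical and is not part of dsync; it assumes both paths are visible to the calling process and that timestamps are compared at nanosecond resolution.

```python
import os

def needs_content_compare(src_path, dst_path):
    """Return True when a full content compare is still required.

    Hypothetical helper (not a dsync API): the content compare is skipped
    only when size, mtime, and atime all match between source and target.
    """
    src = os.stat(src_path)
    dst = os.stat(dst_path)
    all_match = (
        src.st_size == dst.st_size
        and src.st_mtime_ns == dst.st_mtime_ns
        and src.st_atime_ns == dst.st_atime_ns
    )
    return not all_match
```

A difference in atime alone is enough to trigger the full compare, which is why the rule only makes sense if the sync tool itself does not disturb atime when it reads files.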

Thank you for these great utilities!

markmoe19 commented 1 month ago

After a new (or changed) file is written to the target, it has to be re-read to compare content and make sure it was written correctly. This could possibly be done with checksums. In this way, each subsequent run builds on the successfully compared content of the previous run, with no need to re-compare (unless size, mtime, or atime changed). Thanks

markmoe19 commented 1 month ago

After talking about this internally at our company, I think what we are looking for is to replace -c (byte-compare) with a --write-verify option.

When dsync writes a new file (because the target file does not exist or because the source file changed since it was last written), we want it to verify the write after the write completes. This could use byte-compare or a checksum method (similar to the rsync --checksum option). Potentially, the checksum for the source side could be calculated as part of the normal read of the source file. The checksum of the target side would require an additional read of the recently written target file.

The end goal is for each subsequent dsync run of the same source and target to build with confidence on the previous run, knowing that each written file was verified. But if file time and size match, no new write of the target file needs to occur.
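The write-then-verify flow described in this comment could look roughly like the following single-process Python sketch. It is only an illustration under assumptions: `copy_with_verify` is a hypothetical name, SHA-256 is an arbitrary choice of checksum, and dsync itself is a parallel MPI tool written in C that would split large files across ranks.

```python
import hashlib
import os

def copy_with_verify(src_path, dst_path, chunk_size=1 << 20):
    """Copy src to dst, then re-read dst and compare checksums.

    Hypothetical sketch: the source checksum is accumulated during the
    normal copy read, so only the target needs an extra read pass.
    """
    src_hash = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            src_hash.update(chunk)  # checksum piggybacks on the copy read
            dst.write(chunk)
        dst.flush()
        os.fsync(dst.fileno())  # push written data out of this process

    # Additional read of the freshly written target file.
    dst_hash = hashlib.sha256()
    with open(dst_path, "rb") as dst:
        for chunk in iter(lambda: dst.read(chunk_size), b""):
            dst_hash.update(chunk)

    return src_hash.digest() == dst_hash.digest()
```

Note that the verification read in this sketch runs on the same host that wrote the file, so it may be served from that host's page cache rather than from storage.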

adilger commented 1 month ago

One potential issue of doing the read verification immediately after write is that this may miss issues if the file is only in cache on a client node, and not saved persistently (or correctly) to storage on the server node.

At a minimum, the read verification would need to be done on another node (like IOR does).
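One way to picture "verify on another node" is a small rank-selection helper. The sketch below is hypothetical and not taken from dsync or IOR: it assumes a list `rank_hosts` in which entry `i` is the hostname of rank `i`, and it returns the next rank that lives on a different host than the writer.

```python
def pick_verifier_rank(writer_rank, rank_hosts):
    """Return a rank on a different host than writer_rank, or None.

    Hypothetical helper: rank_hosts[i] is the hostname of rank i.
    Ranks are scanned in ring order starting just after the writer.
    """
    writer_host = rank_hosts[writer_rank]
    n = len(rank_hosts)
    for offset in range(1, n):
        candidate = (writer_rank + offset) % n
        if rank_hosts[candidate] != writer_host:
            return candidate
    return None  # all ranks share one host; no off-node verifier exists
```

For example, with `rank_hosts = ["n1", "n1", "n2", "n2"]`, the verifier for rank 0 would be rank 2. When every rank runs on the same host the helper returns `None`, and the caller would need some other strategy.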

markmoe19 commented 1 month ago

Maybe the read part of the write verification could be done at the end, after issuing a drop-caches command or, as you mention, on a node different from the one that wrote the file (assuming dsync was running on more than one node, that should be possible). Thanks
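As a per-file alternative to a system-wide cache drop (on Linux, writing to `/proc/sys/vm/drop_caches`, which requires root), a process can ask the kernel to discard cached pages for one file before re-reading it. The Python sketch below uses `os.fsync` and `os.posix_fadvise` with `POSIX_FADV_DONTNEED`; the function name `drop_file_cache` is hypothetical. The advice is only a hint, and how a parallel file system client honors it is an assumption that would need testing.

```python
import os

def drop_file_cache(path):
    """Ask the kernel to drop cached pages for one file.

    Hypothetical helper. Returns True if the advice was issued, False if
    posix_fadvise is unavailable or rejected. The call is advisory: it
    does not prove that later reads come from storage.
    """
    if not hasattr(os, "posix_fadvise"):
        return False  # platform without posix_fadvise
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # flush dirty pages first so they can be dropped
        try:
            # offset 0, length 0 means "the whole file"
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        except OSError:
            return False
    finally:
        os.close(fd)
    return True
```

A verification pass could call this right before the re-read of the target file. Doing the re-read on a different node remains the stronger check, since it does not depend on how the writing client treats the hint.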