hjmangalam / parsyncfp2

MultiHost parallel rsync wrapper

Clarify update scenario #1

Closed uli42 closed 1 year ago

uli42 commented 2 years ago

When moving large trees on live systems, we usually run multiple rsyncs, one per subtree (so partitioning is done too, though less sophisticated than what fpart does), and repeat that until an announced downtime during which no access to the source filesystem is allowed. Then the final rsyncs are done. All rsyncs run with --delete.

From the README it sounds to me that this is not a use case for parsyncfp2. Incremental rsync will probably work to some degree but I do not really get if/how deleted files are handled. Can you please clarify if there's a solution included to delete files on the destination that have been removed from the source between runs?
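
For readers unfamiliar with the workflow described in this comment, here is a minimal Python sketch of it, not uli42's actual scripts. The function names, the example paths, and the `workers` and `dry_run` parameters are all hypothetical; only the `rsync -a --delete` invocation comes from the thread. It assumes local paths and a fixed list of subtrees that still exist on the source.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_rsync_cmd(src_root, dest_root, subtree):
    """Return the rsync argv that mirrors one subtree, deletions included."""
    src = f"{str(src_root).rstrip('/')}/{subtree.strip('/')}/"
    dest = f"{str(dest_root).rstrip('/')}/{subtree.strip('/')}/"
    # The trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself. --delete only removes
    # files inside this subtree; anything outside the listed subtrees
    # is never touched.
    return ["rsync", "-a", "--delete", src, dest]


def sync_subtrees(src_root, dest_root, subtrees, workers=4, dry_run=True):
    """Run one rsync per subtree in parallel and return the exit codes."""
    cmds = [build_rsync_cmd(src_root, dest_root, s) for s in subtrees]
    if dry_run:
        # Only print what would be executed; nothing is copied or deleted.
        for cmd in cmds:
            print(" ".join(cmd))
        return [0] * len(cmds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, cmds))
```

Calling `sync_subtrees("/data/src", "/data/dest", ["projects", "home"])` with the default `dry_run=True` only prints the two commands. Because each rsync owns a disjoint subtree, `--delete` is safe here; that is the property a file-list-based partitioner loses.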

hjmangalam commented 2 years ago

Apologies for the delay - traveling with little internet. What you say is true: pfp2 (still) doesn't handle any '--delete' options, due to problems sharing info among the different instances running on different systems. This may be possible to address when I add socket controls in the next iteration. However, you might be able to use Ganael Laplanche's fpsync (https://www.fpart.org/#fpsync), which is distributed with his fpart utility. I think it handles at least some --delete options, but not in the parallel version. Give it a try.

Harry

uli42 commented 2 years ago

Thanks for the pointer, I have read about fpart but missed that there's that fpsync tool. However, the man page states "Parallelizing rsync(1) makes several options not usable, such as --delete. If your source directory is live while fpsync is running, you will have to delete extra files from destination directory. This is usually done by using a final -offline- rsync(1) pass that will use this option."

In my own scripts I usually run every rsync iteration with --delete to reduce the final offline sync time to a minimum.
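
To make the "delete extra files from destination directory" step from the man page quote concrete, the sketch below lists what such a cleanup pass would have to remove. It is an illustration only, not something fpsync or parsyncfp2 provides; the function names are hypothetical, and it assumes both trees are reachable as local paths. It only reports candidates and deletes nothing.

```python
import os


def relative_entries(root):
    """Return the set of file and directory paths under root, relative to it."""
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries.add(os.path.relpath(full, root))
    return entries


def extraneous_in_dest(src_root, dest_root):
    """List paths that exist under dest_root but no longer under src_root."""
    return sorted(relative_entries(dest_root) - relative_entries(src_root))
```

On a live source this listing is only a snapshot, which is why the man page recommends doing the deleting pass offline. For a deletion-only pass with rsync itself, the rsync man page notes that combining --existing with --ignore-existing updates no files, so adding --delete to that combination removes extraneous files without transferring anything.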

hjmangalam commented 2 years ago

In thinking about this problem, I remembered that Ganael had written up some docs about it (https://github.com/martymac/fpart/blob/master/docs/Solving_the_final_pass_challenge.txt). They outline an approach to your problem, if not yet a finished solution. If you write to him directly, he might be inspired to finish it off. This is a general problem and I'm toying around with a similar approach, but who knows when it will get mind-time.

Harry

uli42 commented 2 years ago

Thanks for the pointer, interesting read. But I am wondering whether that approach preserves directory mtimes as closely as possible. AFAIK rsync retains the mtime with -a, but what happens here? I think I will compare that approach against a plain rsync -a.
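
One way to run the comparison mentioned above is to sync the same source into two destinations (one with the final-pass approach, one with a plain rsync -a) and diff the directory mtimes. The sketch below is a hypothetical helper for that check, not part of any of the tools discussed. It assumes both trees are local and compares mtimes at whole-second resolution.

```python
import os


def dir_mtimes(root):
    """Map each directory (relative to root, root itself as '.') to its mtime in whole seconds."""
    mtimes = {}
    for dirpath, _dirnames, _filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        mtimes[rel] = int(os.stat(dirpath).st_mtime)
    return mtimes


def mtime_mismatches(tree_a, tree_b):
    """Return directories present in both trees whose mtimes differ."""
    a, b = dir_mtimes(tree_a), dir_mtimes(tree_b)
    return sorted(rel for rel in a.keys() & b.keys() if a[rel] != b[rel])
```

An empty result from `mtime_mismatches(dest_plain, dest_parallel)` would suggest the two approaches treat directory mtimes the same way for that data set. Directories that exist in only one tree are ignored here and would need a separate check.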

hjmangalam commented 1 year ago

This is as yet unsolved and will remain so until I address some other issues which are more pressing. I'm going to close this for now, tho it remains on the TODO list. hjm