bcpierce00 / unison

Unison file synchronizer
GNU General Public License v3.0
3.85k stars 224 forks source link

Unison may not synchronize atomic directories in the 1st pass #1047

Open filsedla opened 3 weeks ago

filsedla commented 3 weeks ago

I am seeing strange behavior of Unison when using the atomic preference for directories. Sometimes Unison shows a props <-?-> props conflict fist for such a directory and only on a subsequent run the (expected) chgd dir <-?-> chgd dir conflict when actual synchronization happens.

Steps to reproduce:

  1. Directory structure:

    ├───a
    │   └───d
    │           a.txt
    └───b
    └───d
            b.txt
  2. Unison

    unison a b -atomic "Path d"

    If Unison has not seen these paths before, it will present a props conflict. Regardless of how the user decides to resolve it, Unison continues as usual and ends with a completely normal "Synchronization complete ..." message – but the replicas remain unsynchronized.

  3. Only on a subsequent run with the same command line Unison will show a chgd dir <-?-> chgd dir conflict and depending on the user choosing left or right make the replicas synchronized.

  4. Now it's possible to re-trigger the behavior either by using the -ignorearchives parameter, deleting archives or just by renaming or otherwise changing the path to the "atomic" directory, e.g.:

    ├───a
    │   └───c
    │       └───d
    └───b           a.txt
    └───c
        └───d
                a.txt
    unison a b -atomic "Path c/d"

    Unison will now present a props conflict again on the 1st run and a chgd dir <-?-> chgd dir conflict on the 2nd pass – even when the replicas are now identical. It will then perform a completely superfluous update of one of the replicas.


The most extreme example of this behavior is to just create two directories a, b and then repeatedly run:

unison a b -ignorearchives -atomic Path 

Unison will only ever report the props <-?-> props conflict and – regardless of how the user responds to it – or what is in the directories – never make any actual synchronization between a and b.


Observed with unison 2.53.5 (ocaml 4.14.0) on Windows 7 and unison 2.52.1 (ocaml 4.13.1) on Debian 12.

tleedjarv commented 3 weeks ago

Thank you for this detailed report. If you don't mind helping out some more and have the possibility, could you also test with older versions, pre-2.52, to see if any of them works or possibly find the first version that breaks.

tleedjarv commented 3 weeks ago

It seems that this is the way the "atomic" functionality has worked since the beginning. I have a tentative fix for this but I don't use the "atomic" feature myself and have no opportunity for any meaningful testing. I will form it into a PR only if some volunteers step up to test it thoroughly.

gdt commented 3 weeks ago

@filsedla It would be helpful if you started a discussion on unison-users about what atomic is for in the first place and why you are using it. I'm not asserting you don't have good reasons, but my first reaction on hearing about some feature that's there been there and more or less unused is to question if we should just be removing the feature, as I think unison is too complicated.

bcpierce00 commented 3 weeks ago

This particular behavior does seem questionable, but more generally I can fill in the reason for the atomic preference in the first place.

I don't remember specifics (and I am not using the preference myself at the moment), but, the concern was that there are programs that keep their persistent state spread across multiple files in a directory; for such programs, it would not make sense to automatically propagate a change to one file in one direction and a change to another file in the opposite direction, any more than it makes sense to automatically merge changes to a single file on both replicas -- some kind of human or domain-specific intelligence is needed to do the merge correctly.

On Wed, Jun 12, 2024 at 7:27 AM Greg Troxel @.***> wrote:

@filsedla https://urldefense.com/v3/__https://github.com/filsedla__;!!IBzWLUs!QVezz-d-9Fa96BQjq0HV1agTqPNNDMaT-ifG8k2-tAfDxJAR1S3yt0ET6R_YTyJOsaYL3NTwxPbwZOSqw0zunz52_j3b$ It would be helpful if you started a discussion on unison-users about what atomic is for in the first place and why you are using it. I'm not asserting you don't have good reasons, but my first reaction on hearing about some feature that's there been there and more or less unused is to question if we should just be removing the feature, as I think unison is too complicated.

— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/bcpierce00/unison/issues/1047*issuecomment-2162770695__;Iw!!IBzWLUs!QVezz-d-9Fa96BQjq0HV1agTqPNNDMaT-ifG8k2-tAfDxJAR1S3yt0ET6R_YTyJOsaYL3NTwxPbwZOSqw0zunxOjPa89$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/ABVQQC2HQWNNGDFWMOZ227TZHAWAZAVCNFSM6AAAAABJCW655CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRSG43TANRZGU__;!!IBzWLUs!QVezz-d-9Fa96BQjq0HV1agTqPNNDMaT-ifG8k2-tAfDxJAR1S3yt0ET6R_YTyJOsaYL3NTwxPbwZOSqw0zun8FpKEE1$ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

tleedjarv commented 2 weeks ago

My understanding has also been that atomic is intended for syncing directories like .git.

However, I think the name "atomic" is somewhat misleading (but the documentation about it is correct) in that this preference only applies when there are changes in both replicas (conflicting or not, doesn't matter). In that case, this preference forces the user to sync the entire directory from one of the replicas, not mix and match individual changes within the dir.

When there are changes in only one of the replicas then there is nothing atomic about the sync. For example, when multiple files have changed in one of the replicas then those will be presented as multiple separate changes and the user can choose to sync, not sync or reverse the direction of each change individually.

So, as @bcpierce00 actually mentions, the atomic preference is mostly only about not doing something stupid automatically.

bcpierce00 commented 2 weeks ago

I agree the name "atomic" could have been better chosen.

On Thu, Jun 13, 2024 at 11:40 AM Tõivo Leedjärv @.***> wrote:

My understanding has also been that atomic is intended for syncing directories like .git.

However, I think the name "atomic" is somewhat misleading (but the documentation about it is correct) in that this preference only applies when there are changes in both replicas (conflicting or not, doesn't matter). In that case, this preference forces the user to sync the entire directory from one of the replicas, not mix and match individual changes within the dir.

When there are changes in only one of the replicas then there is nothing atomic about the sync. For example, when multiple files have changed in one of the replicas then those will be presented as two separate changes and the user can choose to sync, not sync or reverse the direction of each change individually.

So, as @bcpierce00 https://urldefense.com/v3/__https://github.com/bcpierce00__;!!IBzWLUs!VqI4Tz7fYSFXXfrGJdwUd-JMT4kihG9YycPF3kdrWBZ_W6TsqgcG0t5onHM_z72oKxJo1zDu0aPoF2J0_kbdsWZjOHDl$ actually mentions, the atomic preference is mostly only about not doing something stupid automatically.

— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/bcpierce00/unison/issues/1047*issuecomment-2166036752__;Iw!!IBzWLUs!VqI4Tz7fYSFXXfrGJdwUd-JMT4kihG9YycPF3kdrWBZ_W6TsqgcG0t5onHM_z72oKxJo1zDu0aPoF2J0_kbdsd808_i4$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/ABVQQC43ULCL6SPXMLMOVV3ZHG4NBAVCNFSM6AAAAABJCW655CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRWGAZTMNZVGI__;!!IBzWLUs!VqI4Tz7fYSFXXfrGJdwUd-JMT4kihG9YycPF3kdrWBZ_W6TsqgcG0t5onHM_z72oKxJo1zDu0aPoF2J0_kbdsRWvRvW1$ . You are receiving this because you were mentioned.Message ID: @.***>

gdt commented 2 weeks ago

It seems like a bug to lock the sync when there are changes on two sides, but to allow different choices when there is a change on only one side. The point seems to be that the directory (and everything below it, I would say) should be constrained to be in from one source or the other.

I guess it's then a fair question as to whether we should remove the option. If we didn't have it, and it were proposed, what would we say?

Another question is if anyone wants to volunteer to fix it?

bcpierce00 commented 2 weeks ago

It seems like a bug to lock the sync when there are changes on two sides, but to allow different choices when there is a change on only one side. The point seems to be that the directory (and everything below it, I would say) should be constrained to be in from one source or the other.

I agree that it would be better simply not to allow different choices. But at least the default is the right one!

Another question is if anyone wants to volunteer to fix it?

— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/bcpierce00/unison/issues/1047*issuecomment-2167963241__;Iw!!IBzWLUs!TtukHlGXhefI_8BbpC4ve6G_s9wsL1CZU3y55fiPmHA92mWpfZo8sMg6YjX1bebh2doqUgtpfgJ1armw7Y-QXJs6Qw0P$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/ABVQQC4RZPWBDLI2PI53HF3ZHLRHBAVCNFSM6AAAAABJCW655CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRXHE3DGMRUGE__;!!IBzWLUs!TtukHlGXhefI_8BbpC4ve6G_s9wsL1CZU3y55fiPmHA92mWpfZo8sMg6YjX1bebh2doqUgtpfgJ1armw7Y-QXNJ-76bN$ . You are receiving this because you were mentioned.Message ID: @.***>

filsedla commented 2 weeks ago

I think the "atomic" feature is useful, even in its current state, and it's better than if it didn't exist at all.

I have been using Unison to synchronize data across two home computers and a laptop. Among the data, there are some .git directories, mariadb/data directory, Firefox profile and some other program settings. For these items I am using the "atomic" keyword and it works. These are examples of data where combining individual files from both replicas could be disastrous, whereas the Unison's behavior I reported is just an annoyance (if you know about it).

So for now I just want to point out, for someone who perhaps finds the "atomic" keyword in the manual, that even though it looks like a cool feature that looks like it's going to enhance Unison's functionality, it actually degrades some of Unisons otherwise perfect handling of situations like when the replica were changed – but in an identical way –, or it fails to synchronize the directory until Unison has been run multiple times.

tleedjarv commented 2 weeks ago

Another question is if anyone wants to volunteer to fix it?

I have a patch for fixing the issue in this specific ticket (not changing the semantics of "atomic"). But this patch requires testing not only "atomic" but regular syncs as well because I deduplicated the code for regular dirs and atomic dirs. I will not create a PR with this patch unless we know we can get substantial testing done.