bcpierce00 / unison

Unison file synchronizer
GNU General Public License v3.0
4.1k stars 231 forks source link

Undetected write-delete conflict #664

Closed liyishuai closed 2 years ago

liyishuai commented 2 years ago

Scenario

When one replica modifies a file and another replica deletes it, Unison should return 1 which indicates a conflict.

mkdir A B
echo old > A/file.txt
unison A B -batch -confirmbigdel=false > logs.txt 2> error.txt
echo new > B/file.txt
rm A/file.txt
unison A B -batch -confirmbigdel=false >> logs.txt 2>> error.txt
RET=$?
[ $RET == 1 ] || echo Unison returned $RET

Unison returned 2

cat logs.txt
new file ---->            file.txt  
A            : new file           modified on 2022-03-06 at 21:12:28  size 4         rw-r--r--
B            : absent
deleted  ---->            file.txt  
A            : deleted
B            : unchanged file     modified on 2022-03-06 at 21:12:28  size 4         rw-r--r--
Failed [file.txt]: Destination updated during synchronization
The contents of file file.txt have been modified
  (if this happens repeatedly on a file that has not been changed, 
  try running once with 'fastcheck' set to false)

The confusing point is that Unison could detect A/file.txt in the first run, but couldn't detect changes to B/file.txt in the second run.

Environments

cat logs.txt
new file ---->            file.txt  
A            : new file           modified on 2022-03-06 at 21:32:28  size 4         rw-r--r--
B            : absent
deleted  <-?-> changed    file.txt  
A            : deleted
B            : changed file       modified on 2022-03-06 at 21:32:28  size 4         rw-r--r--
No updates to propagate
liyishuai commented 2 years ago

Just tested it on Linux machines, which shows different results:

Failing environment

liyishuai commented 2 years ago

Update: on openSUSE, the script sometimes succeeds in detecting the conflict, but in other cases complains in a different way:

rm: cannot remove 'A/file.txt': No such file or directory Unison returned 0

cat logs.txt
new file ---->            file.txt  
A            : new file           modified on 2022-03-06 at 21:32:28  size 4         rw-r--r--
B            : absent
deleted  <-?-> changed    file.txt  
A            : deleted
B            : changed file       modified on 2022-03-06 at 21:32:28  size 4         rw-r--r--
No updates to propagate
tleedjarv commented 2 years ago

Can you run with -fastcheck=false and see if it makes a difference?

Please provide the output of -debug update+.

liyishuai commented 2 years ago

With -fastcheck=false, the complaints disappeared i.e. Unison can always detect the conflict.

With -debug update+ and without -fastcheck=false, the stderr logs in the 1st and 2nd Unison calls are:

  1. error1.txt (Ubuntu), error1.txt (macOS)
  2. error2.txt (Ubuntu), error2.txt (macOS)
gdt commented 2 years ago

I would have to page things in, but it seems either:

bcpierce00 commented 2 years ago

fastcheck is set to true by default on nix systems, false otherwise. It is a little unsafe (e.g., because users can manually set file modtimes into the past), but only in the sense that it may lead to puzzling failures (as above), not to loss of data (as above). And it improves performance enormously*.

I wonder if there is a race condition here where the second run of Unison is seeing a filesystem with updated file contents but not an updated timestamp...?

- B

On Mon, Mar 7, 2022 at 4:52 PM Greg Troxel @.***> wrote:

I would have to page things in, but it seems either:

  • fastcheck=true is an unsafe heuristic, in which case it should be default off
  • fastcheck=true is sound, in which case if it's better I don't know why we have an option
  • fastcheck does default to false, but that's not noted above

— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/bcpierce00/unison/issues/664*issuecomment-1061173321__;Iw!!IBzWLUs!EQ7lCjpwOuVttvv3q9KyuNkY3ZtJy7YPvoiSEIm1KrXycmuH_wnVB5c0s81_e2phYbw$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/ABVQQC4GPGHLHZ6XN4AUMBDU6Z275ANCNFSM5QB7C6AA__;!!IBzWLUs!EQ7lCjpwOuVttvv3q9KyuNkY3ZtJy7YPvoiSEIm1KrXycmuH_wnVB5c0s81_rIS_MZM$ . Triage notifications on the go with GitHub Mobile for iOS https://urldefense.com/v3/__https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675__;!!IBzWLUs!EQ7lCjpwOuVttvv3q9KyuNkY3ZtJy7YPvoiSEIm1KrXycmuH_wnVB5c0s81_gnzXsYg$ or Android https://urldefense.com/v3/__https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign*3Dnotification-email*26utm_medium*3Demail*26utm_source*3Dgithub__;JSUlJSU!!IBzWLUs!EQ7lCjpwOuVttvv3q9KyuNkY3ZtJy7YPvoiSEIm1KrXycmuH_wnVB5c0s81_2kQTIQo$.

You are receiving this because you are subscribed to this thread.Message ID: @.***>

tleedjarv commented 2 years ago

Thank you @liyishuai. Could you also post the output of stat like this:

unison A B -batch -confirmbigdel=false > logs.txt 2> error.txt
stat B/file.txt                     <-------- stat here
echo new > B/file.txt
stat B/file.txt                     <-------- stat here

And while you're at it, could you see if unison A B -dumparchives produces archive dumps in the .unison directory. Those would be useful to see as well.

tleedjarv commented 2 years ago

I wonder if there is a race condition here where the second run of Unison is seeing a filesystem with updated file contents but not an updated timestamp...?

Yes, this is exactly what's happening according to the logs provided by @liyishuai. I want to see the output of stat, too, to see what the OS actually records/reports.

gdt commented 2 years ago

And dtrace is probably highly useful here, to record what unison actually sees, rather than other syscalls that in theory show that.

(My read is that so far it seems this might be a mac bug, or at least not a unison bug, but it's really hard to tell what's going on exactly, so I'm in a mode of waiting for more info.)

liyishuai commented 2 years ago

stats on macOS:

16777220 12967557733 -rw-r--r-- 1 yishuai wheel 0 4 "Mar  8 14:13:09 2022" "Mar  8 14:13:09 2022" "Mar  8 14:13:09 2022" "Mar  8 14:13:09 2022" 4096 8 0 B/file.txt
16777220 12967557733 -rw-r--r-- 1 yishuai wheel 0 4 "Mar  8 14:13:09 2022" "Mar  8 14:13:09 2022" "Mar  8 14:13:09 2022" "Mar  8 14:13:09 2022" 4096 8 0 B/file.txt
liyishuai commented 2 years ago

stats on Ubuntu:

  File: B/file.txt
  Size: 4           Blocks: 8          IO Block: 4096   regular file
Device: 805h/2053d  Inode: 27394068    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/ yishuai)   Gid: ( 1000/ yishuai)
Access: 2022-03-08 14:14:26.718290068 -0500
Modify: 2022-03-08 14:14:26.718290068 -0500
Change: 2022-03-08 14:14:26.718290068 -0500
 Birth: -
  File: B/file.txt
  Size: 4           Blocks: 8          IO Block: 4096   regular file
Device: 805h/2053d  Inode: 27394068    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/ yishuai)   Gid: ( 1000/ yishuai)
Access: 2022-03-08 14:14:26.718290068 -0500
Modify: 2022-03-08 14:14:26.718290068 -0500
Change: 2022-03-08 14:14:26.718290068 -0500
 Birth: -
tleedjarv commented 2 years ago

So, here's what happens. Unison sees that the file size, inode and mtime are all unchanged, so it assumes the file is unchanged (with fastcheck=true). With fastcheck=false, it will scan file contents regardless.

The question is, why does the mtime not change? What do you see when you do:

stat B/file.txt
touch B/file.txt
stat B/file.txt
liyishuai commented 2 years ago

I've added -dumparchives flag and ran the script on a clean .unison directory. The folders are dumped here: Unison (macOS).zip .unison (Ubuntu).zip


stats before and after touch produce different output:

  File: B/file.txt
  Size: 4           Blocks: 8          IO Block: 4096   regular file
Device: 805h/2053d  Inode: 27394489    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/ yishuai)   Gid: ( 1000/ yishuai)
Access: 2022-03-08 14:27:26.864535470 -0500
Modify: 2022-03-08 14:27:26.864535470 -0500
Change: 2022-03-08 14:27:26.864535470 -0500
 Birth: -
  File: B/file.txt
  Size: 4           Blocks: 8          IO Block: 4096   regular file
Device: 805h/2053d  Inode: 27394489    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/ yishuai)   Gid: ( 1000/ yishuai)
Access: 2022-03-08 14:27:38.080566751 -0500
Modify: 2022-03-08 14:27:38.080566751 -0500
Change: 2022-03-08 14:27:38.080566751 -0500
 Birth: -

When I substitute touch B/file.txt with echo blah > B/file.txt, "Access" remains unchanged, while "Modify" and "Change" are updated.

tleedjarv commented 2 years ago

Thank you for the information. Unfortunately, dumparchives does not seem to include mtime information if times has not been set to true.

Given all the information so far, Unison is working properly.

Are you saying that "Modify" changes when you do echo blah > B/file.txt but it doesn't change when you do echo new > B/file.txt?

liyishuai commented 2 years ago

Are you saying that "Modify" changes when you do echo blah > B/file.txt but it doesn't change when you do echo new > B/file.txt?

No, echo anything > B/file.txt changes "Modify" field if the script only contains "stat, echo, stat". When running it between Unison calls, the "Access, Modify, Change" fields remain the same.

gdt commented 2 years ago

if you put sleep 2 before the echo to change the file, does that change anything? I am wondering if macOS is caching the time lookup. But it does point out that if the computer is faster than the time resolution, we can end up with different contents same time. It would have been very easy back on Sixth Edition with just seconds (IIRC; it's been a long time!).

liyishuai commented 2 years ago

Yes adding sleep 2 after calling Unison and before echo new > B/file.txt makes stat produce different Modify and Change time for B/file.txt.

gdt commented 2 years ago

I guess then:

So, if anything, this ticket could turn into:

Does that sound right to you?

liyishuai commented 2 years ago

Yes it makes sense to document this timing issue.

gdt commented 2 years ago

The manual already says that fastcheck=true will fail to perceive changes in files that have contents changes witthout mtimes changes, so I think it says what it needs to. I am not aware of any other reports like yours, and scripted testing seems likely to trigger the issue. I am reluctant to add text to the already-big manual to emphasize something that is inferrable.

Thanks for the repro recipe and for additional data. Probably you should either turn fastcheck off or add sleep calls for automated tests, and please feel free to report anything failing. Even if you didn't find a bug this time, I found it useful.

liyishuai commented 2 years ago

Revisiting this case, the logs drew my interest:

new file ----> file.txt
A : new file modified on 2022-03-06 at 21:12:28 size 4 rw-r--r-- B : absent deleted ----> file.txt
A : deleted B : unchanged file modified on 2022-03-06 at 21:12:28 size 4 rw-r--r-- Failed [file.txt]: Destination updated during synchronization The contents of file file.txt have been modified (if this happens repeatedly on a file that has not been changed, try running once with 'fastcheck' set to false)

Unison did not detect the modification when determining the updates to propagate, so how did it notice that the contents were modified when propagating the delete?

tleedjarv commented 2 years ago

See comments above about the fastcheck preference and explanation in the manual.

liyishuai commented 2 years ago

https://github.com/bcpierce00/unison/blob/3c314bfe7384babd0ffb99dc4b38191a13f9a6b5/doc/unison-manual.tex#L2305-L2310 Found that in the manual. Thanks!