bcpierce00 / unison

Unison file synchronizer
GNU General Public License v3.0
4.01k stars 227 forks source link

Add workaround for macOS 12 not allowing rename(2) on 0500 mode directory (which is legal but strange and unhelpful) #662

Open liyishuai opened 2 years ago

liyishuai commented 2 years ago

Scenario

  1. Replicas A and B are initially empty directories.
  2. Create subdirectory bar in replica A.
  3. Run unison A B -batch -confirmbigdel=false
  4. Expect subdirectory bar under replica B, but observed .unison.bar.e3b0e1717485864e9d8041ffd5be620a.unison.tmp.

    new dir ----> bar
    A : new dir modified on 2022-03-06 at 0:36:34 size 0 r-x---r-x B : absent Failed [bar]: Error in renaming /private/var/folders/d5/f_vqb7sj4cn26rl4cyfxd6f00000gn/T/unison/1646544993.55/B/.unison.bar.d866df244bf6ec3d09ec15d394774903.unison.tmp to /private/var/folders/d5/f_vqb7sj4cn26rl4cyfxd6f00000gn/T/unison/1646544993.55/B/bar: Permission denied [rename(/private/var/folders/d5/f_vqb7sj4cn26rl4cyfxd6f00000gn/T/unison/1646544993.55/B/.unison.bar.d866df244bf6ec3d09ec15d394774903.unison.tmp)]

Environment

How to replicate

I revealed this behavior with a randomized tester, which launches Unison very frequently.

  1. Clone https://github.com/liyishuai/file-sync.git
  2. Add OPAM repository for Coq:
    opam repo add coq-released https://coq.inria.fr/opam/released
    opam repo add coq-extra-dev https://coq.inria.fr/opam/extra-dev
  3. Under project directory, run opam install . This installs an executable FileTest.native.
  4. Run FileTest.native. You'll see plenty logs, but only need to focus on the final few lines, which is a minimal counterexample that looks like:

    ERR: Upon (2 (ls ())), expect ("bar"), but observed (".unison.bar.e3b0e1717485864e9d8041ffd5be620a.unison.tmp")
    DIR: /var/folders/d5/f_vqb7sj4cn26rl4cyfxd6f00000gn/T/unison/1646545178.87
    <<<<< rejecting trace >>>>>
    ((51 ((target 1) (method mkdir) (path (bar)))) (51 true) (81 sync) (81 true) (83 ((target 2) (method ls) (path ()))) (83 (.unison.bar.e3b0e1717485864e9d8041ffd5be620a.unison.tmp)))
    <<<<< shrink exhausted >>>>
    Failure

    DIR: shows the temporary directory where the tests were run. If you run ls -ax * under that directory, you'll see:

    error.txt logs.txt
    
    A:
    bar
    
    B:
    .unison.bar.e3b0e1717485864e9d8041ffd5be620a.unison.tmp

    Here logs.txt and error.txt recorded the stdout and stderr of Unison.

Feel free to reach me if you need further hints on replicating this issue.

tleedjarv commented 2 years ago

Since the error is "permission denied", have you verified that it really is not a permissions issue on the parent dir? Could you post the output of something like ls -ld on the parent dirs?

bcpierce00 commented 2 years ago

Very interesting…. So it sounds like there are actually two issues here, if I’ve understood the bug report correctly:

1) something causes the rename to fail, and then 2) unison nevertheless reports successful termination.

Is that right?

   - Benjamin

On Mar 6, 2022, at 2:15 AM, Tõivo Leedjärv @.***> wrote:

 Since the error is "permission denied", have you verified that it really is not a permissions issue on the parent dir? Could you post the output of something like ls -ld on the parent dirs?

— Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. You are receiving this because you are subscribed to this thread.

gdt commented 2 years ago

The key question is if the syscalls unison makes are correct. Basically, is the rename call wrong and it's a unison bug, vs is it correct and its a mac bug.

I'm really not clear on exit status, but @liyishuai are you saying that when this happens, is unison giving status that reports success?

When you run this, how often does it fail? Is this one run of unison at a time, with the next not started until the first one has exited?

liyishuai commented 2 years ago

Here are my findings:

bcpierce00 commented 2 years ago

So if unison is returning a nonzero error code, isn’t this a correct outcome? Except for the temp directory, nothing changed on either side, if I’ve understood correctly.

   - Benjamin

On Mar 6, 2022, at 3:09 PM, Yishuai Li @.***> wrote:

 Here are my findings:

The parent directory /private/var/folders/d5/f_vqb7sj4cn26rl4cyfxd6f00000gn/T/unison/1646544993.55/ has permission 755. This failure occurs only during automated testing, where unison is called very frequently. The automated tester calls unison with OCaml's Sys.command function. The tester waits for the Sys.command to return before calling it again. Upon failure, unison returns 2. — Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. You are receiving this because you commented.

liyishuai commented 2 years ago

I've managed to reproduce this issue by hand:

mkdir -m 700 A B
mkdir -m 500 A/foo
unison A B -batch -confirmbigdel=false
Unison 2.51.5 (ocaml 4.12.0): Contacting server...
Looking for changes
Reconciling changes
new dir  ---->            foo  
A            : new dir            modified on 2022-03-06 at 17:34:59  size 0         r-x------
B            : absent
Propagating updates
Unison 2.51.5 (ocaml 4.12.0) started propagating changes at 17:35:06.92 on 06 Mar 2022
[BGN] Copying foo from /private/tmp/A to /private/tmp/B
Failed: Error in renaming /private/tmp/B/.unison.foo.66d31d8e46b48749b7a9401bb0979145.unison.tmp to /private/tmp/B/foo:
Permission denied [rename(/private/tmp/B/.unison.foo.66d31d8e46b48749b7a9401bb0979145.unison.tmp)]
Failed [foo]: Error in renaming /private/tmp/B/.unison.foo.66d31d8e46b48749b7a9401bb0979145.unison.tmp to /private/tmp/B/foo:
Permission denied [rename(/private/tmp/B/.unison.foo.66d31d8e46b48749b7a9401bb0979145.unison.tmp)]
Unison 2.51.5 (ocaml 4.12.0) finished propagating changes at 17:35:06.92 on 06 Mar 2022, 0.002 s
Saving synchronizer state
Synchronization incomplete at 17:35:06  (0 items transferred, 0 skipped, 1 failed)
  failed: foo

However, cp -R A/foo B works fine.

gdt commented 2 years ago

The repro recipe above passes on NetBSD 9 amd64 with unison 2.51.90 and ocaml 11.

I wonder if on the mac one does

mkdir -m 700 foo
mv foo bar

what happens. Probably the real question is the rename syscall. I wonder if you can run unison under dtrace to see the system calls precisely, and or with debugging.

liyishuai commented 2 years ago
mkdir -m 700 foo
mv foo bar

Renames foo to bar with permission 700.

liyishuai commented 2 years ago
mkdir -m 700 A B
mkdir -m 500 A/foo
unison A B -batch -confirmbigdel=false

This script worked fine on:

Might be a macOS-specific issue.

gdt commented 2 years ago

Thanks. I dropped the feedback label. However, I think we still need someone to run dtrace and/or isolate the specific system call and the previous history so we can understand if this is a failure of macOS to follow POSIX or a failure of unison to only rely on what POSIX specifies, and then we will need someone to provide a patch/PR.

It is interesting that there have been no other reports of this, but perhaps it is new in macOS 12.

I would be interested to hear from others if the repro fails for anyone on any non-macOS POSIX-ish system, and to hear reports either way about macOS.

tleedjarv commented 2 years ago

I've managed to reproduce this issue by hand:

mkdir -m 700 A B
mkdir -m 500 A/foo
unison A B -batch -confirmbigdel=false

I had overlooked the fact that bar was a directory. Then it is not a permissions issue with the parent directory but the renamed directory itself.

If you set the permissions of A/foo to 500 then it can't be renamed.

tleedjarv commented 2 years ago

if this is a failure of macOS to follow POSIX

I don't think that's the case. Check https://pubs.opengroup.org/onlinepubs/9699919799/functions/rename.html

Really, it all checks out. POSIX specifies (emphasis mine) "write access permission may be required for the directory named by old". By my reading, it is correct for this scenario to fail and it is correct for this scenario to pass.

gdt commented 2 years ago

Interesting; thanks for finding that. In general POSIX has a bunch of odd-case may statements because it is the codification of much existing behavior. It strikes me as odd to require write permission for the dir being renamed, and out of line with much other unix. I was able to reproduce on a macOS 12 machine with just "mv" on a dir with 0500.

What we don't really know is if this problems is 100% about unison just calling rename(3) (which is rename(2), but that so far seems likely. We also don't know if it's just macOS 12, or also earlier, and if macOS documents this. (The mac I have access too unhelpfully lacks syscall man pages.)

The next question is what the right workaround is, and if there is prior art for other programs with the same problem.

I've retitled assuming that this ticket is now about creating a workaround.

tleedjarv commented 2 years ago

If someone could test with FreeBSD, that would be interesting. FreeBSD man page states (again, emphasis mine) "[EACCES] The directory pointed at by the from argument denies write permission, and the operation would move it to another parent directory." So, renaming within the same parent should not require write permissions on the renamed dir (my interpretation).

I am leaning towards just closing this issue.

How common is it to have a read-only dir and can it be synced both ways anyway? Isn't this setup not-fit-for-syncing to begin with? You could theoretically do one-way mirroring from a read-only source, that is conceivable... Or am I overlooking something?

gdt commented 2 years ago

I am inclined to leave it open, because syncing unwritable files and directories seems like a sensible thing to do.

tleedjarv commented 2 years ago

Seems like this issue was well known years ago. Here's an extract from the TODO list: https://github.com/bcpierce00/unison/blob/243f18336a72c54738412b59345972e8c5191c9c/src/TODO.txt#L57-L64

And related requests https://github.com/bcpierce00/unison/blob/243f18336a72c54738412b59345972e8c5191c9c/src/TODO.txt#L66-L73