apache / couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
https://couchdb.apache.org/
Apache License 2.0

[windows] rename(2) is not atomic #4459

Open janl opened 1 year ago

janl commented 1 year ago

This is a proactive bug report and not an issue right now.

There have been discussions about, and experiments with, moving our Windows builds to WSL2. If that happens, we need to be aware that rename(2) is not atomic. We rely on it being atomic for at least compaction and deletion.

nickva commented 1 year ago

We go through Erlang's wrapper, file:rename/2 (https://www.erlang.org/doc/man/file.html#rename-2), and we rely on POSIX rename atomicity guarantees during compaction file swapping. However, the Erlang API documentation doesn't say that it provides any atomicity. We know that on unix-y systems it dispatches to the rename(2) system call, as evidenced here. And then, of course, it also matters that the file system itself has POSIX semantics.
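For readers who haven't seen the pattern, here is a minimal sketch of a "write a new file, then rename it over the live one" swap. It is written in Python purely for illustration and is not CouchDB's compaction code; the function name `swap_in_compacted_file` and the `.compact` suffix are assumptions made up for this example.

```python
import os
import tempfile


def swap_in_compacted_file(live_path: str, new_contents: bytes) -> None:
    """Write new_contents to a temp file, then rename it over live_path.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(live_path))
    # Keep the temp file in the same directory so the rename never has to
    # cross a file system boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".compact")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_contents)
            tmp.flush()
            os.fsync(tmp.fileno())  # flush data to disk before the swap
        # On POSIX systems os.replace() ends up in rename(2), so a reader
        # sees either the old file or the new one at live_path.
        os.replace(tmp_path, live_path)
    except BaseException:
        # Don't leave a stray temp file behind if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The whole scheme only holds if the final step is a single atomic operation on a file system with POSIX semantics, which is exactly the property in question here.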

For Windows, it's not clear whether atomicity holds, based on this comment in win_prim_file.c:

           /* This is pretty iffy; the public documentation says that the
             * operation may EACCES on some systems when either file is open,
             * which gives us room to use MOVEFILE_REPLACE_EXISTING and be done
             * with it, but the old implementation simulated Unix semantics and
             * there's a lot of code that relies on that.
             *
             * The simulation renames the destination to a scratch name to get
             * around the fact that it's impossible to open (and by extension
             * rename) a file that's been deleted while open. It has a few
             * drawbacks though;
             *
             * 1) It's not atomic as there's a small window where there's no
             *    file at all on the destination path.
             * 2) It will confuse applications that subscribe to folder
             *    changes.
             * 3) It will fail if we lack general permission to write in the
             *    same folder. */

So, at least as far as Windows usage is concerned, we probably should not recommend or encourage it for production, just like we don't for network-mounted file systems. It's great for testing and trying things out though, of course.
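To make drawback 1 from the quoted comment concrete, here is a rough Python sketch of a rename that first moves the destination aside to a scratch name. It only mirrors the sequence the comment describes and is not a port of win_prim_file.c; `simulated_rename`, the `.scratch` suffix and the `between_steps` hook are all made up for this illustration.

```python
import os
import tempfile


def simulated_rename(src, dst, between_steps=None):
    """Replace dst with src in separate steps (illustration only)."""
    scratch = dst + ".scratch"  # hypothetical scratch name
    had_dst = os.path.exists(dst)
    if had_dst:
        os.rename(dst, scratch)  # step 1: move the destination aside
    if between_steps is not None:
        between_steps()          # the window: nothing exists at dst
    os.rename(src, dst)          # step 2: move the source into place
    if had_dst:
        os.unlink(scratch)       # step 3: drop the old contents


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    old = os.path.join(workdir, "db.couch")
    new = os.path.join(workdir, "db.couch.compact")
    for path, data in ((old, b"old"), (new, b"new")):
        with open(path, "wb") as f:
            f.write(data)
    observed = []
    simulated_rename(new, old, lambda: observed.append(os.path.exists(old)))
    print("destination existed during the window:", observed[0])
```

Anything that opens the destination path inside that window gets ENOENT, and a crash at that point leaves no file at the destination at all.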

nickva commented 1 year ago

Another interesting thing with regard to WSL2 is that Windows files are effectively accessed as a remote file system via the 9P protocol. That might explain the toot.cat user's ENOENT error, as they described it a bit further down. It's one of the signs of users trying to use NFS (a remote FS) as a POSIX file system, as seen here.

In a weird twist of fate, it may actually be safer to use WSL2 with Linux-only files (inside the VM) than to use Windows, or WSL2 with Windows paths mounted into the VM.
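If someone wants to check which side of that line a data directory is on, one possible approach is to look up the file system type of its mount point. The sketch below is only an illustration under assumptions: the function names and the set of "remote" types are invented for this example, and it parses text in the `/proc/self/mounts` format without handling escaped characters in mount points.

```python
# File system types treated as remote in this sketch; the list is an
# assumption and may need adjusting for a given setup.
REMOTE_FS_TYPES = {"9p", "nfs", "nfs4", "cifs"}


def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point containing path.

    path should be absolute; mounts_text uses the /proc/self/mounts
    layout: device, mount point, fs type, options, ...
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # ">=" lets a later mount on the same mount point win.
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_remote(path, mounts_text):
    """True if path sits on one of the fs types listed above."""
    return fs_type_for(path, mounts_text) in REMOTE_FS_TYPES
```

On a Linux or WSL2 system, `mounts_text` would be the contents of `/proc/self/mounts` and `path` the resolved data directory. A result of `9p` would suggest the directory lives on a mounted Windows path rather than inside the VM.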