yadayada / acd_cli

An unmaintained command line interface and FUSE filesystem for Amazon (Cloud) Drive

Race condition when reading file after writing it in multi-threaded FUSE mount #427

Open jlippuner opened 8 years ago

jlippuner commented 8 years ago

When FUSE-mounting the cloud drive in multi-threaded mode, there is a race condition when reading a file immediately after writing it.

Steps to reproduce:

1) Mount Amazon Cloud Drive with acd_cli mount (default is multi-threaded mode) at /mnt/acd
2) cd /mnt/acd
3) echo "this is a test" > file ; cat file

This returns nothing: when cat reads the newly created file, a new thread is started to read it, but the file is still empty because the upload thread has not yet completed.

If this procedure is repeated, but -st is passed to the mount command, then it works as expected and the command echo "this is a test" > file ; cat file returns "this is a test".

I ran into this problem using s3ql on top of an acd_cli FUSE mount. s3ql often writes a file and then opens it for reading almost immediately, but it gets an empty file and panics. If I mount acd_cli in single-threaded mode, everything works as expected, but I only ever get one upload (or download) thread, which is much slower than using multiple threads.

So, is there a way to fix the race condition by generally using multiple threads in the acd_cli FUSE mount, but never using more than one thread for the same file? Or perhaps another solution would be to introduce a lock on an open file once data is being written to it, and while the lock exists, any read requests have to wait until the lock is released. Once the write is finished (signaled by closing the file-handle that was used to write), the lock is released. This way it could also be avoided that two different threads write to the same file at the same time (not sure whether that is a potential issue or whether a mechanism to guard against this exists already).
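To make the locking idea above concrete, here is a minimal sketch in Python. It is not acd_cli code: the class WriteGate and its methods begin_write, end_write and wait_readable are hypothetical names, and the sketch assumes the FUSE layer would call begin_write when a file is opened for writing, end_write when that handle is closed, and wait_readable before serving a read.

```python
import threading


class WriteGate:
    """Hypothetical helper: blocks readers of a path while it is being written."""

    def __init__(self):
        # One condition variable guards the set of paths currently being written.
        self._cond = threading.Condition()
        self._writing = set()

    def begin_write(self, path):
        """Mark path as being written; waits if another writer already holds it."""
        with self._cond:
            self._cond.wait_for(lambda: path not in self._writing)
            self._writing.add(path)

    def end_write(self, path):
        """Called when the writing handle is closed; wakes up all waiters."""
        with self._cond:
            self._writing.discard(path)
            self._cond.notify_all()

    def wait_readable(self, path, timeout=None):
        """Block until no writer holds path. Returns False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: path not in self._writing, timeout)
```

Because the gate tracks paths in a set rather than using a thread-owned lock, end_write may be called from a different thread than begin_write, which matters if open and close are handled by different FUSE worker threads. Reads and writes on other paths are not blocked, so uploads of different files could still run in parallel.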

bgemmill commented 8 years ago

@jlippuner have a look at my branch for PR #374. I ran into similar issues with ecryptfs and implemented a local file cache that sticks around as long as there are open file handles, effectively reference counting that file.

The caveat here is in your example:

echo "this is a test" > file ; cat file

The reference count will go to zero at the semicolon, since the first file operation has finished and the second hasn't started yet.

This example may work with my branch, but you can imagine the race condition: echo "this is a test" > file & ; cat file

That said, if s3ql keeps a file open and reads/writes to it occasionally and closes it some time later, my PR is probably what you're looking for.
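For readers who have not looked at the branch, the reference-counting idea can be sketched roughly as follows. This is not the code from the PR: the class RefCountedCache and its methods are hypothetical, and the sketch assumes the cached content is simply held in memory and handed back to the caller for upload once the last handle is released.

```python
import threading


class RefCountedCache:
    """Hypothetical sketch: keeps file content locally while any handle is open."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # path -> [bytearray buffer, open handle count]

    def open(self, path):
        """Register one more open handle for path, creating the entry if needed."""
        with self._lock:
            entry = self._entries.setdefault(path, [bytearray(), 0])
            entry[1] += 1

    def write(self, path, data, offset):
        """Write data into the local buffer, zero-padding any gap before offset."""
        with self._lock:
            buf = self._entries[path][0]
            if len(buf) < offset:
                buf.extend(b"\0" * (offset - len(buf)))
            buf[offset:offset + len(data)] = data
            return len(data)

    def read(self, path, length, offset):
        """Serve reads from the local buffer, so they see unflushed writes."""
        with self._lock:
            return bytes(self._entries[path][0][offset:offset + length])

    def is_cached(self, path):
        with self._lock:
            return path in self._entries

    def release(self, path):
        """Drop one handle. When the count hits zero, evict the entry and
        return its content (the caller would upload it); otherwise return None."""
        with self._lock:
            entry = self._entries[path]
            entry[1] -= 1
            if entry[1] == 0:
                del self._entries[path]
                return bytes(entry[0])
            return None
```

As long as at least one handle stays open, a read on any handle sees what was written through another. Once the last handle is released the entry is gone, which is exactly the window described above for two separate commands run back to back.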

jlippuner commented 8 years ago

Awesome! So far s3ql is working very nicely with your PR. I'll keep trying it for a while and will let you know if I run into any issues.

mbbeaubi commented 8 years ago

@jlippuner,

It sounds like you are doing something similar to what I am considering: an acd_cli FUSE mount with an s3ql local filesystem on top.

How's it working so far? Any tips or errors you are hitting?

Do you store s3ql data + metadata on acd or just the data?

I'm considering downloading torrents directly into the s3ql mount, but I'm not sure it would work. Have you tried this?