winfsp / cgofuse

Cross-platform FUSE library for Go - Works on Windows, macOS, Linux, FreeBSD, NetBSD, OpenBSD
https://winfsp.dev
MIT License

Incorrect writes from fuse #70

Closed aloknerurkar closed 1 year ago

aloknerurkar commented 1 year ago

I created a FUSE file system using the cgofuse library. The repo is here. The FUSE implementation is very similar to the MemFs reference implementation provided in this project. The only difference is that instead of storing files in memory, they are pushed to some storage.

There are some tests in the package. I am currently running this on an M1 Mac mini. What I observe is that if we do smaller writes on files mounted on FUSE, we get duplicate write ops.

For example, in the test, writes are done in 1024-byte lengths, and I see the following ops:

write off 4702208 len 1024
write off 4702208 len 2048
write off 4702208 len 3096
....
....
write off 4702208 len 10240

Then, at some point, I get a write op that writes 0s to the first 64 KB of the file:

write off 0 len 65536 [0 0 0 0 0 0 0 0 0 0]

This doesn't always happen; the test fails about 2 out of 5 times. Also, things are better if I run the test on my MacBook Pro (M1 Pro) machine.

I have been debugging this in my code for a couple of weeks and I have added more tests around things that I had doubts about, but eventually, I have concluded that this is happening from the fuse end.

Is there any known issue around this?

billziss-gh commented 1 year ago

Is the problem that data is written twice over the same range or is the problem that erroneous data are being written?

If the former, this is legal. The OS does not provide any guarantees about the order that writes will arrive at your file system.

If you want to have better control over the writes, you must use open/O_DIRECT on Linux and fcntl/F_NOCACHE on OSX.

aloknerurkar commented 1 year ago

The former is happening, although this is not the problem. I just wanted to make sure this is legal.

The problem happens when I get the write for 0s. And it's almost always offset 0-64k. You should be able to reproduce it if you run the test with count 5/10.

billziss-gh commented 1 year ago

Yes, this is legal. You should not expect any particular order for writes when doing cached I/O.

aloknerurkar commented 1 year ago

> Yes, this is legal. You should not expect any particular order for writes when doing cached I/O.

So this is not a problem. The write of 0s to the first 64k bytes is the main problem. It's always the 0-64k offsets, as I mentioned. Any idea why this could happen?

billziss-gh commented 1 year ago

> The write for 0s for the first 64k bytes is the main problem.

Is your test/application writing something other than zeroes in the range 0-64K and the OS sends you zeroes instead?

Or is your test/application writing at offset 4702208 onward, thus creating a hole from 0-4702208 (which conceptually contains zeroes)?
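To illustrate the hole case described above, here is a minimal sketch of a positional write on an in-memory file image. The helper `writeAt` is hypothetical and is not part of cgofuse; it only shows that a first write past offset 0 leaves zeroes in front of it, and that writing the same range again simply overwrites it.

```go
package main

import "fmt"

// writeAt is a hypothetical helper, not part of cgofuse. It applies one
// positional write to an in-memory file image and returns the new image.
// Bytes between the old end of file and off are left as zeroes (a hole).
func writeAt(data []byte, buf []byte, off int64) []byte {
	end := off + int64(len(buf))
	if end > int64(len(data)) {
		grown := make([]byte, end) // newly added bytes are zero-initialized
		copy(grown, data)
		data = grown
	}
	copy(data[off:end], buf)
	return data
}

func main() {
	var file []byte
	// The first write lands past offset 0, so bytes 0..7 become a hole.
	file = writeAt(file, []byte{1, 2, 3, 4}, 8)
	// The same offset is written again with a longer buffer; it overwrites.
	file = writeAt(file, []byte{1, 2, 3, 4, 5, 6}, 8)
	fmt.Println(len(file), file[:8], file[8:])
}
```

Because each write carries its own offset, a handler written this way gives the same result when a range is delivered more than once with the same data.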

aloknerurkar commented 1 year ago

So the write for 0-64k comes much later. At this point the initial writes have already happened. So basically, after the writes at the 4702208 offset, it sends this. This write overwrites the first 64k bytes with zeroes. The application/test is not sending this.

aloknerurkar commented 1 year ago

I ran the test by logging all the write ops I get from fuse. Attaching the logs here. So when I was testing the other day I was seeing 64k, but now I am seeing bigger values.

If you check the logs, around L935 is when the problem surfaces. This happens after the writes for the initial offsets have been completed already. I see these write ops which zero out some parts of the file. These are definitely not coming from the tests as in the test it writes the file sequentially with random bytes.

The log line can be read as: write <file path> <offset> <file handle> <length of write buf> <first 10 bytes of buf>

log.txt

aloknerurkar commented 1 year ago

@billziss-gh So any more pointers on this? Were you able to check the logs?

billziss-gh commented 1 year ago

If you are getting writes that overwrite legitimate data in macOS my suggestion would be to follow up on the OSXFUSE repo. Cgofuse is a thin layer around different FUSE libraries and would not introduce writes of its own.

asabya commented 1 year ago

We are seeing something like this I suppose.

aloknerurkar commented 1 year ago

@asabya This is exactly what is happening. It seems that in between the writes, we are getting a Getattr call which doesn't have a valid file handle populated even though the file is open.

@billziss-gh Any idea why the filehandle is invalid in the Getattr call?

I think we can close this issue.

billziss-gh commented 1 year ago

> Any idea why the filehandle is invalid in the Getattr call?

This is by design. Getattr may be called with or without a file handle. You can identify the no file handle case by checking whether fh == ^uint64(0).
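A minimal sketch of that check follows. It assumes a file system that keeps one table of nodes keyed by path and another keyed by open handle. The names `memFS`, `node`, `lookup`, and `noHandle` are hypothetical stand-ins so the example runs without cgofuse; in a real implementation the same branch would sit at the top of the Getattr method.

```go
package main

import "fmt"

// noHandle mirrors the sentinel mentioned above: all bits set.
const noHandle = ^uint64(0)

// node and memFS are hypothetical stand-ins for a real file system's state.
type node struct{ size int64 }

type memFS struct {
	byPath   map[string]*node // every known file, keyed by path
	byHandle map[uint64]*node // only files that are currently open
}

// lookup picks the node for a Getattr-style call. When fh is the sentinel
// it resolves by path; otherwise it uses the open handle.
func (fs *memFS) lookup(path string, fh uint64) (*node, bool) {
	if fh == noHandle {
		n, ok := fs.byPath[path]
		return n, ok
	}
	n, ok := fs.byHandle[fh]
	return n, ok
}

func main() {
	n := &node{size: 4096}
	fs := &memFS{
		byPath:   map[string]*node{"/a.bin": n},
		byHandle: map[uint64]*node{3: n},
	}
	if got, ok := fs.lookup("/a.bin", noHandle); ok {
		fmt.Println("by path:", got.size)
	}
	if got, ok := fs.lookup("/a.bin", 3); ok {
		fmt.Println("by handle:", got.size)
	}
}
```

Having both tables point at the same node means the attributes reported for an open file do not depend on whether the call arrived with a handle.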