Open pbeza opened 3 years ago
I've compared:
python3 ./xmp.py -d -o root=./local/ ./mnt
strace -o log-sh-xmp.log -f -t sh -c "echo 'onetwo' > ./mnt/test-sh.txt && cat ./mnt/test-sh.txt"
with:
python3 ./xmp.py -d -o root=./local/ ./mnt
strace -o log-fish-xmp.log -f -t fish -c "echo 'onetwo' > ./mnt/test-fish.txt && cat ./mnt/test-fish.txt"
It seems to me that sh
does writing and reading sequentially as opposed to fish
which doesn't wait 5 seconds for close()
completion before starting read()
in parallel (corresponding to cat
command). Is it a bug in fish
...? Or am I misinterpreting it?
Notice that in:
log-sh-xmp.log
you can find 2 processes running – the main one, used for echo
and the other one created with vfork()
+ execve("/bin/cat", ...)
for cat
. We have two processes in here but effectively they run their logic sequentially, one after the other:
435708
does write()
corresponding to echo
command (it uses dup2()
'trick' as described in here):
write(1, "onetwo\n", 7) = 7
435735
does read()
corresponding to cat
command:
read(3, "onetwo\n", 131072) = 7
log-fish-xmp.log
you can find 3 processes running – the main one and two others created with clone()
syscall):
434564
is the main process – it communicates with the other two with pipe()
; it opens ./mnt/test-fish.txt
for writing but it actually doesn't write anything (434565
does):
openat(AT_FDCWD, "/home/patryk/wildland/xmp/mnt/test-fish.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 6
434565
calls write()
:
write(6, "onetwo\n", 7) = 7
It also calls close()
which finishes just after 434566
completes which is wrong!!!:
close(6 <unfinished ...>
434566
opens ./mnt/test-fish.txt
:
openat(AT_FDCWD, "./mnt/test-fish.txt", O_RDONLY) = 3
and calls read()
corresponding to cat
command (read()
returns nothing probably because 434565
has not completed closing file descriptor no. 6)
:
read(3, "", 131072) = 0
Respective FUSE log for sh
scenario:
unique: 226, opcode: LOOKUP (1), nodeid: 1, insize: 52
LOOKUP /test-sh.txt
getattr /test-sh.txt
unique: 226, error: -2 (No such file or directory), outsize: 16
unique: 228, opcode: CREATE (35), nodeid: 1, insize: 68
create flags: 0x8241 /test-sh.txt 0100644 umask=0022
create[140297570492080] flags: 0x8241 /test-sh.txt
getattr /test-sh.txt
NODEID: 2
unique: 228, success, outsize: 160
unique: 230, opcode: LOOKUP (1), nodeid: 1, insize: 52
LOOKUP /test-sh.txt
getattr /test-sh.txt
NODEID: 2
unique: 230, success, outsize: 144
unique: 232, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-sh.txt
unique: 232, success, outsize: 120
unique: 234, opcode: LOOKUP (1), nodeid: 1, insize: 52
LOOKUP /test-sh.txt
getattr /test-sh.txt
NODEID: 2
unique: 234, success, outsize: 144
unique: 236, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-sh.txt
unique: 236, success, outsize: 120
unique: 238, opcode: FLUSH (25), nodeid: 2, insize: 64
flush[140297570492080]
unique: 238, success, outsize: 16
unique: 240, opcode: GETXATTR (22), nodeid: 2, insize: 68
unique: 240, error: -38 (Function not implemented), outsize: 16
unique: 242, opcode: WRITE (16), nodeid: 2, insize: 87
write[140297570492080] 7 bytes to 0 flags: 0x8001
write[140297570492080] 7 bytes to 0
unique: 242, success, outsize: 24
unique: 244, opcode: FLUSH (25), nodeid: 2, insize: 64
flush[140297570492080]
unique: 244, success, outsize: 16
unique: 246, opcode: RELEASE (18), nodeid: 2, insize: 64
release[140297570492080] flags: 0x8001
unique: 246, success, outsize: 16
unique: 248, opcode: LOOKUP (1), nodeid: 1, insize: 52
LOOKUP /test-sh.txt
getattr /test-sh.txt
NODEID: 2
unique: 248, success, outsize: 144
unique: 250, opcode: OPEN (14), nodeid: 2, insize: 48
open flags: 0x8000 /test-sh.txt
open[140297570492176] flags: 0x8000 /test-sh.txt
unique: 250, success, outsize: 32
unique: 252, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-sh.txt
unique: 252, success, outsize: 120
unique: 254, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-sh.txt
unique: 254, success, outsize: 120
unique: 256, opcode: READ (15), nodeid: 2, insize: 80
read[140297570492176] 4096 bytes from 0 flags: 0x8000
read[140297570492176] 7 bytes from 0
unique: 256, success, outsize: 23
unique: 258, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-sh.txt
unique: 258, success, outsize: 120
unique: 260, opcode: FLUSH (25), nodeid: 2, insize: 64
flush[140297570492176]
unique: 260, success, outsize: 16
unique: 262, opcode: RELEASE (18), nodeid: 2, insize: 64
release[140297570492176] flags: 0x8000
unique: 262, success, outsize: 16
and for fish
:
unique: 226, opcode: LOOKUP (1), nodeid: 1, insize: 54
LOOKUP /test-fish.txt
getattr /test-fish.txt
unique: 226, error: -2 (No such file or directory), outsize: 16
unique: 228, opcode: CREATE (35), nodeid: 1, insize: 70
create flags: 0x8241 /test-fish.txt 0100644 umask=0022
create[139940152819328] flags: 0x8241 /test-fish.txt
getattr /test-fish.txt
NODEID: 2
unique: 228, success, outsize: 160
unique: 230, opcode: LOOKUP (1), nodeid: 1, insize: 54
LOOKUP /test-fish.txt
getattr /test-fish.txt
NODEID: 2
unique: 230, success, outsize: 144
unique: 232, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-fish.txt
unique: 232, success, outsize: 120
unique: 234, opcode: LOOKUP (1), nodeid: 1, insize: 54
LOOKUP /test-fish.txt
getattr /test-fish.txt
NODEID: 2
unique: 234, success, outsize: 144
unique: 236, opcode: GETXATTR (22), nodeid: 2, insize: 68
unique: 236, error: -38 (Function not implemented), outsize: 16
unique: 238, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-fish.txt
unique: 240, opcode: WRITE (16), nodeid: 2, insize: 87
write[139940152819328] 7 bytes to 0 flags: 0x8001
unique: 238, success, outsize: 120
write[139940152819328] 7 bytes to 0
unique: 240, success, outsize: 24
unique: 242, opcode: FLUSH (25), nodeid: 2, insize: 64
flush[139940152819328]
unique: 244, opcode: LOOKUP (1), nodeid: 1, insize: 54
LOOKUP /test-fish.txt
getattr /test-fish.txt
NODEID: 2
unique: 244, success, outsize: 144
unique: 246, opcode: OPEN (14), nodeid: 2, insize: 48
open flags: 0x8000 /test-fish.txt
open[139940152819472] flags: 0x8000 /test-fish.txt
unique: 246, success, outsize: 32
unique: 248, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-fish.txt
unique: 248, success, outsize: 120
unique: 250, opcode: GETATTR (3), nodeid: 2, insize: 56
getattr /test-fish.txt
unique: 250, success, outsize: 120
unique: 252, opcode: READ (15), nodeid: 2, insize: 80
read[139940152819472] 4096 bytes from 0 flags: 0x8000
read[139940152819472] 0 bytes from 0
unique: 252, success, outsize: 16
unique: 254, opcode: FLUSH (25), nodeid: 2, insize: 64
flush[139940152819472]
unique: 242, success, outsize: 16
unique: 256, opcode: RELEASE (18), nodeid: 2, insize: 64
release[139940152819328] flags: 0x8001
unique: 256, success, outsize: 16
unique: 254, success, outsize: 16
unique: 258, opcode: RELEASE (18), nodeid: 2, insize: 64
release[139940152819472] flags: 0x8000
unique: 258, success, outsize: 16
Quick update: this seems to work fine with Python 3.7, but not with 3.9... Investigating.
Edit: After more tests it seems that the Python version is irrelevant, but fish
version matters: fish 3.1.2 is broken, while 3.2.2 seems to work fine.
Alright, this behavior is the consequence of multithreaded handling of commands. The PYLOCK()
macro is NOT preventing multiple threads to execute fuse commands in parallel. In this case: two flush
commands are being executed on the same file due to fish
design. So far I've tested that adding for example pthread_mutex_lock()
in PYLOCK
and corresponding pthread_mutex_unlock()
in PYUNLOCK
produces correct behavior on the problematic fish 3.1.2. But then, what's the point of multithreading if we basically enforce executing one command at a time? :)
Edit: of course all this is mostly relevant for operations on the same file object. But we can have multiple handles to the same file, or different paths etc, so it's not trivial to only lock in such case. Anyway, this can be solved in the user handler class so IMO it's not a python-fuse issue. The user fuse class needs to ensure that parallel operations on the same file are serialized.
Anyway, this can be solved in the user handler class so IMO it's not a python-fuse issue. The user fuse class needs to ensure that parallel operations on the same file are serialized.
Is it possible to serialize parallel operations on the same file using FUSE's high level API (which python-fuse
is based on)? High level API operates exclusively on paths and hides away inodes from developers.
Any ideas on how to fix a given code sample without disabling multithreading completely?
Inodes vs paths should not be an issue here. Each open file gets unique file handle (object) in both APIs. IMO the main question is whether it needs to be serialized in context of a single file handle (the same open FD from the user PoV), or across different handles for the same file (path/inode). The former is easier and will have very minimal performance impact, the latter is harder and might have bigger performance impact. Based on the above snipped doesn't have fgetattr
implemented at all, I'm afraid it's the latter case.
Here I updated xmp.py with broader locks (the latter case): https://gist.github.com/marmarek/a56ae864b61cba4df32792c99644dda5 - I've added initial version first, so you can easily see the diff. This is obviously PoC only, it leaks locks instances left and right. I have not tested it.
@marmarek 's approach works (the gist above fails because of a typo in line 71). Fuse also doesn't seem to like the partial constructor (write: Function not implemented
). Quick test when I moved iolocks
to global scope shows that works.
Just to make sure you haven't missed it – in this particular example it is sufficient to replace:
def read(self, length, offset):
with self.iolock:
self.file.seek(offset)
return self.file.read(length)
def write(self, buf, offset):
with self.iolock:
self.file.seek(offset)
self.file.write(buf)
return len(buf)
with:
def read(self, length, offset):
return os.pread(self.fd, length, offset)
def write(self, buf, offset):
return os.pwrite(self.fd, buf, offset)
to make it work (as mentioned in my original post). No additional locking was needed besides the above code modification. But still the question persists: what is the minimal locking needed to make it work in every possible parallel/multithreaded test scenario (i.e. not just for fish
use case)?
For the record, I bisected fish
and this is the commit that fixes the issue: https://github.com/fish-shell/fish-shell/commit/4b9a096cf2b754f70cb0b020dd353998d88e6bd2
I've prepared a toy example based on
xmp.py
which proofs that there is a bug somewhere:Notice that there is
time.sleep(5)
added inflush()
method which simulates network lag (if you remove this line, the problem is gone).Lets mount our toy file system:
Test with
bash
Now open
bash
terminal and run:Test with
fish
Open
fish
terminal and run:What the heck? Different results for different terminals! I've also checked
zsh
andcsh
– they behave same asbash
. It seems that onlyfish
is affected.Log for
bash
testLog for
fish
testNotice that
FLUSH
(triggered byecho
):returns after
READ
(triggered bycat
) returns:which is too late because
test-fish.txt
was not yet saved/flushed when we started reading it! Also notice lateRELEASE
for bothecho
andcat
(in opposite tobash
log).Caveat: if you run add
sleep 5
command betweenecho
andcat
, it works even withfish
terminal:Do you have any idea where may be the bug?
Update
-s
flag to mounting options (which disables multi-threaded operation), it works just fine for bothfish
andbash
.fusexmp.c
), using originallibfuse
(ver. 2.9.9) and it works just fine as opposed to the above Python version.If I replace:
with:
it works for both
fish
andbash
(maybe it has something to do with #18 @drougge?).it reports one issue which seems to be a not false positive :