sahib / brig

File synchronization on top of ipfs with git like interface & web based UI
https://brig.readthedocs.io
GNU Affero General Public License v3.0

FUSE v0.0.0-20200524192727 breaks test #73

Closed evgmik closed 3 years ago

evgmik commented 3 years ago

Side note: It seems that [upgrading FUSE breaks it](https://github.com/sahib/brig/pull/69/commits/d9514f4823c55967eb0b60085fed3715dd623744). I didn't really check what the cause was, but I think we should update since we're still using a version from 2018. Probably best to update incrementally and see where (and actually what) breaks.

@evgmik: Do you want to have a look at that?

Sure, I will investigate what changed in FUSE.

Originally posted by @sahib in https://github.com/sahib/brig/issues/69#issuecomment-752147102

evgmik commented 3 years ago

Some info: the fuse tests occasionally fail even with our current battle-tested fuse version, v0.0.0-20180421153158-65cc252bf669. When I run go test in the brig/fuse dir, sometimes I get PASS and sometimes I get FAIL:

DEBU[0001] adding /hello_64 (Qmb3FtRNALeTb5hN5wt6PyBhjC9PzM5yYiAcBERBzMV7aj)
DEBU[0001] fuse-write: /hello_64 (off: 0 size: 64)
DEBU[0001] fuse-flush: /hello_64
INFO[0001] File exists; modifying.                       file=/hello_64
DEBU[0001] adding /hello_64 (QmbM2fF6AZbazp86zQtopd5n9qH9BM6zWrgus8szcmoaDX)
DEBU[0001] fuse-release: /hello_64
DEBU[0001] fuse-flush: /hello_64
DEBU[0001] fuse-release: /hello_64
--- FAIL: TestWrite (0.18s)
    fuse_test.go:104: Data differs over fuse: got 0, should be 64 bytes
    testutil.go:76: removing temp directory failed: unlinkat /tmp/brig-fuse-mountdir: device or resource busy
2020/12/29 19:58:30 Replaying from value pointer: {Fid:0 Len:0 Offset:0}
2020/12/29 19:58:30 Iterating file id: 0
2020/12/29 19:58:30 Iteration took: 24.457µs

This particular FAIL seems to be related to #70. It looks like if we request a newly created hash from the IPFS backend too quickly, the backend has not had time to produce a proper data stream, so the read returns 0 bytes instead of 64.

This also makes this bug hard to trigger.

evgmik commented 3 years ago

Back to the original issue. I bisected the bazil/fuse versions. The first version of bazil/fuse that miserably fails the tests is v0.0.0-20200419173433-3ba628eaf417. It fails so badly that the load on my machine reaches 53, though everything still works fine.

I am still working on getting this Poller right.

The problem is that bazil/fuse introduced poll(2) support in https://github.com/bazil/fuse/commit/3ba628eaf417ebd5cc57ced58945d4d39700bcf5.

Somehow brig's fuse layer now requires a Poll method to work, even though everything compiles just fine without one.

If I introduce a dummy method in brig/fuse/handle.go

// Poll is a dummy implementation: it accepts every poll request and reports no error.
func (hd *Handle) Poll(ctx context.Context, req *fuse.PollRequest, resp *fuse.PollResponse) error {
	return nil
}

everything seems to work and the tests pass, except for the problem mentioned above.
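A note on why a missing Poll method still compiles: Go libraries usually discover optional methods through a runtime type assertion, so nothing forces a handle to implement them. The sketch below shows that pattern with stand-in types only. `poller`, `plainHandle`, `pollingHandle`, and `dispatch` are hypothetical, the Poll signature is simplified, and it is an assumption that bazil/fuse probes for Poll in a comparable way.

```go
package main

import "fmt"

// poller is a stand-in for an optional interface (simplified signature).
type poller interface {
	Poll() error
}

// plainHandle has no Poll method, yet the program compiles.
type plainHandle struct{}

// pollingHandle opts in by defining Poll.
type pollingHandle struct{}

func (pollingHandle) Poll() error { return nil }

// dispatch mimics a server that checks for the optional method at
// runtime and falls back to default behavior when it is absent.
func dispatch(h interface{}) string {
	if p, ok := h.(poller); ok {
		if err := p.Poll(); err != nil {
			return "poll failed: " + err.Error()
		}
		return "handled by Poll"
	}
	return "fallback"
}

func main() {
	fmt.Println(dispatch(plainHandle{}))
	fmt.Println(dispatch(pollingHandle{}))
}
```

With this pattern the compiler cannot flag a missing method, so a behavior change only shows up at runtime, for example in the tests.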

sahib commented 3 years ago

Good find.

Back to the original issue. I bisected the bazil/fuse versions. The first version of bazil/fuse that miserably fails the tests is v0.0.0-20200419173433-3ba628eaf417. It fails so badly that the load on my machine reaches 53, though everything still works fine.

Judging from the commit you linked, maybe you need to return syscall.ENOSYS? We don't support polling yet, so maybe FUSE was calling that empty function over and over again.
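A rough sketch of that idea, using stand-in types rather than the real bazil/fuse API: `handle`, `caller`, and the simplified Poll signature are hypothetical. The `caller` type models the assumption behind the suggestion, namely that whoever issues poll requests stops doing so once it sees ENOSYS. Whether that holds for brig's setup would need to be verified.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// handle is a stand-in for brig's fuse Handle; calls counts Poll invocations.
type handle struct {
	calls int
}

// Poll reports that polling is not implemented (simplified signature).
func (h *handle) Poll() error {
	h.calls++
	return syscall.ENOSYS
}

// caller models a client that remembers "poll is unsupported" after the
// first ENOSYS and never asks again (an assumption, see above).
type caller struct {
	noPoll bool
}

func (c *caller) poll(h *handle) {
	if c.noPoll {
		return // polling already known to be unsupported
	}
	if err := h.Poll(); errors.Is(err, syscall.ENOSYS) {
		c.noPoll = true
	}
}

func main() {
	h := &handle{}
	c := &caller{}
	for i := 0; i < 5; i++ {
		c.poll(h)
	}
	fmt.Println("Poll calls:", h.calls)
}
```

In the real handler the change would be a one-liner, returning syscall.ENOSYS instead of nil from Poll.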