hanwen / go-fuse

FUSE bindings for Go
Other
2.03k stars 324 forks source link

fs: TestStaleHardlinks is flaky #478

Closed navytux closed 4 months ago

navytux commented 1 year ago

From https://github.com/hanwen/go-fuse/issues/351#issuecomment-1583044571 :

I was running tests in a loop as:

$ while true; do echo; go test -p 1 -count 1 ./... || break ; done

and got breakage in fs in ~ 5 minutes:

...
warning: Inode.Path: n2 is orphaned, replacing segment with ".go-fuse.12305257969772279832/deleted"
20:03:22.593626 writer: Write/Writev failed, err: 2=no such file or directory. opcode: RELEASEDIR
warning: Inode.Path: n3 is orphaned, replacing segment with ".go-fuse.901660240436591636/deleted"
warning: Inode.Path: n854 is orphaned, replacing segment with ".go-fuse.4890713566819029247/deleted"
warning: Inode.Path: n951 is orphaned, replacing segment with ".go-fuse.15899998848308155049/deleted"
warning: Inode.Path: n2084 is orphaned, replacing segment with ".go-fuse.2297604607127977387/deleted"
warning: Inode.Path: n2814 is orphaned, replacing segment with ".go-fuse.14552686421981569371/deleted"
warning: Inode.Path: n2990 is orphaned, replacing segment with ".go-fuse.9123635009483468768/deleted"
warning: Inode.Path: n3170 is orphaned, replacing segment with ".go-fuse.15177763141428033337/deleted"
warning: Inode.Path: n3342 is orphaned, replacing segment with ".go-fuse.3646338540004335763/deleted"
warning: Inode.Path: n3423 is orphaned, replacing segment with ".go-fuse.1318443847442105450/deleted"
warning: Inode.Path: n3530 is orphaned, replacing segment with ".go-fuse.4342381435690588638/deleted"
warning: Inode.Path: n3617 is orphaned, replacing segment with ".go-fuse.1428700877519216651/deleted"
warning: Inode.Path: n3711 is orphaned, replacing segment with ".go-fuse.401526560035548991/deleted"
warning: Inode.Path: n5425 is orphaned, replacing segment with ".go-fuse.9338043697404393728/deleted"
warning: Inode.Path: n7037 is orphaned, replacing segment with ".go-fuse.729628660140245879/deleted"
warning: Inode.Path: n7235 is orphaned, replacing segment with ".go-fuse.1369596357678539859/deleted"
warning: Inode.Path: n7461 is orphaned, replacing segment with ".go-fuse.1369287383292163374/deleted"
warning: Inode.Path: n8305 is orphaned, replacing segment with ".go-fuse.17808099055425098859/deleted"
warning: Inode.Path: n9306 is orphaned, replacing segment with ".go-fuse.5174121311455743018/deleted"
warning: Inode.Path: n9811 is orphaned, replacing segment with ".go-fuse.2107295909796755029/deleted"
warning: Inode.Path: n9918 is orphaned, replacing segment with ".go-fuse.5183084698562304044/deleted"
warning: Inode.Path: n10824 is orphaned, replacing segment with ".go-fuse.486187419106585934/deleted"
warning: Inode.Path: n12942 is orphaned, replacing segment with ".go-fuse.8444501897156723168/deleted"
warning: Inode.Path: n13432 is orphaned, replacing segment with ".go-fuse.4542056890342277729/deleted"
warning: Inode.Path: n13637 is orphaned, replacing segment with ".go-fuse.6102584255343135637/deleted"
warning: Inode.Path: n13833 is orphaned, replacing segment with ".go-fuse.866370257234569186/deleted"
warning: Inode.Path: n13930 is orphaned, replacing segment with ".go-fuse.15087026409322293597/deleted"
warning: Inode.Path: n14713 is orphaned, replacing segment with ".go-fuse.11648377526940701427/deleted"
warning: Inode.Path: n15106 is orphaned, replacing segment with ".go-fuse.12246981705567265835/deleted"
warning: Inode.Path: n15524 is orphaned, replacing segment with ".go-fuse.16982074749685265062/deleted"
warning: Inode.Path: n18203 is orphaned, replacing segment with ".go-fuse.15654435576189543409/deleted"
warning: Inode.Path: n19304 is orphaned, replacing segment with ".go-fuse.4095005409658110961/deleted"
warning: Inode.Path: n19511 is orphaned, replacing segment with ".go-fuse.18158904314772731366/deleted"
warning: Inode.Path: n20540 is orphaned, replacing segment with ".go-fuse.13427310443358862917/deleted"
warning: Inode.Path: n21135 is orphaned, replacing segment with ".go-fuse.7950643906035889115/deleted"
warning: Inode.Path: n21330 is orphaned, replacing segment with ".go-fuse.8350283527915559158/deleted"
warning: Inode.Path: n21728 is orphaned, replacing segment with ".go-fuse.14787976173072795546/deleted"
warning: Inode.Path: n21923 is orphaned, replacing segment with ".go-fuse.7790348919694619749/deleted"
warning: Inode.Path: n22226 is orphaned, replacing segment with ".go-fuse.14996396507077797745/deleted"
warning: Inode.Path: n23026 is orphaned, replacing segment with ".go-fuse.16890645689759266835/deleted"
warning: Inode.Path: n23318 is orphaned, replacing segment with ".go-fuse.6196426916292442094/deleted"
warning: Inode.Path: n23601 is orphaned, replacing segment with ".go-fuse.17054134701676197557/deleted"
warning: Inode.Path: n24004 is orphaned, replacing segment with ".go-fuse.366239376632507138/deleted"
warning: Inode.Path: n24722 is orphaned, replacing segment with ".go-fuse.11359620588908067237/deleted"
warning: Inode.Path: n24812 is orphaned, replacing segment with ".go-fuse.4242556164732877867/deleted"
warning: Inode.Path: n25781 is orphaned, replacing segment with ".go-fuse.5436548359282688598/deleted"
warning: Inode.Path: n26079 is orphaned, replacing segment with ".go-fuse.5542355895719171192/deleted"
warning: Inode.Path: n26672 is orphaned, replacing segment with ".go-fuse.205714894418993594/deleted"
warning: Inode.Path: n26875 is orphaned, replacing segment with ".go-fuse.4185528009913706605/deleted"
warning: Inode.Path: n26974 is orphaned, replacing segment with ".go-fuse.17867490425958263130/deleted"
warning: Inode.Path: n27071 is orphaned, replacing segment with ".go-fuse.7224419671554691606/deleted"
warning: Inode.Path: n28066 is orphaned, replacing segment with ".go-fuse.11478882319185392135/deleted"
--- FAIL: TestStaleHardlinks (0.01s)
    simple_test.go:665: no such file or directory
FAIL
FAIL    github.com/hanwen/go-fuse/fs    2.717s

go-fuse: v2.3.0-5-g90b055a os: Linux deca 6.1.0-9-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.27-1 (2023-05-08) x86_64 GNU/Linux go: go version go1.20.5 linux/amd64

I also do see fs tests failing on github CI, e.g. here: https://github.com/hanwen/go-fuse/actions/runs/5210956777/jobs/9402681092#step:6:48 .

navytux commented 1 year ago

Reliable way to reproduce:

(neo) (z-dev) (g.env) kirr@deca:~/src/neo/src/github.com/hanwen/go-fuse/fs$ go test -run TestStaleHardlinks -count 1000 -failfast
--- FAIL: TestStaleHardlinks (0.00s)
    simple_test.go:675: no such file or directory
FAIL
exit status 1
FAIL    github.com/hanwen/go-fuse/fs    0.081s

(neo) (z-dev) (g.env) kirr@deca:~/src/neo/src/github.com/hanwen/go-fuse/fs$ go test -run TestStaleHardlinks -count 1000 -failfast
--- FAIL: TestStaleHardlinks (0.00s)
    simple_test.go:675: no such file or directory
FAIL
exit status 1
FAIL    github.com/hanwen/go-fuse/fs    0.918s

(neo) (z-dev) (g.env) kirr@deca:~/src/neo/src/github.com/hanwen/go-fuse/fs$ go test -run TestStaleHardlinks -count 1000 -failfast
--- FAIL: TestStaleHardlinks (0.01s)
    simple_test.go:675: no such file or directory
FAIL
exit status 1
FAIL    github.com/hanwen/go-fuse/fs    6.869s

go-fuse v2.3.0-11-g255ab74

hanwen commented 1 year ago

this test was introdcued in c3186132 by @rfjakob . It's not representative of a correctly written FUSE filesystem (the "behind the back" deletions should be coupled with notifications back to the kernel).

the follow-on a90e1f463c3f172a7690a6449fe5955a180dfec3 says that the test is fixed now, but I don't understand why that would make it work.

rfjakob commented 1 year ago

What I meant with "behind the back" is that somebody else is modifying your backing file store (if you have one, like loopback or gocryptfs). That can happen when your files are on NFS or something and another host modifies it, or sync tools like Dropbox doing their thing.

But in any case, the test should not blow up, I'll try to take a look.

rfjakob commented 1 year ago

Oh. Excerpt below, full log here: https://gist.github.com/rfjakob/e812730df3918122c3393d010ed4db43

tx 206 gets heavily delayed after other LOOKUPs have succeeded.

21:26:56.191773 rx 206: LOOKUP n1 ["link11"] 7b
21:26:56.191800 rx 208: LOOKUP n1 ["link0"] 6b
21:26:56.191819 tx 208:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=16 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.190038}}
21:26:56.191913 rx 210: LOOKUP n1 ["link16"] 7b
21:26:56.191943 tx 210:     2=no such file or directory, {n0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}}
21:26:56.192006 rx 212: LINK n1 {Oldnodeid: n2} ["link16"] 7b
21:26:56.192126 tx 212:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=17 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.191037}}
21:26:56.192245 rx 214: LOOKUP n1 ["link0"] 6b
21:26:56.192339 tx 214:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=17 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.191037}}
21:26:56.192436 rx 216: LOOKUP n1 ["link17"] 7b
21:26:56.192524 tx 216:     2=no such file or directory, {n0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}}
21:26:56.192650 rx 218: LINK n1 {Oldnodeid: n2} ["link17"] 7b
21:26:56.192686 tx 218:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=18 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.191037}}
21:26:56.192719 rx 220: LOOKUP n1 ["link0"] 6b
21:26:56.192738 tx 220:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=18 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.191037}}
21:26:56.192778 rx 222: LOOKUP n1 ["link18"] 7b
21:26:56.192795 tx 222:     2=no such file or directory, {n0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}}
21:26:56.192828 rx 224: LINK n1 {Oldnodeid: n2} ["link18"] 7b
21:26:56.192858 tx 224:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=19 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.192036}}
21:26:56.192933 rx 226: LOOKUP n1 ["link0"] 6b
21:26:56.192962 tx 226:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=19 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.192036}}
21:26:56.193021 rx 228: LOOKUP n1 ["link19"] 7b
21:26:56.193076 tx 228:     2=no such file or directory, {n0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}}
21:26:56.193127 rx 230: LINK n1 {Oldnodeid: n2} ["link19"] 7b
21:26:56.193185 tx 230:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=20 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.192036}}
21:26:56.193326 rx 232: LOOKUP n1 ["link0"] 6b
21:26:56.193381 tx 232:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=1 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.192036}}
21:26:56.200238 tx 206:     OK, {n2 g1 tE=0s tA=0s {M0100600 SZ=0 L=16 1026:1026 B0*4096 i0:39670 A 1688844416.180050 M 1688844416.180050 C 1688844416.190038}}
21:26:56.200243 rx 234: OPEN n2 {0x8000} 
21:26:56.200292 rx 236: GETATTR n2 {Fh 0 } 
21:26:56.200293 tx 234:     2=no such file or directory, {Fh 0 }
21:26:56.200326 tx 236:     2=no such file or directory, {tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}}
hanwen commented 1 year ago

@rfjakob - any update here?

rfjakob commented 1 year ago

Sorry, no, other things going on. Could you t.Skip() it for now?

On Sat, 9 Sept 2023, 11:57 Han-Wen Nienhuys, @.***> wrote:

@rfjakob https://github.com/rfjakob - any update here?

— Reply to this email directly, view it on GitHub https://github.com/hanwen/go-fuse/issues/478#issuecomment-1712472843, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACGA77E6G7JLIR2AZQ2JYDXZQ4SJANCNFSM6AAAAAAZDD2N7M . You are receiving this because you were mentioned.Message ID: @.***>

rfjakob commented 4 months ago

I can still repro, but not reliably, and the logs are noisy.

Looks like gvfsd-trash sets inotify_add_watch on the mountpoint and loops over all files. This is the noise in the logs.

$ ps -eLf | grep 63100
jakob      63095    2691   63100  0    5 22:22 ?        00:00:04 /usr/libexec/gvfsd-trash --spawner :1.16 /org/gtk/gvfs/exec_spaw/0

strace shows:

[pid 63116] inotify_add_watch(8, "/tmp/TestStaleHardlinks809765178/001/mnt", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF|IN_UNMOUNT|IN_ONLYDIR) = 20256
rfjakob commented 4 months ago

pkill gvfsd-trash makes the problem go away.

The extra LOOKUPs from gvfsd-trash race against the test logic.

rfjakob commented 4 months ago

Fixed via https://review.gerrithub.io/c/hanwen/go-fuse/+/1195362

navytux commented 4 months ago

Thanks, @rfjakob.

navytux commented 4 months ago

P.S. https://review.gerrithub.io/c/hanwen/go-fuse/+/1195362 was submitted but did not appear in master on github. Probably there is again gerrithub -> github synchronization issue.