buildbarn / bb-remote-execution

Tools for Buildbarn to allow remote execution of build actions
Apache License 2.0
105 stars 64 forks source link

Fuse filesystem returns EIO on close #135

Open AustinSchuh opened 1 month ago

AustinSchuh commented 1 month ago

I've got a unit test (https://github.com/frc971/971-Robot-Code/blob/master/aos/starter/subprocess_reliable_test.cc in SubprocessTest.CanSlowlyStopGracefully) which works locally, but fails on buildbarn with the runner setup to use fuse on Linux (Debian Bookworm). When run under strace, it reports that close() fails when it never fails when run under linux_sandbox.

openat(AT_FDCWD, "/worker/build/120646/root/_tmp/4464e3f0fe79ed08c06736b5dcca7332/aos/aos/aos.timing.Report.v6", O_RDWR|O_CREAT|O_EXCL, 0770) = -1 EEXIST (File exists)
gettid()                                = 3700
write(2, "I20240507 06:59:55.336262  3700 "..., 170I20240507 06:59:55.336262  3700 memory_mapped_queue.cc:101] /worker/build/120646/root/_tmp/4464e3f0fe79ed08c06736b5dcca7332/aos/aos/aos.timing.Report.v6 already created.
) = 170
openat(AT_FDCWD, "/worker/build/120646/root/_tmp/4464e3f0fe79ed08c06736b5dcca7332/aos/aos/aos.timing.Report.v6", O_RDWR) = 16
newfstatat(16, "", {st_mode=S_IFREG|0777, st_size=627680, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 627680, PROT_READ|PROT_WRITE, MAP_SHARED, 16, 0) = 0x7fb132985000
mmap(NULL, 627680, PROT_READ, MAP_SHARED, 16, 0) = 0x7fb1328eb000
close(16)                               = -1 EIO (Input/output error)

Adding the write (LOG(INFO)) reduces the probability of failure.

I'll keep debugging, but any advice on where to start looking would be wonderful. I can work on creating a smaller reproducer if that would be helpful.

EdSchouten commented 1 month ago

If you see a call to close() fail, it typically means that one or more FUSE WRITE operations failed, which are invoked to flush any data that hasn't been written yet (due to writes happening asynchronously, especially if mmap() is used).

Normally, whenever the FUSE file system in bb_worker returns EIO errors, it also attaches a detailed error message to the action result. See:

https://github.com/buildbarn/bb-remote-execution/blob/master/pkg/filesystem/virtual/pool_backed_file_allocator.go#L534-L535

Can you check the worker logs, the bb-browser page, etc. to see if there's anything useful in there?

AustinSchuh commented 1 month ago

Worker logs have nothing useful.

$ journalct -f | grep -v ExecuteResponse | grep -v "Action: .*with timeout "
May 07 18:24:13 frc9971 kernel: do_coredump: 44 callbacks suppressed
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:13 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:18 frc9971 kernel: do_coredump: 152 callbacks suppressed
May 07 18:24:18 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:18 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:18 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:18 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:18 frc9971 kernel: Core dump to core aborted: cannot preserve file owner
May 07 18:24:18 frc9971 kernel: Core dump to core aborted: cannot preserve file owner

bb-browser page (http://localhost:7984/fuse/blobs/sha256/historical_execute_response/352d5bd473d8692eb27f7d6b2599bed46957885b85c114a12e668e686b40fff5-1067/) has nothing useful either.

action.pdf

I'll see what else I can find.

AustinSchuh commented 1 month ago

I captured an ftrace of the failure and I can see the syscalls, but don't see where it goes wrong. I see an openat for the file in question, see it context switch to bb_worker, do a read, write, read, then context switch back to my application, then map a bunch of memory, do the mmap calls, and then fail on close. This feels like the kernel is doing something else and refusing to close. More investigation needed. (Trying to run it under strace results in it working every time, so this very much feels like a timing problem/race)

EdSchouten commented 1 month ago

Probably a good idea to patch up bb_worker to set the Debug option in the go-fuse MountOptions? https://github.com/hanwen/go-fuse/blob/v2.5.1/fuse/api.go#L222

AustinSchuh commented 1 month ago

Thanks for the ideas! I'm not seeing the difference between a good and bad run.

Bad one:

May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7734: FLUSH n13451513423166901838 {Fh 0}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7734:     OK
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7732:     OK, {n3095509142718977827 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:3095509142718977827 A 946684800.000000 M 1715187627.796060 C 946684800.000000}}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7736: LOOKUP n3095509142718977827 ["root"] 5b
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7736:     OK, {n557770144382250661 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:557770144382250661 A 946684800.000000 M 1715187627.819385 C 946684800.000000}}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7738: LOOKUP n557770144382250661 ["_tmp"] 5b
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7738:     OK, {n316263930323349050 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:316263930323349050 A 946684800.000000 M 1715187627.819778 C 946684800.000000}}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7740: LOOKUP n316263930323349050 ["ae4be2139ce284579befea4ec6050d03"] 33b
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7740:     OK, {n10908353897172168131 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:10908353897172168131 A 946684800.000000 M 1715187629.571002 C 946684800.000000}}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7742: LOOKUP n10908353897172168131 ["aos"] 4b
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7742:     OK, {n8712379162627258688 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:8712379162627258688 A 946684800.000000 M 1715187627.913751 C 946684800.000000}}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7744: LOOKUP n8712379162627258688 ["aos"] 4b
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7744:     OK, {n5411399499881996637 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:5411399499881996637 A 946684800.000000 M 1715187628.285035 C 946684800.000000}}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7746: LOOKUP n5411399499881996637 ["aos.timing.Report.v6"] 21b
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7746:     OK, {n7558966469378381030 g0 tE=300s tA=300s {M0100777 SZ=627680 L=1 0:0 B0*0 i0:7558966469378381030 A 946684800.000000 M 946684800.000000 C 946684800.000000}}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7748: OPEN n7558966469378381030 {RDWR,LARGEFILE}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7748:     OK, {Fh 0 }
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7750: READ n7558966469378381030 {Fh 0 [0 +131072)  L 0 RDWR,LARGEFILE}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7750:     OK,  131072b data "!\xe0\x1a^\xa1\x7f\x00\x00"...
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7752: IOCTL n13451513423166901838 *fuse._IoctlIn: &{{72 39 7752 13451513423166901838 {{1001 1001} 1907233} 0} 0 0 21519 140730059675396 0 0} 
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7752:     25=inappropriate ioctl for device
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7754: IOCTL n13451513423166901838 *fuse._IoctlIn: &{{72 39 7754 13451513423166901838 {{1001 1001} 1907233} 0} 0 0 21519 140730059675364 0 0} 
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7756: OPEN n17738275510465726745 {LARGEFILE}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7754:     25=inappropriate ioctl for device
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7756:     OK, {Fh 0 }
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7760: LOOKUP n3366801855024626819 ["sleep"] 6b
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7760:     2=no such file or directory, {n0 g0 tE=300s tA=300s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 946684800.000000 M 946684800.000000 C 946684800.000000}}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 rx 7758: READ n17738275510465726745 {Fh 0 [61440 +131072)  L 0 LARGEFILE}
May 08 10:00:29 frc8971 bash[1907042]: 2024/05/08 10:00:29 tx 7758:     OK,  131072b data "\xa2\x00\x00\x00H\x8dA\x01"...

Working one:

May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41430: FLUSH n17244699383922162051 {Fh 0}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41432: LOOKUP n3430770882850472186 ["root"] 5b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41432:     OK, {n14585254713165270394 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:14585254713165270394 A 946684800.000000 M 1715187716.110167 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41430:     OK   
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41434: LOOKUP n14585254713165270394 ["_tmp"] 5b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41434:     OK, {n11402598398701563849 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:11402598398701563849 A 946684800.000000 M 1715187716.110696 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41436: LOOKUP n11402598398701563849 ["ae4be2139ce284579befea4ec6050d03"] 33b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41436:     OK, {n2403137573748540086 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:2403137573748540086 A 946684800.000000 M 1715187717.906837 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41438: LOOKUP n2403137573748540086 ["aos"] 4b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41438:     OK, {n7578432616521042647 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:7578432616521042647 A 946684800.000000 M 1715187716.202072 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41440: LOOKUP n7578432616521042647 ["aos"] 4b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41440:     OK, {n14975988397830571011 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:14975988397830571011 A 946684800.000000 M 1715187716.560869 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41442: LOOKUP n14975988397830571011 ["aos.timing.Report.v6"] 21b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41442:     OK, {n14503251857110063747 g0 tE=300s tA=300s {M0100777 SZ=627680 L=1 0:0 B0*0 i0:14503251857110063747 A 946684800.000000 M 946684800.000000 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41444: OPEN n14503251857110063747 {RDWR,LARGEFILE}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41444:     OK, {Fh 0 }
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41446: FLUSH n14503251857110063747 {Fh 0}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41446:     OK   
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41448: READ n14503251857110063747 {Fh 0 [0 +131072)  L 0 RDWR,LARGEFILE}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41448:     OK,  131072b data "! \x0f\x93\x12\x7f\x00\x00"...
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41452: READ n14503251857110063747 {Fh 0 [262144 +131072)  L 0 RDWR,LARGEFILE}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41450: READ n14503251857110063747 {Fh 0 [131072 +131072)  L 0 RDWR,LARGEFILE}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41450:     OK,  131072b data "\x00\x00\x00\x00\x00\x00\x00\x00"...
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41452:     OK,  131072b data "\x00\x00\x00\x00\x00\x00\x00\x00"...
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41454: IOCTL n17244699383922162051 *fuse._IoctlIn: &{{72 39 41454 17244699383922162051 {{1001 1001} 1907734} 0} 0 0 21519 140732408482548 0 0} 
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41454:     25=inappropriate ioctl for device
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41456: IOCTL n17244699383922162051 *fuse._IoctlIn: &{{72 39 41456 17244699383922162051 {{1001 1001} 1907734} 0} 0 0 21519 140732408482516 0 0} 
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41458: READ n14503251857110063747 {Fh 0 [393216 +131072)  L 0 RDWR,LARGEFILE}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41460: READ n14503251857110063747 {Fh 0 [524288 +106496)  L 0 RDWR,LARGEFILE}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41458:     OK,  131072b data "\x00\x00\x00\x00\x00\x00\x00\x00"...
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41456:     25=inappropriate ioctl for device
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41460:     OK,  103392b data "\x00\x00\x00\x00\x00\x00\x00\x00"...
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41462: LOOKUP n1236004381729805426 ["sleep"] 6b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41462:     2=no such file or directory, {n0 g0 tE=300s tA=300s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 946684800.000000 M 946684800.000000 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41464: LOOKUP n1 ["34"] 3b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41464:     OK, {n3430770882850472186 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:3430770882850472186 A 946684800.000000 M 1715187716.086146 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41466: LOOKUP n3430770882850472186 ["root"] 5b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41466:     OK, {n14585254713165270394 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:14585254713165270394 A 946684800.000000 M 1715187716.110167 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41468: LOOKUP n14585254713165270394 ["_tmp"] 5b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41468:     OK, {n11402598398701563849 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:11402598398701563849 A 946684800.000000 M 1715187716.110696 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41470: LOOKUP n11402598398701563849 ["ae4be2139ce284579befea4ec6050d03"] 33b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41470:     OK, {n2403137573748540086 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:2403137573748540086 A 946684800.000000 M 1715187717.906837 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41472: LOOKUP n2403137573748540086 ["aos"] 4b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41472:     OK, {n7578432616521042647 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:7578432616521042647 A 946684800.000000 M 1715187716.202072 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41474: LOOKUP n7578432616521042647 ["aos"] 4b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41474:     OK, {n14975988397830571011 g0 tE=300s tA=300s {M040777 SZ=0 L=1 0:0 B0*0 i0:14975988397830571011 A 946684800.000000 M 1715187716.560869 C 946684800.000000}}
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 rx 41476: LOOKUP n14975988397830571011 ["aos.logging.LogMessageFbs.v6"] 29b
May 08 10:01:57 frc8971 bash[1907042]: 2024/05/08 10:01:57 tx 41476:     OK, {n6884875503203238630 g0 tE=300s tA=300s {M0100777 SZ=996960 L=1 0:0 B0*0 i0:6884875503203238630 A 946684800.000000 M 946684800.000000 C 946684800.000000}}

The read at the end of the bad run is where glog is reading the shared libraries to generate a backtrace. (I'm happy to share the full log if that would be helpful).

EdSchouten commented 1 month ago

A random guess, but could you maybe try a workaround like this?

diff --git a/pkg/filesystem/virtual/fuse_handle_allocator.go b/pkg/filesystem/virtual/fuse_handle_allocator.go
index 0540488a..69be9c17 100644
--- a/pkg/filesystem/virtual/fuse_handle_allocator.go
+++ b/pkg/filesystem/virtual/fuse_handle_allocator.go
@@ -300,7 +300,7 @@ func (l *fuseStatefulNativeLeaf) Unlink() {

 func (l *fuseStatefulNativeLeaf) injectAttributes(attributes *Attributes) {
        attributes.SetInodeNumber(l.inodeNumber)
-       attributes.SetLinkCount(l.linkCount.Load())
+       attributes.SetLinkCount(StatelessLeafLinkCount)
        // The change ID should normally also be affected by the link
        // count. We don't bother overriding the change ID here, as FUSE
        // does not depend on it.

I know that Linux has some logic to 'kill' file descriptors if the link count of a file reaches zero. For local file systems this is perfectly fine, but for FUSE/NFS/... this logic might backfire.

tpankaj commented 1 month ago

I tried this for Austin and unfortunately it didn't seem to have any effect on either the behavior or the logs. I'm still trying to see if I can use kprobe to find where exactly the I/O error on flush is being generated.

I did manage to confirm that it's happening on flush, as calling fsync prior to close has the same behavior, with EIO returned by fsync.