yandex-cloud / geesefs

Finally, a good FUSE FS implementation over S3

geesefs 0.38.5 crash #95

Closed bobelev closed 11 months ago

bobelev commented 1 year ago

Version: 0.38.5
S3: AWS
Args:

  -f \
  -o allow_other \
  --region=eu-central-1 \
  --file-mode=0660 \
  --log-file=/var/log/geesefs.log \
  --debug \
  --debug_fuse \
  --debug_s3 \
  --dir-mode=0770 \
  --uid=0 \
  --gid=123 \
  --list-type=1 \
  --endpoint=https://s3.eu-central-1.amazonaws.com

Logs:

2023/11/20 12:22:35.032658 fuse.DEBUG < ReadFile 71 filename.dat [131072 <nil>]
2023/11/20 12:22:35.032705 fuse.DEBUG Op 0x00000442        connection.go:517] -> OK ()
2023/11/20 12:22:35.033640 fuse.DEBUG Op 0x00000444        connection.go:428] <- GetXattr (inode 71, name "security.capability", PID 343132, name security.capability)
2023/11/20 12:22:35.033737 fuse.DEBUG GetXattr 71 filename.dat [security.capability]
2023/11/20 12:22:35.033784 fuse.DEBUG Op 0x00000444        connection.go:519] -> Error: "no data available"
2023/11/20 12:22:35.033830 fuse.DEBUG Op 0x00000446        connection.go:428] <- SetInodeAttributes (inode 71, PID 343132, size 0)
2023/11/20 12:22:35.033879 fuse.DEBUG Op 0x00000446        connection.go:517] -> OK ()
2023/11/20 12:22:35.033967 fuse.DEBUG Op 0x00000448        connection.go:428] <- GetXattr (inode 71, name "security.capability", PID 343132, name security.capability)
2023/11/20 12:22:35.034052 fuse.DEBUG GetXattr 71 filename.dat [security.capability]
2023/11/20 12:22:35.034116 fuse.DEBUG Op 0x00000448        connection.go:519] -> Error: "no data available"
2023/11/20 12:22:35.034169 fuse.DEBUG Op 0x0000044a        connection.go:428] <- WriteFile (inode 71, PID 343132, handle 2, offset 0, 310 bytes)
2023/11/20 12:22:35.034281 fuse.DEBUG WriteFile 71 filename.dat [0 310]
panic: Tried to insert out of order: 0+136 before 0+136 (s2)

goroutine 658 [running]:
github.com/yandex-cloud/geesefs/internal.(*Inode).insertOrAppendBuffer(0xc0002be200, 0x0, 0x136?, {0xc0005f8050?, 0x136, 0x20fb0?}, 0x2?, 0x1?, 0xc0008180c0?)
        /home/runner/work/geesefs/geesefs/internal/file.go:153 +0xb25
github.com/yandex-cloud/geesefs/internal.(*Inode).addBuffer(0xc0002be200, 0x0, {0xc0005f8050, 0x136, 0x20fb0}, 0x2?, 0x0?)
        /home/runner/work/geesefs/geesefs/internal/file.go:238 +0x1e5
github.com/yandex-cloud/geesefs/internal.(*FileHandle).WriteFile(0xc000598050, 0x0, {0xc0005f8050, 0x136, 0x20fb0}, 0x18?)
        /home/runner/work/geesefs/geesefs/internal/file.go:483 +0x3ce
github.com/yandex-cloud/geesefs/internal.(*GoofysFuse).WriteFile(0xc0005960c0, {0x12612d8?, 0xc0005b5aa0?}, 0xc0007b0200)
        /home/runner/work/geesefs/geesefs/internal/goofys_fuse.go:726 +0xf6
github.com/jacobsa/fuse/fuseutil.(*fileSystemServer).handleOp(0x0?, 0x0?, {0x12612d8, 0xc00009e2d0}, {0xe57820?, 0xc0007b0200?})
        /home/runner/go/pkg/mod/github.com/vitalif/fusego@v0.0.0-20230810211941-8d4d89b65d93/fuseutil/file_system.go:216 +0xaf6
created by github.com/jacobsa/fuse/fuseutil.(*fileSystemServer).ServeOps
        /home/runner/go/pkg/mod/github.com/vitalif/fusego@v0.0.0-20230810211941-8d4d89b65d93/fuseutil/file_system.go:128 +0x215
bobelev commented 1 year ago

I read the code. Is this error because I'm trying to write to the beginning of the file?
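
For anyone reading the trace: the length arguments in the Go stack trace are printed in hex, so the 0x136 passed to addBuffer is 310 in decimal, which matches the "310 bytes" in the WriteFile log line. The "0+136" in the panic message is plausibly the same value printed in hex, though that is an assumption about the format string rather than something confirmed in this thread. A minimal sketch of the conversion (parseHexLen is a hypothetical helper name, not part of geesefs):

```go
package main

import (
	"fmt"
	"strconv"
)

// parseHexLen converts a length as printed in a Go stack trace
// (for example "0x136") into a decimal byte count.
// Base 0 lets strconv pick the base from the "0x" prefix.
func parseHexLen(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

func main() {
	n, err := parseHexLen("0x136")
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // prints 310
}
```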

vitalif commented 1 year ago

Hi, does it reproduce every time? It seems like a really basic use case which is definitely covered by tests. Of course, this is a bug if you can reproduce it.

bobelev commented 1 year ago

Yes, it reproduces every time in my use case. I use geesefs for Bacula backups. It works fine when creating new snapshots, but when Bacula tries to "Recycle" an existing backup, geesefs crashes.

I'll gather more debug logs later. If you want to request something like strace, I can do that. Just let me know.

vitalif commented 1 year ago

Debug logs captured with --debug_s3 --debug_fuse --log-file /path/to/log.txt while reproducing the error will be fine. A simple test where I take an existing file and write 136 bytes to its beginning with dd if=/dev/urandom of=test.2.0 bs=136 count=1 conv=notrunc works fine for me...

vitalif commented 11 months ago

Please recheck with 0.40.0, all this buffer-juggling code has been refactored in it :)