You're saying you have no idea whether it's CIFS or bcachefs. To narrow it down: have you tried both ksmbd and samba?
EDIT: Since you say you kept every value at its default, it is strange that you're not hitting any checksum errors ... which may indicate the file was already corrupted before bcachefs ever saw it. That's why I'd like you to try ksmbd as well.
NFS works fine; I have a Proxmox Backup Server running on another machine that mounts this FS via NFS. Over some months it has probably written and verified a few TB of data without problems.
Thanks for the idea with ksmbd. I just enabled support and recompiled the kernel. It's now copying files, and I'll report back when I get a result.
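(For anyone following along: enabling ksmbd support means turning on the in-kernel SMB server option before rebuilding. To the best of my knowledge the Kconfig symbol is the one below; treat it as an assumption and check your tree's fs/smb/server/Kconfig.)

```
# .config: in-kernel SMB server (ksmbd), built as a module
CONFIG_SMB_SERVER=m
```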
NFS works fine, i have a proxmox backup server running on another machine that mounts to this FS via NFS. Over some months it has probably written and verified a few TB of data without problems.
Thanks for the idea with ksmbd. I just enabled support and recompiled the kernel. It's now copying files, and i'll report back when i get a result.
It may be a memory error in samba causing your file corruption. When a ring buffer underflows, the data gets written twice in a way that may actually look like what we're seeing here. Increasing the send/receive buffers could help.
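If you want to test that theory, samba's socket buffer sizes can be raised via the socket options parameter in smb.conf. A minimal sketch; the sizes here are illustrative, not a tuning recommendation:

```
# smb.conf, [global] section -- illustrative buffer sizes only
socket options = TCP_NODELAY SO_RCVBUF=262144 SO_SNDBUF=262144
```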
So this implicates the write path somehow, but not the normal IO paths.
Could you try the bcachefs-splice-disable branch? Let's see if it's a splice bug.
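For context: splice() moves data between a pipe and another file descriptor entirely inside the kernel, and an SMB server can use it as a zero-copy receive path (socket -> pipe -> file) that bypasses ordinary buffered writes. A rough sketch of that general pattern, assuming a connected socket fd and an output file fd; this is an illustration of the technique, not samba's actual code:

```c
/* Rough sketch of a zero-copy receive: socket -> pipe -> file via splice().
 * Illustrative only; error handling and partial-splice recovery trimmed. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

ssize_t splice_receive(int sock, int out_fd, size_t len)
{
	int pipefd[2];
	size_t done = 0;

	if (pipe(pipefd) < 0)
		return -1;

	while (done < len) {
		/* socket -> pipe, no copy through userspace */
		ssize_t n = splice(sock, NULL, pipefd[1], NULL,
				   len - done, SPLICE_F_MOVE);
		if (n <= 0)
			break;
		/* pipe -> file; assumes the full n bytes drain (sketch) */
		if (splice(pipefd[0], NULL, out_fd, NULL, n, SPLICE_F_MOVE) < 0)
			break;
		done += n;
	}
	close(pipefd[0]);
	close(pipefd[1]);
	return done;
}
```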
I have now copied enough data via ksmbd that I can confidently say the corruption doesn't happen with ksmbd. But this again means I can't rule out either bcachefs or SMB.
> Could you try the bcachefs-splice-disable branch? Let's see if it's a splice bug.
I'll revert to Samba and try that tomorrow.
I tried to compile the bcachefs-splice-disable branch, but I ended up with the following error:
...
CC [M] drivers/iio/light/vl6180.o
CC [M] drivers/iio/temperature/tsys02d.o
CC [M] drivers/iio/light/zopt2201.o
AR built-in.a
AR vmlinux.a
LD vmlinux.o
OBJCOPY modules.builtin.modinfo
GEN modules.builtin
GEN .vmlinux.objs
MODPOST Module.symvers
ERROR: modpost: "__SCT__tp_func_contention_begin" [fs/bcachefs/bcachefs.ko] undefined!
ERROR: modpost: "__SCT__tp_func_contention_end" [fs/bcachefs/bcachefs.ko] undefined!
ERROR: modpost: "__tracepoint_contention_begin" [fs/bcachefs/bcachefs.ko] undefined!
ERROR: modpost: "__tracepoint_contention_end" [fs/bcachefs/bcachefs.ko] undefined!
ERROR: modpost: "__SCK__tp_func_contention_end" [fs/bcachefs/bcachefs.ko] undefined!
ERROR: modpost: "__SCK__tp_func_contention_begin" [fs/bcachefs/bcachefs.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:136: Module.symvers] Error 1
make: *** [Makefile:1978: modpost] Error 2
Maybe you could build it not as a module, and just compile it into the kernel instead.
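For reference, that would mean switching the bcachefs Kconfig symbol from module to built-in before rebuilding; a sketch of the relevant .config line (CONFIG_BCACHEFS_FS is the symbol used by the bcachefs tree):

```
# Build bcachefs into the kernel instead of as a module:
CONFIG_BCACHEFS_FS=y
# instead of:
# CONFIG_BCACHEFS_FS=m
```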
I have only checked out the git branch and built it the usual way:
cp /boot/config-`uname -r` .config
make menuconfig
make clean
make -j 8
make modules_install
make install
The error happened during make -j 8; no idea what I did wrong.
Anyway, I have now applied the changes of 6dd7cbb1bdd0cad6f9774e1572cee63c1d90db87 on top of the last commit I had used (b0788c4). This compiles fine and I'll test with that for now.
Ok, I still get corrupted files, even with the changes of commit 6dd7cbb applied.
So we're going to need to figure out what samba is doing differently. Could you do an strace of samba while copying some data over? We need to know exactly which IO paths it's going through.
Ok, running strace -f -e trace=\!%network -p 2886 -o strace.txt gave me a trace log (strace.txt). While the trace was running I copied a folder with 3 png files to the bcachefs drive.
Also, right now I'm testing Samba in combination with XFS and BTRFS.
Edit: I could actually just let strace run while I'm copying files over, and in case I encounter a corruption, check what the actual syscalls were for that particular file and offset.
Alright, I just got some corrupted files again, but this time I logged everything with strace. What I see in the corrupted file corresponds to what's happening in the trace, which I have shortened to the important parts:
...
2886 openat(39, "03 - Propane Nightmares.flac", O_RDWR|O_CREAT|O_EXCL|O_NONBLOCK|O_NOFOLLOW, 0764) = 42
...
2886 ftruncate(42, 42936176) = 0
...
4118 pwrite64(42, "\322ly\377)7/\307\4+\345\232\216\252\217\21<\304\315\f`$L\27N\222,\"q\322\247D"..., 1048576, 35651584 <unfinished ...>
2886 epoll_wait(5, [{events=EPOLLIN, data={u32=2716893328, u64=94423277928592}}], 1, 1) = 1
2886 epoll_wait(5, [{events=EPOLLIN, data={u32=2716893328, u64=94423277928592}}], 1, 1) = 1
4118 <... pwrite64 resumed>) = 1044496
4118 pwrite64(42, "\206\361\27\351\301\352\276\f\36\360q|f\276\223\211\231\211\37D7S\371\303\7\236\35\n]\377\237\344"..., 4080, 36696080 <unfinished ...>
2886 epoll_wait(5, <unfinished ...>
4118 <... pwrite64 resumed>) = 4080
...
Basically, samba tries to write 1048576 bytes at offset 35651584, but only 1044496 are actually written. Samba then rewrites the missing 4080 bytes at offset 36696080 (= 35651584 + 1044496).
The corrupted range starts at offset 35786752 and has a size of 909328 bytes. Comparing the original and corrupt files, it seems like bcachefs has swallowed 4080 bytes in the middle of that write operation (at offset 35786752), and then continued to write the rest of the data, but shifted by those missing 4080 bytes.
The data starting at offset 36696080 is correct again, as those are the 4080 bytes that samba rewrote after noticing that pwrite64 returned less than it should have.
I think that rules out samba, so it's most likely a bug in bcachefs (or something in between bcachefs and the pwrite64 syscall? I don't know).
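For context: what samba did there is exactly the POSIX contract for pwrite(), which is allowed to write fewer bytes than requested; the caller must retry the remainder at the advanced offset. A minimal sketch of that retry pattern (my illustration, not samba's actual code):

```c
/* Minimal sketch of POSIX-correct short-write handling.
 * pwrite() may write fewer bytes than requested; the caller retries
 * the remainder at the advanced offset. Illustrative helper only. */
#include <errno.h>
#include <unistd.h>

ssize_t pwrite_all(int fd, const char *buf, size_t len, off_t off)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = pwrite(fd, buf + done, len - done, off + done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;	/* real error */
		}
		if (n == 0)
			break;		/* give up rather than spin */
		done += n;		/* short write: retry the rest */
	}
	return (ssize_t)done;
}
```

So samba's retry was correct; the shifted 909328 bytes in the middle were already on disk wrong by the time pwrite64 returned short, which is why only the trailing 4080 bytes got repaired.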
Edit: As the numbers are maybe too abstract, here is an image of what's happening, if that makes more sense:
That's some slick debugging work!
I went hunting for a bug that matched, and I believe I found it. Can you try the bcachefs-testing branch?
I've written and verified ~2 TB for now, so I guess it's fixed. I'll let it run a bit more just to be sure.
Thanks a lot to all involved!
That was some slick debugging, thanks for the writeup!
Version
bcachefs: b0788c47d9
bcachefs-tools: f3976e3733
Generic info
Kernel was compiled with:
dmesg didn't provide any useful information.
Problem
When I move files from my PC to my server with bcachefs via SMB/CIFS, files sometimes get corrupted. This happens irregularly and is hard to reproduce, as sometimes it may just not happen.
I've been testing for weeks now, shuffling tens of TB back and forth trying to pinpoint the exact cause. Now I'm at a point where I'm pretty sure this is either a bug in bcachefs or in SMB, but I'm clueless and I wasn't able to fully rule out either of the two. Below are some notes and details I made during testing.
Notes and details
- Bcachefs kernel running in a QEMU VM; disks are passed through as block devices. Running Debian bookworm as the distribution.
- Happens regardless of the compression setting, even with no compression. The compression was changed several times while files were actively transferred. It happens with zstd, and before that it also happened with lz4.
- Happens on a freshly formatted bcachefs FS created via bcachefs format /dev/sd[gh], so default values and no further modifications.
- Some devices were evacuated and then removed (2023-07-29, 2023-07-30) after the files had been transferred. It also happened when no devices were changed/evacuated. But a background target was set, so data was probably also rebalanced while it was copied.
- There were some old striped buckets in the filesystem that were only fully deleted once the aforementioned devices had been evacuated. These were left over from some old (a few months earlier) testing with erasure coding.
- The corrupted files read without any errors/warnings; the FS seems to consider them correct.
- It's unlikely to be the RAM, as both machines were tested with Memtest86+ for 3-4 rounds. Also, the corruption is not random bit flips; there is always a really specific mapping/shifting of bytes going on, see below.
- It's either a bug in the filesystem or in the CIFS/SMB server; I haven't been able to rule either out. I couldn't reproduce the error when using NFS or FTP. On the other hand, SMB works fine with ext4 filesystems. This needs more testing.
- It's not the server's network adapter, as it happens with different adapters (a 2.5 Gb/s RTL8125 and some other 1 Gb/s Intel NIC).
- Sometimes there is a broken file every few hundred GB of copied files; other times several TB of data copy just fine. Corruption seems to happen in clumps, sometimes with several occurrences per file.
- The other direction (NAS/server --> PC) is fine.
Corruption example
Let's say we have the original file X, and we copy it to the NAS, where it gets corrupted (we call the corrupted file Y):
Most of the data of Y is identical to X, but there will be one corrupted chunk (rarely more), mostly with a size of ~1 MB or less. The corrupted chunks in file Y always contain data of X, but shifted by some random offset. Here is an illustration that should make it clearer:
An example corruption pattern:
- The corrupted range in Y is 0xB32D000-0xB3FF9EF (862704 bytes).
- 0xB32D000-0xB3FF9EF corresponds exactly with the data of file X at 0xB32D610-0xB3FFFFF.
Also of note: the corrupted range is always aligned to a multiple of 4096 bytes, and the end of the "source" range in X is also always aligned to 4096 bytes.
Every corrupted file I've seen so far has shown this pattern, just with different values. No idea if this information helps, but the pattern is too distinctive to leave out of this issue.
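For anyone who wants to reproduce this comparison, here is a minimal sketch that scans an original file X and its copy Y, reports the differing byte range, and checks its 4096-byte alignment (my own throwaway-style tool, not something from the report above):

```c
/* Minimal sketch: report the first/last differing offsets between an
 * original file X and a corrupted copy Y, plus 4096-byte alignment.
 * Illustrative throwaway tool; assumes both files have the same size. */
#include <stdio.h>

int main(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s original copy\n", argv[0]);
		return 1;
	}

	FILE *x = fopen(argv[1], "rb");
	FILE *y = fopen(argv[2], "rb");
	if (!x || !y) {
		perror("fopen");
		return 1;
	}

	long first = -1, last = -1, off = 0;
	int cx, cy;

	while ((cx = getc(x)) != EOF && (cy = getc(y)) != EOF) {
		if (cx != cy) {
			if (first < 0)
				first = off;	/* start of diff range */
			last = off;		/* keep extending the end */
		}
		off++;
	}

	if (first < 0)
		printf("no differences in %ld bytes\n", off);
	else
		printf("diff range 0x%lX-0x%lX (%ld bytes), start %% 4096 = %ld\n",
		       first, last, last - first + 1, first % 4096);

	fclose(x);
	fclose(y);
	return 0;
}
```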