koverstreet / bcachefs


Errors when using more than 10 block devices #734

Open KillerLink opened 2 weeks ago

KillerLink commented 2 weeks ago

tl;dr: On a fresh install of Arch Linux, I seem to have trouble creating a bcachefs filesystem with more than 10 block devices behind it.

Here are the OS, bcachefs version and disks I am working with:

#uname -a
Linux instant-arch-481e21d0 6.10.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 19 Aug 2024 17:02:39 +0000 x86_64 GNU/Linux
#bcachefs version
1.9.5
#lsblk 
[root@instant-arch-481e21d0 ~]# lsblk
NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                         8:0    0 238.5G  0 disk  
├─sda1                      8:1    0     1G  0 part  /efi
├─sda2                      8:2    0     1G  0 part  /boot
├─sda3                      8:3    0     2G  0 part  [SWAP]
├─sda4                      8:4    0 234.4G  0 part  
│ └─crypt_system_481e21d0 254:0    0 234.4G  0 crypt /home
│                                                    /snapshots
│                                                    /
└─sda20                   259:0    0     8M  0 part  
sdb                         8:16   0 465.8G  0 disk  
sdc                         8:32   0 931.5G  0 disk  
sdd                         8:48   0 931.5G  0 disk  
sde                         8:64   0 931.5G  0 disk  
sdf                         8:80   0 931.5G  0 disk  
sdg                         8:96   0 931.5G  0 disk  
sdh                         8:112  0 931.5G  0 disk  
sdi                         8:128  0 931.5G  0 disk  
└─sdi1                      8:129  0 931.5G  0 part  
sdj                         8:144  0 931.5G  0 disk  
sdk                         8:160  0 465.8G  0 disk  
sdl                         8:176  0 931.5G  0 disk  
├─sdl1                      8:177  0   128M  0 part  
└─sdl2                      8:178  0 931.4G  0 part  
sdm                         8:192  0 465.8G  0 disk  
sdn                         8:208  0 465.8G  0 disk  
sdo                         8:224  0 931.5G  0 disk  
sdp                         8:240  0 931.5G  0 disk  
zram0                     253:0    0     2G  0 disk  [SWAP]

The following works as expected:

#wipefs -a /dev/sd{b,c,d,e,f,g,h,k,m,n,o,p} 
#bcachefs format --fs_label=bcfs /dev/sd{b,c,d,e,f,g,h,k,m,n}
#mount /dev/disk/by-label/bcfs /mnt && df -h /mnt && umount /mnt
Filesystem                                                                                 Size  Used Avail Use% Mounted on
/dev/sdb:/dev/sdc:/dev/sdd:/dev/sde:/dev/sdf:/dev/sdg:/dev/sdh:/dev/sdk:/dev/sdm:/dev/sdn  6.7T   14M  6.6T   1% /mnt

However, if I use any additional disk (sdo, sdp, or both, or sdo and sdp instead of sdn), I get an error when attempting to mount the result:

#wipefs -a /dev/sd{b,c,d,e,f,g,h,k,m,n,o,p} 
#bcachefs format --fs_label=bcfs /dev/sd{b,c,d,e,f,g,h,k,m,n}
#mount /dev/disk/by-label/bcfs /mnt 
Error: Invalid argument
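
As a sanity check on the device lists in the commands above, it can help to count how many devices each brace list names. The following is only a minimal sketch: `count_brace_items` is a hypothetical helper (not part of bcachefs-tools or any other tool) that parses the list as text in plain POSIX sh instead of relying on bash brace expansion.

```shell
# Hypothetical helper: count the comma-separated items inside {...}.
count_brace_items() {
    # Keep only the text between the first "{" and the last "}".
    list=${1#*\{}
    list=${list%\}*}
    # Split on commas and count the resulting fields.
    old_ifs=$IFS
    IFS=,
    set -- $list
    IFS=$old_ifs
    echo "$#"
}

# The wipefs list and the format list as posted above.
count_brace_items '/dev/sd{b,c,d,e,f,g,h,k,m,n,o,p}'   # prints 12
count_brace_items '/dev/sd{b,c,d,e,f,g,h,k,m,n}'       # prints 10
```

In bash itself, `printf '%s\n' /dev/sd{b,c,d} | wc -l` gives the same kind of count using the real expansion.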

Additionally, there are two lines of error in dmesg:

[Sun Aug 25 01:33:26 2024] bcachefs (sdc): already have device online in slot 10
[Sun Aug 25 01:33:26 2024] bcachefs: bch2_mount() error: device_already_online
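
To relate the slot number in the first line to a device count, here is a small sketch that pulls the number out of the log line. `slot_from_log` is a hypothetical helper name, and the conversion assumes that slots are numbered from zero, which I have not confirmed for this message.

```shell
# Hypothetical helper: print the slot number from an
# "already have device online in slot N" kernel log line.
slot_from_log() {
    printf '%s\n' "$1" |
        sed -n 's/.*already have device online in slot \([0-9][0-9]*\).*/\1/p'
}

line='[Sun Aug 25 01:33:26 2024] bcachefs (sdc): already have device online in slot 10'
slot=$(slot_from_log "$line")

# Assumption: slots are zero-based, so slot N would be device number N + 1.
echo "slot $slot -> device number $((slot + 1))"
```

Under that assumption, the message would refer to an eleventh device.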

I can reproduce this consistently. I have also repeated each attempt (format and mount) directly after a reboot to ensure a clean slate. I'll be happy to provide any additional information requested.
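
One piece of additional information that might be useful is what each device's superblock says about itself. The sketch below loops over devices and filters the output of `bcachefs show-super` for lines mentioning a UUID or an index. `show_ids` is a hypothetical helper name, and the exact field names in the `show-super` output are an assumption here, so the filter is kept deliberately loose.

```shell
# Hypothetical helper: for each device, print the superblock lines that
# mention a UUID or an index (field names are assumed, hence the loose match).
show_ids() {
    for dev in "$@"; do
        echo "== $dev"
        bcachefs show-super "$dev" | grep -iE 'uuid|index'
    done
}

# Example (needs root and real devices):
#   show_ids /dev/sdb /dev/sdc /dev/sdo /dev/sdp
```

If two devices reported the same index under the same filesystem UUID, that would at least be consistent with the "already have device online" message.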

Are there any limits (e.g. on the number of devices in play) that I have overlooked? If so, should it let me format the filesystem but not mount it? Neither the error message given by mount nor the ones in dmesg seem conclusive to me about the actual issue. Can you give me pointers on where to look?

KillerLink commented 2 weeks ago

Okay, my title is probably wrong: I spun up another server with 24 (more identical, 6x300GB+12x600GB) and had no issues creating and mounting a filesystem over that number of disks there. So currently I am at a loss as to what the issue above is.