Dikay900 closed this issue 8 months ago.
Evacuation and removal of device hdd.hdd3 (device 2, sdk) is done.
As you can see, the foreground devices now hold even more user data, which is not getting rebalanced or moved; rebalance is idle as well.
New data written to the filesystem is handled correctly.
~# bcachefs fs usage -h /srv/bcachefs
Filesystem: f0605888-9e17-4fec-bef6-92e885adc371
Size: 33.9 TiB
Used: 24.0 TiB
Online reserved: 496 KiB
Data type Required/total Devices
reserved: 1/1 [] 16.4 MiB
btree: 1/2 [sde sda] 14.1 GiB
btree: 1/2 [sdl sdf] 7.81 GiB
btree: 1/2 [sdm sdl] 20.9 GiB
btree: 1/2 [sde sdf] 14.9 GiB
btree: 1/2 [sda sdl] 20.7 GiB
btree: 1/2 [sde sdm] 4.04 GiB
btree: 1/2 [sde sdl] 2.48 GiB
btree: 1/2 [sdm sda] 42.4 GiB
btree: 1/2 [sdm sdf] 13.7 GiB
btree: 1/2 [sda sdf] 60.5 GiB
user: 1/2 [sdm sda] 8.71 GiB
user: 1/2 [sde sdh] 46.6 GiB
user: 1/2 [sdj sdi] 637 GiB
user: 1/2 [sdb sdj] 1.10 TiB
user: 1/2 [sda sdh] 94.6 GiB
user: 1/2 [sdc sdj] 1.10 TiB
user: 1/2 [sdg sdj] 2.30 TiB
user: 1/2 [sdm sdi] 60.1 GiB
user: 1/2 [sdl sdh] 185 GiB
user: 1/2 [sdi sdf] 12.3 GiB
user: 1/2 [sde sdm] 4.09 GiB
user: 1/2 [sdc sdg] 1.34 TiB
user: 1/2 [sdb sdg] 1.34 TiB
user: 1/2 [sdg sdm] 268 GiB
user: 1/2 [sdg sdi] 796 GiB
user: 1/2 [sdm sdj] 184 GiB
user: 1/2 [sda sdl] 3.54 GiB
user: 1/2 [sda sdf] 1.62 GiB
user: 1/2 [sdl sdf] 3.64 GiB
user: 1/2 [sdh sdi] 948 GiB
user: 1/2 [sde sdb] 25.6 GiB
user: 1/2 [sde sdl] 1.64 GiB
user: 1/2 [sde sdf] 948 MiB
user: 1/2 [sdc sda] 38.7 GiB
user: 1/2 [sdc sdi] 370 GiB
user: 1/2 [sdb sda] 39.1 GiB
user: 1/2 [sdb sdi] 373 GiB
user: 1/2 [sdg sdl] 115 GiB
user: 1/2 [sdg sdh] 4.68 TiB
user: 1/2 [sdg sdf] 81.4 GiB
user: 1/2 [sdm sdl] 24.0 GiB
user: 1/2 [sdm sdh] 439 GiB
user: 1/2 [sdm sdf] 9.13 GiB
user: 1/2 [sda sdj] 62.7 GiB
user: 1/2 [sda sdi] 12.1 GiB
user: 1/2 [sdl sdj] 77.4 GiB
user: 1/2 [sdl sdi] 24.6 GiB
user: 1/2 [sdj sdh] 2.37 TiB
user: 1/2 [sdj sdf] 64.9 GiB
user: 1/2 [sdh sdf] 98.0 GiB
user: 1/2 [sde sdc] 25.0 GiB
user: 1/2 [sde sdg] 50.9 GiB
user: 1/2 [sde sda] 920 MiB
user: 1/2 [sde sdj] 45.9 GiB
user: 1/2 [sde sdi] 8.77 GiB
user: 1/2 [sdc sdb] 666 GiB
user: 1/2 [sdc sdm] 119 GiB
user: 1/2 [sdc sdl] 50.2 GiB
user: 1/2 [sdc sdh] 1.59 TiB
user: 1/2 [sdc sdf] 39.9 GiB
user: 1/2 [sdb sdm] 122 GiB
user: 1/2 [sdb sdl] 50.6 GiB
user: 1/2 [sdb sdh] 1.61 TiB
user: 1/2 [sdb sdf] 40.2 GiB
user: 1/2 [sdg sda] 78.5 GiB
cached: 1/1 [sdg] 848 GiB
cached: 1/1 [sdj] 801 GiB
cached: 1/1 [sdc] 628 GiB
cached: 1/1 [sda] 135 GiB
cached: 1/1 [sdi] 18.2 MiB
cached: 1/1 [sde] 24.7 GiB
cached: 1/1 [sdb] 618 GiB
cached: 1/1 [sdm] 214 GiB
cached: 1/1 [sdl] 98.7 GiB
cached: 1/1 [sdh] 598 GiB
cached: 1/1 [sdf] 139 GiB
hdd.hdd2 (device 1): sdc rw
data buckets fragmented
free: 0 B 298109
sb: 3.00 MiB 7 508 KiB
journal: 4.00 GiB 8192
btree: 0 B 0
user: 2.65 TiB 5577701 4.77 GiB
cached: 628 GiB 1746886
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 3.64 TiB 7630895
hdd.hdd4 (device 3): sdb rw
data buckets fragmented
free: 0 B 298109
sb: 3.00 MiB 7 508 KiB
journal: 4.00 GiB 8192
btree: 0 B 0
user: 2.67 TiB 5606137 4.77 GiB
cached: 618 GiB 1718450
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 3.64 TiB 7630895
hdd.hdd5 (device 4): sdg rw
data buckets fragmented
free: 0 B 298094
sb: 3.00 MiB 4 1020 KiB
journal: 8.00 GiB 8192
btree: 0 B 0
user: 5.51 TiB 5795569 14.9 GiB
cached: 848 GiB 1528996
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 7.28 TiB 7630855
hdd.hdd6 (device 8): sdj rw
data buckets fragmented
free: 0 B 223574
sb: 3.00 MiB 4 1020 KiB
journal: 8.00 GiB 8192
btree: 0 B 0
user: 3.96 TiB 4161843 9.14 GiB
cached: 801 GiB 1329523
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 5.46 TiB 5723136
hdd.hdd7 (device 9): sdh rw
data buckets fragmented
free: 0 B 298096
sb: 3.00 MiB 4 1020 KiB
journal: 8.00 GiB 8192
btree: 0 B 0
user: 6.01 TiB 6317470 17.5 GiB
cached: 598 GiB 1007093
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 7.28 TiB 7630855
hdd.hdd8 (device 10): sdi rw
data buckets fragmented
free: 0 B 4052403
sb: 3.00 MiB 4 1020 KiB
journal: 8.00 GiB 8192
btree: 0 B 0
user: 1.58 TiB 1662522 2.19 GiB
cached: 18.2 MiB 15
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 5.46 TiB 5723136
ssd.ssd1 (device 5): sdm rw
data buckets fragmented
free: 0 B 79505
sb: 3.00 MiB 7 508 KiB
journal: 4.00 GiB 8192
btree: 40.5 GiB 99164 7.89 GiB
user: 619 GiB 1270162 1.25 GiB
cached: 214 GiB 450709
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 932 GiB 1907739
ssd.ssd2 (device 7): sdl rw
data buckets fragmented
free: 0 B 124249
sb: 3.00 MiB 7 508 KiB
journal: 3.64 GiB 7452
btree: 25.9 GiB 63153 4.90 GiB
user: 268 GiB 550696 1.19 GiB
cached: 98.7 GiB 208322
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 1
erasure coded: 0 B 0
capacity: 466 GiB 953880
ssd.ssd3 (device 11): sdf rw
data buckets fragmented
free: 0 B 1118365
sb: 3.00 MiB 7 508 KiB
journal: 4.00 GiB 8192
btree: 48.5 GiB 123285 11.7 GiB
user: 176 GiB 361112 244 MiB
cached: 139 GiB 296778
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 932 GiB 1907739
ssd.ssd4 (device 6): sda rw
data buckets fragmented
free: 0 B 1089934
sb: 3.00 MiB 7 508 KiB
journal: 4.00 GiB 8192
btree: 68.8 GiB 172112 15.2 GiB
user: 170 GiB 349195 239 MiB
cached: 135 GiB 288299
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 932 GiB 1907739
ssd.ssd5 (device 0): sde rw
data buckets fragmented
free: 0 B 1588931
sb: 3.00 MiB 7 508 KiB
journal: 4.00 GiB 8192
btree: 17.8 GiB 44061 3.75 GiB
user: 105 GiB 215612 83.8 MiB
cached: 24.7 GiB 50936
parity: 0 B 0
stripe: 0 B 0
need_gc_gens: 0 B 0
need_discard: 0 B 0
erasure coded: 0 B 0
capacity: 932 GiB 1907739
~# cat /sys/fs/bcachefs/f0605888-9e17-4fec-bef6-92e885adc371/internal/rebalance_status
waiting
io wait duration: 14.6 MiB
io wait remaining: 11.4 MiB
duration waited: 13634 h
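The per-pair replica lines in the usage output above can be aggregated into per-device user-data totals, which makes the imbalance easier to see. A minimal sketch (assuming the `user: 1/2 [devA devB] <size>` line format shown above; the embedded sample is just three lines from the listing, not the full output):

```python
import re

# Parse lines like "user: 1/2 [sdg sdh] 4.68 TiB" and sum the replica
# size onto each member device, converting units to bytes.
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
LINE = re.compile(r"user:\s+\d+/\d+\s+\[([^\]]*)\]\s+([\d.]+)\s+(\w+)")

def per_device_user(report: str) -> dict[str, float]:
    totals: dict[str, float] = {}
    for m in LINE.finditer(report):
        devices, size, unit = m.group(1).split(), float(m.group(2)), m.group(3)
        for dev in devices:
            totals[dev] = totals.get(dev, 0.0) + size * UNITS[unit]
    return totals

# Small sample taken from the listing above:
sample = """\
user: 1/2 [sdg sdh] 4.68 TiB
user: 1/2 [sdg sdf] 81.4 GiB
user: 1/2 [sdh sdf] 98.0 GiB
"""
for dev, nbytes in sorted(per_device_user(sample).items()):
    print(dev, round(nbytes / 2**30, 1), "GiB")
```

Feeding it the full `user:` section gives per-device totals matching the `user:` rows in the per-device tables that follow.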
Closing this, as I cannot reproduce it on a freshly created fs regardless of replicas / compression / target options.
Please search for duplicates
I found no other issue mentioning "evacuate" that describes my problem.
Version
bcachefs.git: master, commit e2cd10f254dd
bcachefs-tools.git: master, commit c6e69549288
Problem
While evacuating a disk holding 2 TB of user data, I saw that all disks are getting filled, regardless of whether they are background or foreground devices.
bcachefs evacuate /dev/sdk
-> hdd.hdd3 (device 2): sdk

After some minutes I got the following usage:
See below for fs usage after removal / evacuation of the device.
ssd.ssd[1-5] is receiving user data (the filesystem was idle except for the evacuation). Looking at iostat, it seems every device receives data in proportion to its free space (all of them were background devices).
The user data on ssd.ssd1 and ssd.ssd2 is left over from an earlier evacuation, when they were the only two ssd.* devices. That user data is not rebalanced into the background after the evacuation finishes, nor after remounting / relabeling the devices.
I think this behaviour started with the rebalance rework when upgrading to v1.3; I would argue it was working fine with an earlier version.
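The iostat observation above (writes spreading over every device in proportion to free space, ignoring the target options) matches what a naive free-space-weighted allocator would do. A minimal sketch of that policy, to illustrate the observed behaviour only (hypothetical device names and sizes; this is not bcachefs's actual allocator):

```python
import heapq

def pick_replicas(free: dict[str, int], nr_replicas: int = 2) -> list[str]:
    # Choose the nr_replicas devices with the most free space for the
    # next extent; repeating this fills all devices at a rate roughly
    # proportional to their free space, with no regard for targets.
    return heapq.nlargest(nr_replicas, free, key=free.get)

free = {"sda": 800, "sdg": 7000, "sdh": 7100, "sde": 900}  # GiB free (made up)
for _ in range(3):
    chosen = pick_replicas(free)
    for dev in chosen:
        free[dev] -= 100  # write one 100 GiB replica to each chosen device
    print(chosen, free)
```

With a target-aware allocator one would instead expect the candidate set to be restricted to the background_target devices first; the bug report is that no such restriction appears to be applied during evacuation.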
EDIT: removed the original fs usage output and posted a new one below after the complete evacuation.