kdave / btrfs-progs

Development of userspace BTRFS tools
GNU General Public License v2.0

BTRFS should grow a RAID1 filesystem when replacing smaller drives with bigger ones #21

Open basic6 opened 7 years ago

basic6 commented 7 years ago

Suppose you have a BTRFS RAID1 filesystem with 4 drives of 3 GB each, giving 6 GB of usable capacity:

# mkfs.btrfs -f -draid1 -mraid1 /dev/sdb /dev/sdc /dev/sdd /dev/sde >/dev/null 
# mount /dev/sdb BTRFS/
# btrfs fi show BTRFS/
Label: none  uuid: e6dc6a95-ae5e-49c4-bded-77001b445ac7
    Total devices 4 FS bytes used 192.00KiB
    devid    1 size 3.00GiB used 331.12MiB path /dev/sdb
    devid    2 size 3.00GiB used 0.00B path /dev/sdc
    devid    3 size 3.00GiB used 0.00B path /dev/sdd
    devid    4 size 3.00GiB used 0.00B path /dev/sde

# parted -s /dev/sdb print | grep Disk
Disk /dev/sdb: 3221MB
Disk Flags: 
# parted -s /dev/sdc print | grep Disk
Disk /dev/sdc: 3221MB
Disk Flags: 
# parted -s /dev/sdd print | grep Disk
Disk /dev/sdd: 3221MB
Disk Flags: 
# parted -s /dev/sde print | grep Disk
Disk /dev/sde: 3221MB
Disk Flags: 
# df -h BTRFS/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb        6.0G   17M  5.3G   1% /mnt/BTRFS

After replacing two of the 3 GB drives with 4 GB drives, the total capacity should grow to 7 GB, since RAID1 stores two copies of everything and usable capacity is therefore half the raw total: (3 + 3 + 4 + 4) / 2 = 7 GB...

# parted -s /dev/sdf print | grep Disk
Disk /dev/sdf: 4295MB
Disk Flags: 
# parted -s /dev/sdg print | grep Disk
Disk /dev/sdg: 4295MB
Disk Flags: 
# btrfs replace start -f 3 /dev/sdf BTRFS/
# btrfs replace start -f 4 /dev/sdg BTRFS/
# btrfs fi show BTRFS/
Label: none  uuid: e6dc6a95-ae5e-49c4-bded-77001b445ac7
    Total devices 4 FS bytes used 512.00KiB
    devid    1 size 3.00GiB used 1.28GiB path /dev/sdb
    devid    2 size 3.00GiB used 1.25GiB path /dev/sdc
    devid    3 size 3.00GiB used 1.06GiB path /dev/sdf
    devid    4 size 3.00GiB used 544.00MiB path /dev/sdg

# df -h BTRFS/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb        6.0G   17M  5.2G   1% /mnt/BTRFS

... but you don't. The filesystem still has its initial capacity of 6 GB instead of the expected 7 GB.

It is only after manually growing the filesystem on each replaced device that you get the full capacity of the drives:

# btrfs fi resize 3:max BTRFS/
Resize 'BTRFS/' of '3:max'
# btrfs fi resize 4:max BTRFS/
Resize 'BTRFS/' of '4:max'
# df -h BTRFS/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb        7.0G   17M  6.8G   1% /mnt/BTRFS

These extra steps should not be necessary. After replacing drives, it looks as if BTRFS were unable to make use of the new capacity, when in fact manual resize commands simply have to be run for each replaced device.

Tested with btrfs-progs 4.4.

kdave commented 7 years ago

Your observation is right: the manual step is currently needed, although it could be done in one go as part of the replace. We'd have to add a new option for that, but it's basically just calling one more ioctl after a successful replace.
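
For illustration, a minimal sketch of that "one more ioctl": the BTRFS_IOC_RESIZE call that btrfs fi resize 3:max performs, with the mount point and devid hard-coded from the example above. This is not the actual btrfs-progs code, just a sketch with error handling trimmed:

/* Sketch: grow devid 3 of a mounted btrfs filesystem to the full size of
 * its underlying device, i.e. the same ioctl "btrfs fi resize 3:max" issues. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

int main(void)
{
    struct btrfs_ioctl_vol_args args;

    int fd = open("/mnt/BTRFS", O_RDONLY);  /* any path inside the filesystem */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    memset(&args, 0, sizeof(args));
    strncpy(args.name, "3:max", sizeof(args.name) - 1);  /* "<devid>:<size>" */

    if (ioctl(fd, BTRFS_IOC_RESIZE, &args) < 0)
        perror("BTRFS_IOC_RESIZE");

    close(fd);
    return 0;
}

Doing this automatically would mean issuing the same call with the new device's devid once the replace finishes.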

jnoxon commented 5 years ago

Thanks for sharing this. I had the same issue and had no idea I had to apply the resize operation to a device rather than the entire filesystem. That's completely non-intuitive.

mk279 commented 4 years ago

Sorry for resurrecting this thread, but my problem is similar. I had 3 x 8 TB disks:

    devid    1 size 7.28TiB used 7.20TiB path /dev/sdc
    devid    2 size 7.28TiB used 7.20TiB path /dev/sdg
    devid    3 size 7.28TiB used 7.20TiB path /dev/sdd

with around 286G free space

Then I

  1. added a 16TB disk, appearing as /dev/sde
  2. did "btrfs replace start /dev/sde /dev/sdg /mnt/BTRFS"
  3. did "btrfs fi resize 2:max /mnt/BTRFS/"

Result:

    devid    1 size 7.28TiB used 7.20TiB path /dev/sdc
    devid    2 size 14.55TiB used 7.20TiB path /dev/sde
    devid    3 size 7.28TiB used 7.20TiB path /dev/sdd

So far so good. BUT: df -h shows a strange result:

/dev/sdc         30T   22T  293G  99% /mnt/BTRFS

The overall size seems to be good. But there should be around 286G + 8TB of free space, right?

I already started a full balance, but will this help?

    3233 out of about 7403 chunks balanced (3234 considered), 56% left

Kernel version: 4.19.0

Any hint appreciated.

marcosps commented 4 years ago

@basic6 I sent some patches to the btrfs mailing list to implement this feature, let's hope they get reviewed soon :)

marcosps commented 4 years ago

Sent a v2 of this same patch just now.

MurzNN commented 2 years ago

This problem happens even without a replace, for example when I grow the underlying LVM LV. I have a btrfs RAID1 using 2 devices:

# btrfs filesystem show /var/backups/brick/brick-svs/mysql
Label: none  uuid: 551b2a31-ecaa-4e74-9051-c2ea388dc2ab
        Total devices 2 FS bytes used 640.00KiB
        devid    1 size 35.00GiB used 2.03GiB path /dev/mapper/brick--svs--hdd1-b_brick--svs_mysql
        devid    2 size 35.00GiB used 2.03GiB path /dev/mapper/brick--svs--hdd2-b_brick--svs_mysql

# df -h /var/backups/brick/brick-svs/mysql
/dev/mapper/brick--svs--hdd1-b_brick--svs_mysql   35G  3.4M   34G   1% /var/backups/brick/brick-svs/mysql

I grow the first disk /dev/mapper/brick--svs--hdd1-b_brick--svs_mysql by 5G and resize the btrfs filesystem:

# lvextend -L+5G /dev/brick-svs-hdd1/b_brick-svs_mysql
# btrfs filesystem resize max /var/backups/brick/brick-svs/mysql
# btrfs filesystem show /var/backups/brick/brick-svs/mysql
Label: none  uuid: 551b2a31-ecaa-4e74-9051-c2ea388dc2ab
        Total devices 2 FS bytes used 640.00KiB
        devid    1 size 40.00GiB used 2.03GiB path /dev/mapper/brick--svs--hdd1-b_brick--svs_mysql
        devid    2 size 35.00GiB used 2.03GiB path /dev/mapper/brick--svs--hdd2-b_brick--svs_mysql

# df -h /var/backups/brick/brick-svs/mysql
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/mapper/brick--svs--hdd1-b_brick--svs_mysql   38G  3.9M   34G   1% /var/backups/brick/brick-svs/mysql

And after growing the second drive the same way, btrfs still sees device 2 as 35G (resize max without an explicit devid applies only to devid 1):

# lvextend -L+5G /dev/brick-svs-hdd2/b_brick-svs_mysql
# btrfs filesystem resize max /var/backups/brick/brick-svs/mysql
# btrfs filesystem show /var/backups/brick/brick-svs/mysql
Label: none  uuid: 551b2a31-ecaa-4e74-9051-c2ea388dc2ab
        Total devices 2 FS bytes used 640.00KiB
        devid    1 size 40.00GiB used 2.03GiB path /dev/mapper/brick--svs--hdd1-b_brick--svs_mysql
        devid    2 size 35.00GiB used 2.03GiB path /dev/mapper/brick--svs--hdd2-b_brick--svs_mysql

# df -h /var/backups/brick/brick-svs/mysql
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/mapper/brick--svs--hdd1-b_brick--svs_mysql   38G  3.9M   34G   1% /var/backups/brick/brick-svs/mysql

Only passing the needed device id manually resizes the btrfs filesystem on that device to the new size:

# btrfs filesystem resize 2:max /var/backups/brick/brick-svs/mysql
Resize '/var/backups/brick/brick-svs/mysql' of '2:max'
# btrfs filesystem show /var/backups/brick/brick-svs/mysql
Label: none  uuid: 551b2a31-ecaa-4e74-9051-c2ea388dc2ab
        Total devices 2 FS bytes used 640.00KiB
        devid    1 size 40.00GiB used 2.03GiB path /dev/mapper/brick--svs--hdd1-b_brick--svs_mysql
        devid    2 size 40.00GiB used 2.03GiB path /dev/mapper/brick--svs--hdd2-b_brick--svs_mysql

# df -h /var/backups/brick/brick-svs/mysql
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/mapper/brick--svs--hdd1-b_brick--svs_mysql   40G  3.9M   39G   1% /var/backups/brick/brick-svs/mysql

It would be good to improve this by making the resize max command resize all drives, because I spent a lot of time trying to understand why only the first drive was resized successfully.
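
For illustration only, here is a minimal sketch (not the patch from the mailing list) of what "resize every device to max" could look like: enumerate the present devids with the existing BTRFS_IOC_FS_INFO and BTRFS_IOC_DEV_INFO ioctls, then issue a "<devid>:max" resize for each. The mount path is taken from the example above:

/* Sketch: resize every device of a mounted btrfs filesystem to its
 * maximum size, the per-devid equivalent of "btrfs fi resize <id>:max". */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

int main(void)
{
    struct btrfs_ioctl_fs_info_args fs = { 0 };

    int fd = open("/var/backups/brick/brick-svs/mysql", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    if (ioctl(fd, BTRFS_IOC_FS_INFO, &fs) < 0) {
        perror("BTRFS_IOC_FS_INFO");
        return 1;
    }

    for (__u64 id = 1; id <= fs.max_id; id++) {
        struct btrfs_ioctl_dev_info_args dev = { .devid = id };
        if (ioctl(fd, BTRFS_IOC_DEV_INFO, &dev) < 0)
            continue;  /* devid not present (e.g. removed earlier) */

        struct btrfs_ioctl_vol_args resize = { 0 };
        snprintf(resize.name, sizeof(resize.name), "%llu:max",
                 (unsigned long long)id);
        if (ioctl(fd, BTRFS_IOC_RESIZE, &resize) < 0)
            fprintf(stderr, "resize devid %llu: %s\n",
                    (unsigned long long)id, strerror(errno));
    }

    close(fd);
    return 0;
}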

MurzNN commented 2 years ago

@marcosps, did you create a PR with your implementation? Can't find it in https://github.com/kdave/btrfs-progs/pulls?q=is%3Apr+

marcosps commented 2 years ago

@MurzNN the patch wasn't accepted, for lack of review or maybe some other reason.