knorrie / python-btrfs

Python Btrfs module
GNU Lesser General Public License v3.0
112 stars 22 forks source link

btrfs-space-calculator crashes with raid5 #19

Closed arvidjaar closed 4 years ago

arvidjaar commented 5 years ago
bor@bor-Latitude-E5450:~/src/python-btrfs$ git describe HEAD
v11
bor@bor-Latitude-E5450:~/src/python-btrfs$ ./bin/btrfs-space-calculator -m raid0 -d raid5 3T 1T 2T
Traceback (most recent call last):
  File "./bin/btrfs-space-calculator", line 189, in <module>
    main()
  File "./bin/btrfs-space-calculator", line 158, in main
    usage = btrfs.fs_usage.FsUsage(fs)
  File "/home/bor/src/python-btrfs/bin/btrfs/fs_usage.py", line 449, in __init__
    dev_extent_length = btrfs.volumes.chunk_to_dev_extent_length(chunk)
  File "/home/bor/src/python-btrfs/bin/btrfs/volumes.py", line 212, in chunk_to_dev_extent_length
    return chunk_length_to_dev_extent_length(chunk.type, chunk.num_stripes, chunk.length)
  File "/home/bor/src/python-btrfs/bin/btrfs/volumes.py", line 192, in chunk_length_to_dev_extent_length
    dev_extent_length = raw_data_bytes // num_data_stripes
ZeroDivisionError: integer division or modulo by zero
bor@bor-Latitude-E5450:~/src/python-btrfs$ ./bin/btrfs-space-calculator -m raid0 -d raid6 3T 1T 2T
Target metadata profile: RAID0
Target data profile: RAID6
Mixed block groups: False
Total raw filesystem size: 5.46TiB
Device sizes:
  Device 1: 2.73TiB
  Device 2: 931.32GiB
  Device 3: 1.82TiB
Metadata to data ratio: 1:200
Estimated virtual space to use for metadata: 5.00GiB
Estimated virtual space to use for data: 929.66GiB
Total unallocatable raw amount: 2.73TiB
Unallocatable raw bytes per device:
  Device 1: 1.82TiB
  Device 2: 0.00B
  Device 3: 931.32GiB
bor@bor-Latitude-E5450:~/src/python-btrfs$ 

It does not matter whether I use raid5 for data or metadata. Other profiles seem to work after quick testing.

knorrie commented 5 years ago

Thanks for the report.

The way this space calculator works is a little bit naughty. By constructing some fake replacements for the official Chunk, Stripe etc classes, we fool the usage report module and have it believe it's looking at an actual filesystem. The input in those fake objects is rather simple, and not completely correct. (A 1 byte chunk etc...) This is fragile, because FakeChunk has num_stripes 1 and then RAID5 subtracts the amount of parity, then it's 0 and BOOM. Funny enough, raid6 makes it end up as -1.

I just looked back at my bash history, grepping for all the options I ever used while writing this, and I see a lot of raid6, raid1 and other things, but no raid5. Great.

I guess starting to add proper tests will full code coverage and testing a lot of inputs on scripts is not a luxury at this point. I might start doing that during this year, but I'll first continue on the turorial.

Anyway, this can be quick fixed by just putting any number >=3 in that place, but it doesn't feel right, because this type of bug will appear in the future again.

The other way to fix it, is to have the usage report system accept some extra options, like forcibly setting target profiles (this is why the fake chunk exists, to have it detect target profile).

knorrie commented 5 years ago

I think I have an acceptable fix here now, that adds providing data_metadata_ratio (whoa, another bug, that -r didn't have any effect at all?) and target profiles for the simulation. I'll clean up and push it tomorrow. It removed all FakeStuff except for the FakeFileSystem which needs to have FakeDevItem of the right sizes.

Fun fact is that this also makes it possible to point the usage code to a real running filesystem and then tell it to predict what would happen when converting it to another profile (like, if it would be possible at all).

knorrie commented 5 years ago

I pushed something to develop branch. Please try it. :)

arvidjaar commented 5 years ago
 15 1T 2T 3T
Error: When using mixed groups, metadata and data profile need to be identical.
bor@bor-Latitude-E5450:~/src/python-btrfs$ ./bin/btrfs-space-calculator -m raid5 -d raid5 -M -r 15 1T 2T 3T
Traceback (most recent call last):
  File "./bin/btrfs-space-calculator", line 167, in <module>
    main()
  File "./bin/btrfs-space-calculator", line 137, in main
    target_profile_mixed=metadata_profile,
  File "/home/bor/src/python-btrfs/bin/btrfs/fs_usage.py", line 559, in __init__
    self._simulate_chunk_allocations(device_sizes)
  File "/home/bor/src/python-btrfs/bin/btrfs/fs_usage.py", line 735, in _simulate_chunk_allocations
    chunk_size = self._alloc_chunk(_sizes, flags)
  File "/home/bor/src/python-btrfs/bin/btrfs/fs_usage.py", line 674, in _alloc_chunk
    raise ValueError("Only DATA and METADATA supported here")
ValueError: Only DATA and METADATA supported here
bor@bor-Latitude-E5450:~/src/python-btrfs$ git describe
v11-6-g0e7a393
bor@bor-Latitude-E5450:~/src/python-btrfs$ 
knorrie commented 5 years ago

Yes, this clearly needs proper tests that try all combinations of input.

knorrie commented 5 years ago

FYI, because this is just too ugly to be true, I included it in an unblock request for Debian Buster: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=925306

Fingers crossed! :)

knorrie commented 5 years ago

By the way: the fix for Buster on top of v11 is 2 lines, not the >120 lines future proof change that's in develop branch.