westerndigitalcorporation / zenfs

ZenFS is a storage backend for RocksDB that enables support for ZNS SSDs and SMR HDDs.
GNU General Public License v2.0

IO error: No space left on device…, but there are still some empty zones #271

Closed zhoulening969 closed 1 year ago

zhoulening969 commented 1 year ago

When I use db_bench for testing:

root@pm-pr4906w-83:/home/zln/rocksdb/plugin/zenfs/tests# ./zenfs_base_performance.sh nullb0
ZenFS file system created. Free space: 99328 MB
Using URI zenfs://dev:nullb0
Running ZenFS baseline performance tests, results will be stored in results/zenfs-nullb0-baseline_performance 
Running wl_test/0000_fillrandom.sh 
RocksDB:    version 7.10.2
Date:       Mon Jun 12 18:49:57 2023
CPU:        104 * Intel(R) Xeon(R) Gold 5320 CPU @ 2.20GHz
CPUCache:   39936 KB
put error: IO error: No space left on device: Zone allocation failure

Test duration 0h 13m 12s
FAILED

1 TESTS FAILED

But after checking the device information:

root@pm-pr4906w-83:/home/zln/rocksdb/plugin/zenfs/tests# zbd report -i /dev/nullb0 
Device /dev/nullb0:
    Vendor ID: Unknown
    Zone model: host-managed
    Capacity: 107.374 GB (209715200 512-bytes sectors)
    Logical blocks: 26214400 blocks of 4096 B
    Physical blocks: 26214400 blocks of 4096 B
    Zones: 100 zones of 1024.0 MB
    Maximum number of open zones: no limit
    Maximum number of active zones: 14
.......
Zone 00059: swr, ofst 00063350767616, len 00001073741824, cap 00001073741824, wp 00063350767616, em, non_seq 0, reset 0
Zone 00060: swr, ofst 00064424509440, len 00001073741824, cap 00001073741824, wp 00065498251264, fu, non_seq 0, reset 0
Zone 00061: swr, ofst 00065498251264, len 00001073741824, cap 00001073741824, wp 00066571993088, fu, non_seq 0, reset 0
Zone 00062: swr, ofst 00066571993088, len 00001073741824, cap 00001073741824, wp 00067645734912, fu, non_seq 0, reset 0
Zone 00063: swr, ofst 00067645734912, len 00001073741824, cap 00001073741824, wp 00067645734912, em, non_seq 0, reset 0
Zone 00064: swr, ofst 00068719476736, len 00001073741824, cap 00001073741824, wp 00069471559680, cl, non_seq 0, reset 0
Zone 00065: swr, ofst 00069793218560, len 00001073741824, cap 00001073741824, wp 00070866960384, fu, non_seq 0, reset 0
Zone 00066: swr, ofst 00070866960384, len 00001073741824, cap 00001073741824, wp 00071940702208, fu, non_seq 0, reset 0
Zone 00067: swr, ofst 00071940702208, len 00001073741824, cap 00001073741824, wp 00073014444032, fu, non_seq 0, reset 0
Zone 00068: swr, ofst 00073014444032, len 00001073741824, cap 00001073741824, wp 00073014444032, em, non_seq 0, reset 0
........
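A report like the one above can be tallied by zone condition instead of being read line by line. The following is a minimal sketch, assuming the report format shown here, where the condition (`em`, `fu`, `cl`, ...) is the sixth comma-separated field of each `Zone` line. The function name `count_zone_conditions` is hypothetical and not part of zbd or ZenFS.

```shell
#!/bin/sh
# Tally zones by condition from `zbd report` output read on stdin.
# Assumes lines of the form shown in this thread:
#   Zone 00059: swr, ofst ..., len ..., cap ..., wp ..., em, non_seq 0, reset 0
# so the condition is field 6 when splitting on ", ".
count_zone_conditions() {
  awk -F', ' '/^Zone [0-9]+:/ { n[$6]++ }
              END { for (c in n) print c, n[c] }' | sort
}

# Usage (requires the device to be present):
#   zbd report -i /dev/nullb0 | count_zone_conditions
```

Fed only the ten zones listed above (00059 to 00068), this prints `cl 1`, `em 3` and `fu 6`. The counts describe the device when the report is taken, which may be later than the moment the allocation failed.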

There are still more than three empty zones, which means there is actually a lot of space left, so this error confuses me. I am using RocksDB v7.10.2 with ZenFS master. The parameters are as follows:

DEV=$1
CAP_SECTORS=$(blkzone report -c 5 /dev/$DEV | grep -oP '(?<=cap )[0-9xa-f]+' | head -1)
ZONE_CAP=$(($CAP_SECTORS * 512))
ZONE_LEVE=$(( 1024 * 1024 * 1024 * 4 ))
DOFP_MICRO=$(( 30 * 1000 * 1000 ))
echo "--write_buffer_size=$ZONE_CAP --target_file_size_base=$ZONE_CAP  --max_bytes_for_level_base=$ZONE_LEVE --use_direct_io_for_flush_and_compaction --max_bytes_for_level_multiplier=4 --max_background_jobs=8 --delete_obsolete_files_period_micros=$DOFP_MICRO --use_direct_reads"

# Common settings
NUM=170000000
KEY_SIZE=20
VALUE_SIZE=800
DB_BENCH_PARAMS="--benchmarks=fillrandom --num=$NUM --key_size=$KEY_SIZE --value_size=$VALUE_SIZE --histogram $FS_PARAMS $DB_BENCH_EXTRA_PARAMS"
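It may be worth comparing the raw amount of data these settings generate with the free space ZenFS reported. The following is a rough sketch, assuming each entry costs exactly key size plus value size and ignoring compression, metadata and the extra space compaction needs, so it is only an estimate. The helper name `raw_dataset_mib` is hypothetical.

```shell
#!/bin/bash
# Raw user data written by fillrandom, in MiB:
#   num * (key_size + value_size) / 2^20
# Ignores compression and compaction overhead (assumption).
raw_dataset_mib() {
  echo $(( $1 * ($2 + $3) / 1048576 ))
}

# With NUM, KEY_SIZE and VALUE_SIZE from the settings above:
raw_dataset_mib 170000000 20 800   # prints 132942
```

That is about 132942 MiB of uncompressed key-value data against the 99328 MB of free space reported when the file system was created. Whether this contributes to the allocation failure depends on how well the data compresses and how quickly obsolete files are reclaimed.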

zhoulening969 commented 1 year ago

I emulated a 100 GB ZNS SSD with one hundred 1 GB zones in DRAM using null_blk. The following parameters were set using the nullblk-zoned.sh script:

...
echo 0 > "$dev"/completion_nsec
echo 0 > "$dev"/irqmode
echo 2 > "$dev"/queue_mode
echo 1024 > "$dev"/hw_queue_depth
echo 1 > "$dev"/memory_backed
echo 1 > "$dev"/zoned
...

However, the write speed during the test was very slow. How can this be improved?

DB path: [rocksdbtest/dbbench]
fillrandom   :       4.350 micros/op 229891 ops/sec 652.483 seconds 150000000 operations;  179.8 MB/s
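The MB/s figure on that line can be reproduced from the other numbers on it. The following is a small sketch, assuming db_bench counts key size plus value size per operation, that the 20-byte keys and 800-byte values from the earlier settings applied to this run, and that "MB" here means 2^20 bytes. The helper name `throughput_mib_s` is hypothetical.

```shell
#!/bin/sh
# User-data throughput in MiB/s:
#   ops * (key_size + value_size) / seconds / 2^20
# Arguments: ops key_size value_size seconds
throughput_mib_s() {
  awk -v ops="$1" -v key="$2" -v val="$3" -v secs="$4" \
    'BEGIN { printf "%.1f\n", ops * (key + val) / secs / (1024 * 1024) }'
}

# Values from the fillrandom line above:
throughput_mib_s 150000000 20 800 652.483   # prints 179.8
```

The result matches the reported 179.8 MB/s, so the figure reflects uncompressed user data per second rather than bytes physically written to the device.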

yhr commented 1 year ago

Hi,

Files might have been deleted, and their zones reset as a result, after the allocation error occurred, so the report may not show the state at the time of the failure. To double-check this, you can add an assert where the allocation fails, forcing the process to exit at that point.

yhr commented 1 year ago

Try disabling compression and/or increasing the value size.

zhoulening969 commented 1 year ago

Okay, I'll give it a try. Thank you!