westerndigitalcorporation / zenfs

ZenFS is a storage backend for RocksDB that enables support for ZNS SSDs and SMR HDDs.
GNU General Public License v2.0
235 stars 86 forks source link

A Potential Bug In Garbage Collection #244

Closed attack204 closed 1 year ago

attack204 commented 1 year ago

Firstly I cloned the RocksDB from the official repo and garbage_collection branch from here

I just modified the "GC_START_LEVEL" from the default 20 to 70. But when I use the following script to test, an error happens.

The Error is

image

My test script

sudo ./db_bench -num=5000000  -key_size=16 -value_size=100 -max_bytes_for_level_base=1048576 -target_file_size_base=1048576 -write_buffer_size=4194304 writable_file_max_buffer_size=536870912  -max_bytes_for_level_multiplier=5 -max_background_compactions=1 -max_background_flushes=1 -max_background_jobs=1 -soft_pending_compaction_bytes_limit=1048576 -hard_pending_compaction_bytes_limit=1048576 -level0_stop_writes_trigger=12 -level0_slowdown_writes_trigger=8 -level0_file_num_compaction_trigger=4 -max_write_buffer_number=1  -threads=1 -compaction_pri=4 -open_files=1000 -target_file_size_multiplier=1 --fs_uri=zenfs://dev:nullb0 --benchmarks=fillrandom --use_direct_io_for_flush_and_compaction > out.txt

And I use null_blk as a simulator, the script is

sudo sh nullblk-zoned.sh 512 16 16 8 8 0 32

with the nullblk-zoned.sh

#!/bin/bash

if [ $# != 7 ]; then
        echo "Usage: $0 <sect size (B)> <zone size (MB)> <zone capacity (MB)> <max open zones> <max active zones> <nr conv zones> <nr seq zones>"
        exit 1
fi

scriptdir=$(cd $(dirname "$0") && pwd)

modprobe null_blk nr_devices=0 || return $?

function create_zoned_nullb()
{
        local nid=0
        local bs=$1
        local zs=$2
    local zc=$3
    local max_open_zones=$4
    local max_active_zones=$5
        local nr_conv=$6
        local nr_seq=$7

        cap=$(( zs * (nr_conv + nr_seq) ))

        while [ 1 ]; do
                if [ ! -b "/dev/nullb$nid" ]; then
                        break
                fi
                nid=$(( nid + 1 ))
        done

        dev="/sys/kernel/config/nullb/nullb$nid"
        mkdir "$dev"

        echo $bs > "$dev"/blocksize
        echo 0 > "$dev"/completion_nsec
        echo 0 > "$dev"/irqmode
        echo 2 > "$dev"/queue_mode
        echo 1024 > "$dev"/hw_queue_depth
        echo 1 > "$dev"/memory_backed
        echo 1 > "$dev"/zoned

        echo $cap > "$dev"/size
        echo $zs > "$dev"/zone_size
    echo $zc > "$dev"/zone_capacity
    echo $max_open_zones > "$dev"/zone_max_open
    echo $max_active_zones > "$dev"/zone_max_active
        echo $nr_conv > "$dev"/zone_nr_conv

        echo 1 > "$dev"/power

        echo mq-deadline > /sys/block/nullb$nid/queue/scheduler

        echo "$nid"
}

nulldev=$(create_zoned_nullb $1 $2 $3 $4 $5 $6 $7)
echo "Created /dev/nullb$nulldev"
yhr commented 1 year ago

@attack204 : thanks for reporting! I've run across this issue as well . The assertion is wrong.

We must hold the write lock, so we should check for that in stead.

You can fix this by just changing:

assert(!IsOpenForWR() && new_list.size() > 0); to assert(IsOpenForWR() && new_list.size() > 0);

I have a bunch of small GC fixes queued up that i'll push for review later this week.

attack204 commented 1 year ago

Thank you!