westerndigitalcorporation / zenfs

ZenFS is a storage backend for RocksDB that enables support for ZNS SSDs and SMR HDDs.
GNU General Public License v2.0

Illegal instruction when running in Alpine Linux #197

Closed. metaspace closed this issue 2 years ago.

metaspace commented 2 years ago

When running RocksDB 6.29.5 db_bench with ZenFS 2.0 in Alpine Linux, the program terminates with an illegal instruction exception.

To reproduce, build a Docker image from this Dockerfile and run it:

podman run --rm -v "/dev/nullb0:/dev/nullb0" -v "/home/aeh/src/zbdbench/zbdbench_results/2022-05-19-151248:/output"  --entrypoint sh -it zrocksdb

db_bench --fs_uri=zenfs://dev:nullb0 --key_size=20 --value_size=800 --target_file_size_base=126877696 --write_buffer_size=2147483648 --max_bytes_for_level_base=4294967296 --max_bytes_for_level_multiplier=4 --max_background_jobs=8 --max_background_compactions=8 --use_direct_io_for_flush_and_compaction --stats_dump_period_sec=15 --delete_obsolete_files_period_micros=3000000 --statistics --benchmarks=fillrandom,stats --num=1650000000
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB:    version 6.29
Date:       Mon May 23 10:55:25 2022
CPU:        32 * AMD Ryzen 9 5950X 16-Core Processor
CPUCache:   512 KB
Keys:       20 bytes each (+ 0 bytes user-defined timestamp)
Values:     800 bytes each (400 bytes after compression)
Entries:    1650000000
Prefix:    0 bytes
Keys per prefix:    0
RawSize:    1290321.4 MB (estimated)
FileSize:   660896.3 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: SkipListFactory
Perf Level: 1
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
DB path: [rocksdbtest/dbbench]
Illegal instruction (core dumped)

The problem is not present with ZenFS 1.0.2.

Backtrace:

543         EncodeFixed64(sparse_buffer, extent_length);
(gdb) bt
#0  rocksdb::ZoneFile::SparseAppend (this=0x7f70a016a1c0, sparse_buffer=0x7f709fc05000 "\370\357\005", data_size=<optimized out>)
    at plugin/zenfs/fs/io_zenfs.cc:543
#1  0x0000556b85825da9 in rocksdb::ZonedWritableFile::FlushBuffer (this=this@entry=0x7f70a014eb40) at plugin/zenfs/fs/io_zenfs.cc:909
#2  0x0000556b85825e90 in rocksdb::ZonedWritableFile::BufferedWrite (this=0x7f70a014eb40, slice=...) at plugin/zenfs/fs/io_zenfs.cc:934
#3  0x0000556b85826fa8 in rocksdb::ZonedWritableFile::Append (this=0x7f70a014eb40, data=...) at plugin/zenfs/fs/io_zenfs.cc:968
#4  0x0000556b8568445d in rocksdb::WritableFileWriter::WriteBuffered (this=0x7f70a016a480, data=<optimized out>, size=<optimized out>)
    at file/writable_file_writer.cc:517
#5  0x0000556b85687788 in rocksdb::WritableFileWriter::Flush (this=0x7f70a016a480) at ./util/aligned_buffer.h:113
#6  0x0000556b855a0c38 in rocksdb::log::Writer::AddRecord (this=0x7f70a0168f70, slice=...) at /usr/include/c++/10.3.1/bits/unique_ptr.h:173
#7  0x0000556b85542d67 in rocksdb::DBImpl::WriteToWAL (this=0x7f70a01550c0, merged_batch=..., log_writer=<optimized out>, log_used=0x0,
    log_size=0x7f709ff527a8, with_db_mutex=false, with_log_mutex=false) at db/db_impl/db_impl_write.cc:1113
#8  0x0000556b85544709 in rocksdb::DBImpl::WriteToWAL (this=0x7f70a01550c0, write_group=..., log_writer=0x7f70a0168f70, log_used=0x0,
    need_log_sync=<optimized out>, need_log_dir_sync=false, sequence=12966520) at db/db_impl/db_impl_write.cc:1155
#9  0x0000556b85549cf5 in rocksdb::DBImpl::PipelinedWriteImpl (this=0x7f70a01550c0, write_options=..., my_batch=<optimized out>,
    callback=<optimized out>, log_used=<optimized out>, log_ref=<optimized out>, disable_memtable=<optimized out>, seq_used=<optimized out>)
    at db/db_impl/db_impl_write.cc:568
#10 0x0000556b8554c233 in rocksdb::DBImpl::WriteImpl (this=0x7f70a01550c0, write_options=..., my_batch=0x7f709ff53bd0, callback=0x0, log_used=0x0,
    log_ref=0, disable_memtable=false, seq_used=0x0, batch_cnt=0, pre_release_callback=0x0) at db/db_impl/db_impl_write.cc:157
#11 0x0000556b8554df81 in rocksdb::DBImpl::Write (this=<optimized out>, write_options=..., my_batch=<optimized out>) at db/db_impl/db_impl_write.cc:54
#12 0x0000556b85493583 in rocksdb::Benchmark::DoWrite (this=0x7ffc9325e420, thread=0x7f709ffcf0b0, write_mode=rocksdb::Benchmark::RANDOM)
    at tools/db_bench_tool.cc:5154
#13 0x0000556b8546e90a in rocksdb::Benchmark::ThreadBody (v=0x7f709ffb9d80) at tools/db_bench_tool.cc:3736
#14 0x0000556b856457e5 in rocksdb::(anonymous namespace)::StartThreadWrapper (arg=0x7f70a0110470) at env/env_posix.cc:447
#15 0x00007f70a05b7221 in ?? () from /lib/ld-musl-x86_64.so.1
#16 0x0000000000000000 in ?? ()

Git Bisect

Bisecting shows that the following commit introduced the problem:

dd926ac8d9bec8cf4ce05076bc98d03a70d1f2e9 is the first bad commit
commit dd926ac8d9bec8cf4ce05076bc98d03a70d1f2e9
Author: Hans Holmberg <hans.holmberg@wdc.com>
Date:   Wed Oct 6 14:13:04 2021 +0000

    fs: Split buffered and non-buffered appends

    Make buffered writes sparse.

    Now that we store the extent lengths in line with data
    for every buffer flush, we don't have to write metadata records
    when asked to do a data sync.

    Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>

 fs/fs_zenfs.cc |   7 +-
 fs/io_zenfs.cc | 234 +++++++++++++++++++++++++++++++++++----------------------
 fs/io_zenfs.h  |  14 +++-
 3 files changed, 163 insertions(+), 92 deletions(-)
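
The commit message above describes storing each extent length in line with the data on every buffer flush, and the backtrace shows SparseAppend calling EncodeFixed64(sparse_buffer, extent_length) at that step. The general idea of a length-prefixed append can be sketched as follows. This is a minimal illustration, not the ZenFS implementation: the names AppendSparse, ReadSparse, PutFixed64LE, GetFixed64LE and kHeaderSize are hypothetical, the 8-byte little-endian header is an assumption, and any block alignment or padding that ZenFS applies is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Assumed header size: one fixed 64-bit length per extent (hypothetical layout).
constexpr std::size_t kHeaderSize = sizeof(std::uint64_t);

// Store `value` as 8 little-endian bytes at `dst`, one byte at a time,
// so the write does not depend on host endianness or pointer alignment.
inline void PutFixed64LE(char* dst, std::uint64_t value) {
  for (std::size_t i = 0; i < kHeaderSize; ++i) {
    dst[i] = static_cast<char>((value >> (8 * i)) & 0xff);
  }
}

// Inverse of PutFixed64LE.
inline std::uint64_t GetFixed64LE(const char* src) {
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < kHeaderSize; ++i) {
    value |= static_cast<std::uint64_t>(static_cast<unsigned char>(src[i]))
             << (8 * i);
  }
  return value;
}

// Append one extent to `log`: the payload length first, then the payload.
void AppendSparse(std::vector<char>& log, const std::string& payload) {
  const std::size_t offset = log.size();
  log.resize(offset + kHeaderSize + payload.size());
  PutFixed64LE(log.data() + offset, payload.size());
  std::memcpy(log.data() + offset + kHeaderSize, payload.data(),
              payload.size());
}

// Recover every payload using only the inline length headers,
// without consulting any separate metadata record.
std::vector<std::string> ReadSparse(const std::vector<char>& log) {
  std::vector<std::string> extents;
  std::size_t pos = 0;
  while (pos + kHeaderSize <= log.size()) {
    const std::uint64_t len = GetFixed64LE(log.data() + pos);
    pos += kHeaderSize;
    assert(len <= log.size() - pos);  // this sketch does not handle truncated logs
    extents.emplace_back(log.data() + pos, static_cast<std::size_t>(len));
    pos += static_cast<std::size_t>(len);
  }
  return extents;
}
```

Because each flush carries its own length, a reader can walk the data and find the extent boundaries directly, which matches the commit's rationale for not writing a metadata record on every data sync.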

metaspace commented 2 years ago

The issue is resolved by #198.