bytedance / terarkdb

A RocksDB compatible KV storage engine with better performance
Apache License 2.0

There is no valid output when testing compact with db_bench tool. #37

Open FangleiLiu opened 3 years ago

FangleiLiu commented 3 years ago

[BUG]

Expected behavior

Recently, I have been testing terarkdb's performance with db_bench. When I execute the command "./db_bench --benchmarks=compact", the output is invalid. There should be a positive ops/sec figure, because the conditions for compaction have been met.

Actual behavior

But when I execute the command with "--use_terark_table=false" appended, the output looks correct. First, here is the invalid output:

[liufanglei@node24 output]$ ./db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB: version 5.18
Date: Mon Jan 18 09:06:24 2021
CPU: 80 * Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
CPUCache: 28160 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Memtablerep: skip_list
Perf Level: 1
DB path: [/data4/liufl/rocksdb/_terarkdb/]
compact : 2878433.000 micros/op 0 ops/sec;

And here is the correct-looking output with "--use_terark_table=false":

[liufanglei@node24 output]$ ./db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/ --use_terark_table=false
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB: version 5.18
Date: Mon Jan 18 09:07:12 2021
CPU: 80 * Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
CPUCache: 28160 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Memtablerep: skip_list
Perf Level: 1
DB path: [/data4/liufl/rocksdb/_terarkdb/]
compact : 1445.000 micros/op 692 ops/sec;

Steps to reproduce the behavior

$$ git clone http:://xxx.terarkdb.git
$$ ./build.sh (with -DCMAKE_BUILD_TYPE=Release -DWITH_TERARK_ZIP=ON)
$$ cd output
$$ ./db_bench --benchmarks=fillrandom --use_existing_db=0 --disable_auto_compactions=1 --sync=0 --db=/data4/liufl/terarkdb/db/ --wal_dir=/data4/liufl/terarkdb/wal/ --num=1000000000
$$ ./output/db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/
$$ ./output/db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/ --use_terark_table=false

Something else

When I tested the performance of readrandom, I found that there was write bandwidth in the statistics. It's strange, because the script benchmark.sh has the setting "--disable_auto_compactions=1". Maybe it's my misjudgment.

My operation steps are as follows:

$$ NUM_KEYS=100000000 NUM_THREADS=64 CACHE_SIZE=137438953472 DURATION=5400 ./benchmark.sh bulkload
$$ NUM_KEYS=100000000 NUM_THREADS=64 CACHE_SIZE=137438953472 DURATION=5400 ./benchmark.sh readrandom

Layamon commented 3 years ago

compact : 2878433.000 micros/op 0 ops/sec; vs compact : 1445.000 micros/op 692 ops/sec;

You think these two values are invalid, especially the first one, which is 0 ops/sec? Am I right?

ustcwelcome commented 3 years ago

0 ops/sec is right. If you load the data with

$$ ./db_bench --benchmarks=fillrandom --use_existing_db=0 --disable_auto_compactions=1 --sync=0 --db=/data4/liufl/terarkdb/db/ --wal_dir=/data4/liufl/terarkdb/wal/ --num=1000000000

then you should also run

$$ ./output/db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/

so that both use the terark-zip-table. If you want to run

$$ ./output/db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/ --use_terark_table=false

then you should also add --use_terark_table=false to the fillrandom benchmark before running it.

In fact, the compact benchmark performs only one CompactRange operation, so its ops/sec is < 1.0 and is printed as 0. If you run both benchmarks (fillrandom + compact) with "--use_terark_table=false", you will see the same output.

About readrandom: maybe bulkload left some flushes unfinished; when you ran readrandom, it kept flushing, so you saw some write bandwidth. The LOG file may answer your question.

yapple commented 3 years ago

compact : 2878433.000 micros/op 0 ops/sec; vs compact : 1445.000 micros/op 692 ops/sec;

You think these two values are invalid, especially the first one, which is 0 ops/sec? Am I right?

In my opinion, the first global compaction is slow, so the first output is 0 ops/sec. The second global compaction is fast because you have already finished a global compaction.

FangleiLiu commented 3 years ago

compact : 2878433.000 micros/op 0 ops/sec; vs compact : 1445.000 micros/op 692 ops/sec;

You think these two values are invalid, especially the first one, which is 0 ops/sec? Am I right?

Only the first one. Why is it so slow, no matter how much data is in the DB?

FangleiLiu commented 3 years ago

In fact compact benchmark has only one CompactRange operation ...

Thank you for your answer. Is the ops/sec of a CompactRange operation supposed to be this slow?

About readrandom: thank you for your advice, I will try to find the answer in the LOG file.

ustcwelcome commented 3 years ago

In fact compact benchmark has only one CompactRange operation ...

Thank you for your answer. Is the ops/sec of a CompactRange operation supposed to be this slow?

About readrandom: thank you for your advice, I will try to find the answer in the LOG file.

See db_bench_tool.cc line 1818 for its output. CompactRange is one operation, and that operation takes a few seconds (which is normal, because compaction is expensive). "compact : 1445.000 micros/op 692 ops/sec;" is not correct, because you used the terark-zip-table for fillrandom but the block-based table for compact. You may find a line like this in the LOG: "error: Corruption: Bad table magic number: expected 9863518390377041911, found 1234605616436508552". So the actual compaction did not happen and ended quickly (that's why the ops/sec is so large). We will fix this bug later.

FangleiLiu commented 3 years ago

In fact compact benchmark has only one CompactRange operation ...

Thank you for your answer. Is the ops/sec of a CompactRange operation supposed to be this slow? About readrandom: thank you for your advice, I will try to find the answer in the LOG file.

See db_bench_tool.cc line 1818 for its output. CompactRange is one operation, and that operation takes a few seconds (which is normal, because compaction is expensive). "compact : 1445.000 micros/op 692 ops/sec;" is not correct, because you used the terark-zip-table for fillrandom but the block-based table for compact. You may find a line like this in the LOG: "error: Corruption: Bad table magic number: expected 9863518390377041911, found 1234605616436508552". So the actual compaction did not happen and ended quickly (that's why the ops/sec is so large). We will fix this bug later.

Thanks for patiently answering my question. Please forgive me for not knowing terarkdb/rocksdb well enough. Following your guidance, I will continue to read the terarkdb/rocksdb code.