apache / kvrocks

Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol.
https://kvrocks.apache.org/
Apache License 2.0

[QUESTION] performance drops dramatically on hotkey scenario #494

Closed patpatbear closed 2 years ago

patpatbear commented 2 years ago

We are using a zset to keep track of user scores with a fixed length, but benchmarks show that if the zset is a hot key, performance can drop to 100 qps (or even to 10 qps).

I'm aware that this is caused by the rocksdb iterator iterating over and skipping too many deleted keys. Are there parameters to tweak to speed things up a bit, or any workarounds?

The following script reproduces the performance drop:

#!/bin/bash

for ((i = 0; i < 100000000; i++)); do

    /usr/bin/redis-benchmark -p 6666 -c 1 -n 10000000 -r 5000 zadd z __rand_int__ __rand_int__ > ./logs/zadd-$i.log 2>&1 &
    bench_zadd_pid=$!
    /usr/bin/redis-benchmark -p 6666 -c 1 -n 10000000 zremrangebyrank z 500 -1 > ./logs/zremrangebyrank-$i.log 2>&1 &
    bench_zrem_pid=$!

    echo "$(date "+%F %T") =$i=: bench with zadd($bench_zadd_pid)/zrem($bench_zrem_pid)"

    wait $bench_zadd_pid
    wait $bench_zrem_pid

done

git-hulk commented 2 years ago

@patpatbear Thanks for your feedback. Which kvrocks version are you using? This PR: https://github.com/KvrocksLabs/kvrocks/pull/438 should help in this case, but I haven't tried it on my side.

patpatbear commented 2 years ago

Also note that hash/set performance drops in a similar hot key scenario.

git-hulk commented 2 years ago

Yes, too many tombstones degrade performance when seeking in the LSM tree. For hash/set, this should only affect commands like hgetall; point lookup commands like hget would be fine.
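To make the tombstone cost concrete, here is a minimal toy model in Python. It is not kvrocks or RocksDB code: the key layout `z:NNNNNN`, the `TOMBSTONE` marker, and the `seek_live` helper are all hypothetical names chosen for illustration. It only shows that an iterator must step over every deleted entry in its path before it reaches live data, even when it seeks directly to the key prefix.

```python
from bisect import bisect_left

TOMBSTONE = object()  # hypothetical marker standing in for a deletion tombstone


def seek_live(entries, start_key, want):
    """Scan sorted (key, value) entries from start_key.

    Returns (live_keys, steps), where steps counts every entry the
    iterator touched, tombstones included.
    """
    keys = [k for k, _ in entries]
    i = bisect_left(keys, start_key)  # the "seek" itself is cheap
    live, steps = [], 0
    while i < len(entries) and len(live) < want:
        steps += 1
        key, value = entries[i]
        if value is not TOMBSTONE:
            live.append(key)
        i += 1
    return live, steps


# 10,000 deleted members followed by 5 live ones, all under the same prefix.
entries = [(f"z:{n:06d}", TOMBSTONE) for n in range(10_000)]
entries += [(f"z:{n:06d}", n) for n in range(10_000, 10_005)]

live, steps = seek_live(entries, "z:", want=5)
print(len(live), steps)
```

In this sketch the scan touches 10,005 entries to return 5 live keys, which mirrors why range-style commands degrade while a point lookup does not need such a scan.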

patpatbear commented 2 years ago

I updated to the most recent version (2.0.6); it seems this issue still exists.

ppb@ppb-vm:~/test/kvrocks/logs$ tailf zremrangebyrank-0.log  
zremrangebyrank z 500 -1: 154.23

git-hulk commented 2 years ago

Oh.. did you mean that deletion tombstones would slow down zadd?

patpatbear commented 2 years ago

They would slow down zremrangebyrank. But both zremrangebyrank and zadd are write commands to the same key, so both zadd and zremrangebyrank would slow down.

git-hulk commented 2 years ago

Cool, thanks @patpatbear. For this case, the single hot key indeed affects performance under write contention, since we hold the key lock when writing. I will investigate it later.

git-hulk commented 2 years ago

I inspected this case. Upper/lower bounds don't help, since deletion tombstones may have the same prefix (same version). I'm not sure whether delete range can reduce the seek time when there are many deletion tombstones. Do you have any thoughts other than forcing compaction? @shangxiaoxiong @ShooterIT

ShooterIT commented 2 years ago

Some ideas

ShooterIT commented 2 years ago

@patpatbear would you like to try my PR #508? I enabled the rocksdb prefix bloom filter.

ShooterIT commented 2 years ago

Another way is to compact deleted keys or tombstones ASAP.
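As a rough illustration of why early compaction would help, the toy Python sketch below compares scan cost before and after tombstones are dropped. It is a simplification under stated assumptions: `compact` and `scan_cost` are hypothetical helpers, and real RocksDB only discards a tombstone once compaction can prove no older version of the key remains in lower levels.

```python
TOMBSTONE = object()  # hypothetical marker standing in for a deletion tombstone


def compact(entries):
    """Toy full compaction: return a new sorted run without tombstones."""
    return [(k, v) for k, v in entries if v is not TOMBSTONE]


def scan_cost(entries, want):
    """Count entries an iterator touches before collecting `want` live keys."""
    steps = found = 0
    for _, value in entries:
        if found == want:
            break
        steps += 1
        if value is not TOMBSTONE:
            found += 1
    return steps


# 10,000 deleted members followed by 5 live ones under one prefix.
entries = [(f"z:{n:06d}", TOMBSTONE) for n in range(10_000)]
entries += [(f"z:{n:06d}", n) for n in range(10_000, 10_005)]

before = scan_cost(entries, want=5)
after = scan_cost(compact(entries), want=5)
print(before, after)
```

In this model the scan cost falls from 10,005 touched entries to 5 once the tombstones are gone. The trade-off, not modeled here, is the extra write and I/O load of compacting more often.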

ShooterIT commented 2 years ago

Optimize the current lock guards (use a rocksdb snapshot for reading instead of locks). I am still thinking about it, and I will discuss it with @git-hulk soon.

Maybe I have the wrong idea: currently, we already use snapshots for reading instead of locks. @shangxiaoxiong correct me.

tisonkun commented 2 years ago

Closed as stale. You can ask questions at the Discussions forum.