tikv / tikv

Distributed transactional key-value database, originally created to complement TiDB
https://tikv.org
Apache License 2.0

High apply wait tail latency #16016

Open Connor1996 opened 1 year ago

Connor1996 commented 1 year ago

[image] [image]

As you can see, the append log duration and the apply log duration are at most 100ms, whereas the apply wait duration is quite high, reaching 5s. Indeed, there was a write hotspot at the time.

The write flow of append and apply should be nearly the same. Since there is no slowdown in the raft append process, the raft apply wait shouldn't be that large.


Connor1996 commented 1 year ago

It turns out the cause is the difference in IO flow between raft-engine and kvdb. raft-engine performs lz4 compression on write while kvdb doesn't enable WAL compression, so when the data is highly repetitive, the size after compression is quite small and raft-engine writes far fewer bytes than kvdb for the same entries.
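To illustrate the size gap described above, here is a minimal sketch in Python. It is not TiKV or raft-engine code: `zlib` from the standard library stands in for lz4 (a different algorithm, used here only because it needs no extra dependency), and the `wal_bytes` helper and the 1000 near-identical 1 KiB records are hypothetical.

```python
import zlib


def wal_bytes(records, compress):
    """Bytes that would hit the log for one batch (toy model, not TiKV code)."""
    payload = b"".join(records)
    # zlib is a stand-in for lz4 here; both shrink repetitive data heavily.
    return len(zlib.compress(payload)) if compress else len(payload)


# Hypothetical hotspot workload: 1000 near-identical 1 KiB values.
records = [b"user_profile_v1:" + b"x" * 1008 for _ in range(1000)]

raw = wal_bytes(records, compress=False)    # what an uncompressed WAL writes
packed = wal_bytes(records, compress=True)  # what a compressing log writes

print(f"uncompressed: {raw} bytes, compressed: {packed} bytes")
```

With data this repetitive, the compressed batch is a small fraction of the raw batch, which is the kind of IO difference the comment attributes to raft-engine versus kvdb.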

LykxSassinator commented 11 months ago

Waiting for enhancements in later work.

tonyxuqqi commented 10 months ago

cc @v01dstar After rocksdb is upgraded and WAL compression is supported, this issue should be mitigated.

mzygQAQ commented 6 months ago

@tonyxuqqi @Connor1996 May I ask what the possible reasons for high apply wait are? I have a TiKV cluster where the applyWaitDuration P99 can reach 3-5 seconds, but the applyDuration is not slow (1-2ms P99). The apply pool has 4 threads, and I have checked their CPU consumption, which is approximately 20% per thread. The machine-level CPU idle is also very high. I am unable to identify the cause or make improvements. I checked the code and found that the apply wait duration measures the time a committed entry spends in the apply BatchSystem (pending in the queue), but I don't know why it's so slow. Can you help me? Thanks!
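As a rough illustration of how queue wait and apply time can diverge, here is a toy FIFO queue simulation in Python. It is not TiKV's BatchSystem: the `simulate` function, the burst of 2000 entries, and the 2ms apply cost are all assumptions chosen for the example, and the single-worker case is only a model of a burst that ends up served by one thread.

```python
def simulate(arrivals_ms, apply_ms, workers=1):
    """Toy FIFO queue: return per-entry (wait, apply) durations in ms."""
    free_at = [0.0] * workers  # time at which each worker becomes idle
    waits, applies = [], []
    for t in arrivals_ms:  # arrivals are assumed to be sorted
        i = min(range(workers), key=free_at.__getitem__)  # earliest idle worker
        start = max(t, free_at[i])
        free_at[i] = start + apply_ms
        waits.append(start - t)   # time spent pending in the queue
        applies.append(apply_ms)  # time spent actually applying
    return waits, applies


# Hypothetical burst: 2000 entries arrive at once, each takes 2ms to apply.
burst = [0.0] * 2000
waits_1, applies_1 = simulate(burst, 2.0, workers=1)
waits_4, _ = simulate(burst, 2.0, workers=4)

print(max(applies_1), max(waits_1), max(waits_4))
```

In this model every entry still applies in 2ms, yet the last entry waits about 4 seconds behind a single worker and about 1 second behind four. Wait time reflects the backlog ahead of an entry, not the cost of applying it.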

LykxSassinator commented 6 months ago

Here are several clues for you to investigate further:

mzygQAQ commented 6 months ago

@LykxSassinator In my scenario, apply is fast, but apply_wait is high. I think apply_wait is the time an entry stays in the queue? So it shouldn't involve the RocksDB mutex or the loading of evicted entries that you mentioned, should it?

LykxSassinator commented 5 months ago

@guoxiangCN Please check the other items mentioned above.

LykxSassinator commented 2 months ago

https://github.com/tikv/tikv/pull/17408 is made to mitigate the issue where apply wait latency is high when merging numerous small regions.