yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[DocDB] Slow reads from SST files should provide feedback loop to indicate the need for compactions. #13632

Open rthallamko3 opened 2 years ago

rthallamko3 commented 2 years ago

Jira Link: DB-3198

Description

Slow reads from SST files should provide a feedback loop to indicate the need for compactions.

While scheduled background compactions can help with slow reads, it would be good to also trigger compactions based on this read-side feedback.

jmeehan16 commented 2 years ago

Relevant internal Slack thread

The basic gist is that our compaction strategy (universal compaction with one level) is optimized for insert- and read-heavy workloads. In workloads with a lot of deletes or updates, we can end up holding onto a lot of tombstoned data, which forces many extra key lookups across files in RocksDB.

We can recognize when we're doing a lot of extra work on reads by comparing the number of key lookups in RocksDB against the number of records read by a YSQL query. There are a number of stats on the RocksDB side related to key deletions during compaction. There may be something similar for key lookups - if not, that would be easy to add.

This type of comparison could be used to determine a good time to schedule a full compaction (#7614).
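
To make the comparison above concrete, here is a minimal sketch of the heuristic, not an actual DocDB implementation. It assumes we can sample two counters per tablet over some window: the number of keys RocksDB examined and the number of records returned to the YSQL query. The names `ReadStats`, `rocksdb_keys_examined`, `records_returned`, and both threshold defaults are hypothetical and would need tuning against real workloads.

```python
from dataclasses import dataclass


@dataclass
class ReadStats:
    """Counters sampled over one observation window (hypothetical names)."""
    rocksdb_keys_examined: int  # keys RocksDB looked at, including tombstoned ones
    records_returned: int       # records actually handed back to the YSQL query


def read_amplification(stats: ReadStats) -> float:
    """Ratio of keys examined to records returned; close to 1.0 means little wasted work."""
    # Guard against division by zero when a window returned no records.
    return stats.rocksdb_keys_examined / max(stats.records_returned, 1)


def should_schedule_full_compaction(
    stats: ReadStats,
    amplification_threshold: float = 10.0,  # assumed value, not from this issue
    min_keys_examined: int = 10_000,        # assumed: ignore windows with too little data
) -> bool:
    """Suggest a full compaction when reads do far more lookups than they return rows."""
    if stats.rocksdb_keys_examined < min_keys_examined:
        return False
    return read_amplification(stats) >= amplification_threshold


if __name__ == "__main__":
    # A delete-heavy tablet: 50,000 keys examined to return 1,000 records.
    window = ReadStats(rocksdb_keys_examined=50_000, records_returned=1_000)
    print(read_amplification(window), should_schedule_full_compaction(window))
```

The minimum-sample guard is there so that a handful of point reads on a mostly idle tablet cannot trigger an expensive full compaction on its own.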