zaidoon1 opened 3 weeks ago
I ran the sst_dump tool to analyze the SST files. I did it on a server that is also seeing the same memory pattern but has a much smaller DB size, to make dumping the data easier/faster:
sst_dump results: dump.txt
example:
# data blocks: 39433
# entries: 904072
# deletions: 0
# merge operands: 0
# range deletions: 0
raw key size: 260872449
raw average key size: 288.552736
raw value size: 0
raw average value size: 0.000000
data block size: 67110378
index block size (user-key? 1, delta-value? 1): 44931865
filter block size: 98293
# entries for filter: 127655
(estimated) table size: 112140536
filter policy name: ribbonfilter
prefix extractor name: uid_extractor
column family ID: 1
column family name: url
comparator name: leveldb.BytewiseComparator
user defined timestamps persisted: true
merge operator name: nullptr
property collectors names: []
SST file compression algo: ZSTD
SST file compression options: window_bits=-14; level=32767; strategy=0; max_dict_bytes=0; zstd_max_train_bytes=0; enabled=1; max_dict_buffer_bytes=0; use_zstd_dict_trainer=1;
creation time: 1712869978
time stamp of earliest key: 0
file creation time: 1724035417
slow compression estimated data size: 0
fast compression estimated data size: 0
Looking at the SST files for the url cf, the index block sizes of all the url SST files added together come to roughly the 300 MB of memory used by the url cf, so that tracks/makes sense.
Based on this, what options do I have to deal with the large index block size? I guess changing the block size from 4KB to 16KB should cut the index block size by a factor of ~4, if I'm understanding this correctly? But also, in general, as the DB size grows, how do we make sure the index block size doesn't result in an OOM?
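Sanity-checking that intuition against the sample file above (back-of-envelope only; separator-key shortening and index restart intervals will shift the constants):

```cpp
#include <cstdint>

// A binary-search index holds roughly one entry (separator key + block
// handle) per data block, so index size tracks the number of data blocks.
constexpr uint64_t kDataBlocks = 39433;     // "# data blocks" above
constexpr uint64_t kIndexBytes = 44931865;  // "index block size" above
constexpr uint64_t kBytesPerEntry = kIndexBytes / kDataBlocks;  // ~1139 B
// 16KB data blocks instead of 4KB => ~1/4 as many blocks per file:
constexpr uint64_t kIndexBytes16K =
    (kDataBlocks / 4) * kBytesPerEntry;     // ~11 MB, down from ~45 MB
```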
Setting the block size to 16KB "solved" the problem for me, but only because my database size is small, so it's a workaround more than a solution.
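For the record, the change amounts to this (a minimal sketch against the C++ API; the MakeOptions helper is just for illustration):

```cpp
#include <rocksdb/options.h>
#include <rocksdb/table.h>

rocksdb::Options MakeOptions() {
  rocksdb::BlockBasedTableOptions table_options;
  // Larger data blocks => fewer blocks per file => fewer index entries.
  table_options.block_size = 16 * 1024;  // up from the 4KB default
  rocksdb::Options options;
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));
  return options;
}
```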
A lot of historical context can be found here: https://github.com/facebook/rocksdb/issues/12579
Background: when I first started using RocksDB, I had the following options set:
Then I saw that RocksDB was maxing out memory/CPU usage, and after opening the issue above and doing some investigation with the help of @ajkr, we realized it was caused by index/filter block thrashing. The workarounds proposed at the time (https://github.com/facebook/rocksdb/issues/12579#issuecomment-2094349798) were:
and I went with just setting cache_index_and_filter_blocks=false. This was fine for many months, until recently RocksDB started using up all memory again:

As we can see from the last screenshot, the "url" cf index/filter blocks started using up a significant amount of memory; presumably the read patterns changed, as we only see this on a few machines.
First, it's odd that the url cf is using this much to begin with, even if we are loading all of the index/filter blocks (due to not caching index and filter blocks in block cache). There are about 150 SST files for the url cf; I'm using a prefix extractor + ribbon filter with 10 bits and 302607723 KVs. The url cf is 8.34 GiB on disk, and yet index and filter blocks are somehow using 4.68 GB of memory and still going up. Something doesn't add up here.
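To make "doesn't add up" concrete, my rough expectation was (bloom-equivalent math; a ribbon filter at the same false-positive rate should land somewhat below this):

```cpp
#include <cstdint>

constexpr uint64_t kKeys = 302607723;
// ~10 bits per key of filter across all keys:
constexpr uint64_t kFilterBytes = kKeys * 10 / 8;  // ~378 MB
// Even filters plus a few hundred MB of index is nowhere near 4.68 GB.
```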
Second, let's assume this amount of usage is normal; what is the workaround here? Apparently I can't pin all index and filter blocks in memory, because I don't have enough memory, but I also don't want to just enable cache_index_and_filter_blocks, because I don't want to run back into the thrashing issues I saw in the previous issue linked above. Are there specific RocksDB settings that solve both problems?
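Concretely, is the partitioned index/filter setup the intended middle ground here? Something like the following (a sketch against the C++ API, untested, just to make the question precise; the helper name is made up):

```cpp
#include <rocksdb/table.h>

rocksdb::BlockBasedTableOptions MakePartitionedTableOptions() {
  rocksdb::BlockBasedTableOptions table_options;
  // Partitioned index + filters: only the small top-level index/filter
  // stays resident, and the partitions move through block cache on demand.
  table_options.index_type =
      rocksdb::BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch;
  table_options.partition_filters = true;
  table_options.cache_index_and_filter_blocks = true;
  table_options.cache_index_and_filter_blocks_with_high_priority = true;
  table_options.pin_top_level_index_and_filter = true;
  return table_options;
}
```

My understanding is that this bounds resident index/filter memory to the top-level pieces and limits any thrashing to partition granularity, but I'd like confirmation before flipping it on.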
I was reading https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB#indexes-and-filter-blocks and it states:
Should I change the block size from the default 4KB to 16KB?
OPTIONS.txt