Open jimmyyan opened 1 year ago
```
[2023/10/31 11:18:58.027 +08:00] [INFO] [meta_engine_pebble.go:569] ["pebble engine metrics "] [fsId=f14mcarbnoqb] [Metrics="
      |                             |       |       |   ingested   |     moved    |    written   |       |    amp   |     multilevel
level | tables  size val-bl vtables | score |   in  | tables  size | tables  size | tables  size |  read |   r   w  |    top    in  read
------+-----------------------------+-------+-------+--------------+--------------+--------------+-------+----------+-------------------
    0 |      0    0B     0B       0 |  0.00 |  14GB |      0    0B |      0    0B |    286  14GB | 5.3GB |   0 1.0  |     0B    0B    0B
    1 |      0    0B     0B       0 |  0.00 |    0B |      0    0B |      0    0B |      0    0B |    0B |   0 0.0  |     0B    0B    0B
    2 |      0    0B     0B       0 |  0.00 |    0B |      0    0B |      0    0B |      0    0B |    0B |   0 0.0  |     0B    0B    0B
    3 |      0    0B     0B       0 |  0.00 |    0B |      0    0B |      0    0B |      0    0B |    0B |   0 0.0  |     0B    0B    0B
    4 |      1  67MB     0B       0 |  0.17 | 4.2GB |      0    0B |      1  65MB |     70 7.9GB | 7.9GB |   1 1.9  |     0B    0B    0B
    5 |     10 1.4GB     0B       0 |  0.80 | 6.8GB |      0    0B |     11 1.0GB |     98  12GB |  12GB |   1 1.8  |     0B    0B    0B
    6 |     31 7.0GB     0B       0 |     - | 7.4GB |      0    0B |      0    0B |     93  18GB |  19GB |   1 2.5  |  746MB 3.2GB 9.0GB
total |     42 8.4GB     0B       0 |     - |  14GB |      0    0B |     12 1.1GB |    547  67GB |  44GB |   3 4.7  |  746MB 3.2GB 9.0GB
----------------------------------------------------------------------------------------------------------------------------------------
WAL: 1 files (0B) in: 0B written: 14GB (0% overhead)
Flushes: 33
Compactions: 130 estimated debt: 0B in progress: 0 (0B)
 default: 118 delete: 0 elision: 0 move: 12 read: 0 rewrite: 0 multi-level: 6
MemTables: 1 (64MB) zombie: 1 (64MB)
Zombie tables: 0 (0B)
Backing tables: 0 (0B)
Virtual tables: 0 (0B)
Block cache: 151 entries (7.1MB) hit rate: 15.1%
Table cache: 37 entries (29KB) hit rate: 99.9%
Secondary cache: 0 entries (0B) hit rate: 0.0%
Snapshots: 0 earliest seq num: 0
Table iters: 1
Filter utility: 98.4%
Ingestions: 0 as flushable: 0 (0B in 0 tables)
"]
```
The metrics above show that the block cache hit rate is very low (15.1%) and that the cache is holding very little data (7.1MB out of 3GB).
We found the cause of the OOM. db.Get() uses the bloom filter to optimize block reads. Our target_file_size is configured to 64MB, which gives L6 a target file size of 4GB, so the filter blocks become very large (6MB-12MB).
From reading the code, we believed the filter block was not cached after a read finished, so heavy random reads would read the filter block repeatedly, increasing memory usage.
Is there any guidance on how target_file_size should be set to avoid such large filter blocks? And why is the filter block not cached?
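For context, here is a minimal sketch of how options like these are typically wired up with the pebble v1 Go options API. The values are the ones described in this thread rather than the reporter's actual code, and the per-level doubling mirrors pebble's default target-file-size behavior when higher levels are left unset:

```go
package main

import (
	"log"

	"github.com/cockroachdb/pebble"
	"github.com/cockroachdb/pebble/bloom"
)

func main() {
	cache := pebble.NewCache(3 << 30) // 3GB block cache, as configured in this report
	defer cache.Unref()

	opts := &pebble.Options{Cache: cache}
	opts.Levels = make([]pebble.LevelOptions, 7)
	for i := range opts.Levels {
		l := &opts.Levels[i]
		l.BlockSize = 16 << 10                  // 16KB data blocks
		l.FilterPolicy = bloom.FilterPolicy(10) // bloom filter at 10 bits per key
		if i == 0 {
			l.TargetFileSize = 64 << 20 // 64MB target file size at L0
		} else {
			// Doubling per level gives 64MB * 2^6 = 4GB files at L6,
			// which is where the 6MB-12MB filter blocks come from.
			l.TargetFileSize = opts.Levels[i-1].TargetFileSize * 2
		}
	}

	db, err := pebble.Open("demo-db", opts) // "demo-db" is a placeholder path
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```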
> From reading the code, we believed the filter block was not cached after a read finished
This is not correct. Reading the filter block uses readBlock, which adds the filter block to the block cache on misses and pulls it from the block cache on hits.
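In other words, the filter block goes through the same cache-aside path as any other block. A toy illustration of the lookup pattern being described (this is not pebble's actual readBlock code):

```go
package main

import "fmt"

// blockCache stands in for pebble's sharded block cache.
var blockCache = map[uint64][]byte{}

// readFromFile stands in for reading (and decompressing) a block from an sstable.
func readFromFile(handle uint64) []byte {
	return []byte(fmt.Sprintf("block-%d", handle))
}

// getBlock is the cache-aside pattern: serve hits from the cache, and on a
// miss read the block, insert it, and return it.
func getBlock(handle uint64) []byte {
	if b, ok := blockCache[handle]; ok {
		return b // hit: no new allocation, no disk read
	}
	b := readFromFile(handle) // miss: allocate and read
	blockCache[handle] = b    // insert so later reads hit
	return b
}

func main() {
	fmt.Println(string(getBlock(42))) // miss, then cached
	fmt.Println(string(getBlock(42))) // hit
}
```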
How many bits per key are you configuring the filter to use? Fewer bits per key will reduce the size of the filter block at the cost of reduced filter utility.
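For rough intuition about filter sizing: at 10 bits per key a bloom filter costs about 1.25 bytes per key, so a filter block of several megabytes implies millions of keys in a single sstable. A back-of-the-envelope calculation (illustrative numbers only):

```go
package main

import "fmt"

func main() {
	const bitsPerKey = 10
	const numKeys = 5_000_000 // hypothetical key count for one large L6 sstable

	// Approximate bloom filter size: bitsPerKey bits for every key in the table.
	filterBytes := numKeys * bitsPerKey / 8
	fmt.Printf("~%.1f MB filter block for %d keys at %d bits/key\n",
		float64(filterBytes)/(1<<20), numKeys, bitsPerKey)
	// Prints: ~6.0 MB filter block for 5000000 keys at 10 bits/key
}
```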
> the filter blocks become very large (6MB-12MB)
I do not understand how a filter block of that size could result in OOM unless you're running in a very memory-constrained environment.
@jbowens Thanks for your reply. You are right, the filter block is also cached. The bits per key is set to 10.
But when CPU and memory usage climb, the heap flame graph shows that most of the memory is allocated in the readBlock function, and the allocated objects are about 6.3MB each.
The machine we use has 192 CPU cores and 512GB of memory. The cache size is configured to 3GB. Maybe the filter block is too large to fit in a block cache shard, so it cannot stay cached and gets reloaded repeatedly, which causes the OOM.
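To make the arithmetic behind this hypothesis explicit (the one-shard-per-CPU split below is an assumption used purely for illustration, not a statement about pebble's actual cache sharding):

```go
package main

import "fmt"

func main() {
	const (
		cacheBytes  = 3 << 30   // 3GB block cache
		numShards   = 192       // assumed: one shard per logical CPU
		filterBytes = 6_300_000 // ~6.3MB allocation seen in the heap flame graph
	)

	perShard := cacheBytes / numShards
	fmt.Printf("per-shard capacity: ~%d MB\n", perShard>>20)
	fmt.Printf("one filter block would occupy ~%.0f%% of a shard\n",
		100*float64(filterBytes)/float64(perShard))
}
```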
We use pebble as the backend storage of our service, but under heavy random reads (invoked via db.Get()), pebble allocates a large amount of memory and the process OOMs. We would like to know whether there are options to control this memory usage.
We traced the heap memory used by pebble; most of the memory is allocated from cache.newValue() inside the readBlock function.
BTW, we have set the max cache size to 3GB and the block size to 16KB.
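For reference, a simplified sketch of the kind of point-lookup loop described above (hypothetical keys and counts; the one detail worth highlighting is that the io.Closer returned by db.Get must be closed once the value is no longer needed, since the value may be backed by block cache memory):

```go
package main

import (
	"encoding/binary"
	"log"
	"math/rand"

	"github.com/cockroachdb/pebble"
)

// randomReads issues point lookups against an open pebble DB. It is a
// stand-in for the service's read workload, not its real code.
func randomReads(db *pebble.DB, n int) error {
	key := make([]byte, 8)
	for i := 0; i < n; i++ {
		binary.BigEndian.PutUint64(key, rand.Uint64()) // hypothetical key layout
		value, closer, err := db.Get(key)
		if err == pebble.ErrNotFound {
			continue
		}
		if err != nil {
			return err
		}
		_ = value // the value is only valid until closer.Close() is called
		if err := closer.Close(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	db, err := pebble.Open("demo-db", &pebble.Options{}) // placeholder path and options
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := randomReads(db, 1_000_000); err != nil {
		log.Fatal(err)
	}
}
```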
Jira issue: PEBBLE-87