Open wathenjiang opened 2 months ago
Your observation is correct. This is by design: we trade memory for execution efficiency. I wouldn't call it a memory leak, because the memory is still tracked in the LRU and will be reclaimed when the LRU eventually evicts it. The code just doesn't free it right away.
At the time we implemented this logic, we did not know of ways to do caching without lock contention. Now we have things like TinyUFO, so we will optimize this in the future.
Describe the bug
RwLock<ThreadLocal<RefCell<T>>>. Please see https://github.com/cloudflare/pingora/blob/845c30d0825e7faf5e13ea88a9cc4949082bf00c/pingora-pool/src/lru.rs#L42-L50
PoolNode.idle_poll
async task (such as the release_http_session method in v2.rs). Please see https://github.com/cloudflare/pingora/blob/845c30d0825e7faf5e13ea88a9cc4949082bf00c/pingora-core/src/connectors/mod.rs#L240-L243 and https://github.com/cloudflare/pingora/blob/845c30d0825e7faf5e13ea88a9cc4949082bf00c/pingora-pool/src/connection.rs#L218-L222
pop_closed method, it will successfully remove the Connection from the PoolNode, but it cannot remove it from the Lru.
Steps to reproduce
The existence of this memory leak can be proven through the following tests:
Console printing:
Currently, the maximum amount of leaked memory depends on the configured size of the ConnectionPool.
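To make the mechanism easier to follow, here is a minimal, self-contained model of it. It is not pingora's code: `ToyPool`, `put`, and `cross_thread_removal` are hypothetical names, and the per-thread map only imitates the effect of `ThreadLocal<RefCell<T>>`. The sketch assumes the connection is inserted on one thread and removed on another, so the removing thread cannot see the LRU list that recorded it.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, Mutex};
use std::thread::{self, ThreadId};

/// Hypothetical stand-in for a pool with one LRU list per thread.
/// This only models the ownership rule; it is not pingora's implementation.
#[derive(Default)]
struct ToyPool {
    /// Shared set of idle connection IDs (plays the role of the PoolNode).
    node: Mutex<HashSet<u64>>,
    /// One LRU list per thread (imitates ThreadLocal<RefCell<T>>).
    lrus: Mutex<HashMap<ThreadId, Vec<u64>>>,
}

impl ToyPool {
    /// Register a connection in the shared node and in the caller's own LRU.
    fn put(&self, id: u64) {
        self.node.lock().unwrap().insert(id);
        let mut lrus = self.lrus.lock().unwrap();
        lrus.entry(thread::current().id()).or_default().push(id);
    }

    /// Remove a closed connection. Only the calling thread's LRU is touched,
    /// which is the restriction a thread-local container imposes.
    fn pop_closed(&self, id: u64) {
        self.node.lock().unwrap().remove(&id);
        let mut lrus = self.lrus.lock().unwrap();
        if let Some(lru) = lrus.get_mut(&thread::current().id()) {
            lru.retain(|&x| x != id);
        }
    }

    fn node_len(&self) -> usize {
        self.node.lock().unwrap().len()
    }

    /// Total number of entries across all per-thread LRU lists.
    fn lru_len(&self) -> usize {
        self.lrus.lock().unwrap().values().map(Vec::len).sum()
    }
}

/// Insert on one thread, remove on another, and report (node_len, lru_len).
fn cross_thread_removal() -> (usize, usize) {
    let pool = Arc::new(ToyPool::default());
    let p = Arc::clone(&pool);
    thread::spawn(move || p.put(1)).join().unwrap();
    let p = Arc::clone(&pool);
    thread::spawn(move || p.pop_closed(1)).join().unwrap();
    (pool.node_len(), pool.lru_len())
}
```

In this model, `cross_thread_removal()` returns `(0, 1)`: the shared node is empty, but the inserting thread's LRU still holds the entry until something evicts it. When `put` and `pop_closed` run on the same thread, both counts drop to zero.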
If I haven't missed any other important information, I guess the RwLock<ThreadLocal<RefCell<T>>> is mainly there to reduce lock contention. Another solution is to use sharded locks (every connection has its own ID, so this should not be difficult to achieve). The RocksDB project has an example of a sharded cache, please refer to: https://github.com/facebook/rocksdb/blob/master/cache/sharded_cache.cc
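As a rough sketch of the sharded-lock idea, the following uses only the Rust standard library. `ShardedTracker` and its methods are hypothetical names, not part of pingora or RocksDB, and each shard is a plain `HashMap` rather than a real LRU list. The sketch assumes connection IDs are `u64` values and picks the shard by `id % shard_count`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical tracker split into independently locked shards.
/// Any thread can reach any entry, but contention is spread across shards.
struct ShardedTracker<V> {
    shards: Vec<Mutex<HashMap<u64, V>>>,
}

impl<V> ShardedTracker<V> {
    fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        let shards = (0..shard_count)
            .map(|_| Mutex::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Pick the shard that owns this connection ID.
    fn shard(&self, id: u64) -> &Mutex<HashMap<u64, V>> {
        &self.shards[(id % self.shards.len() as u64) as usize]
    }

    /// Insert an entry; only the owning shard is locked.
    fn insert(&self, id: u64, value: V) -> Option<V> {
        self.shard(id).lock().unwrap().insert(id, value)
    }

    /// Remove an entry from whichever thread calls this.
    fn remove(&self, id: u64) -> Option<V> {
        self.shard(id).lock().unwrap().remove(&id)
    }

    /// Total number of tracked entries across all shards.
    fn len(&self) -> usize {
        self.shards.iter().map(|s| s.lock().unwrap().len()).sum()
    }
}

/// Insert on one thread, remove on another, and return the remaining count.
fn sharded_cross_thread_removal() -> usize {
    let tracker = Arc::new(ShardedTracker::new(4));
    let t = Arc::clone(&tracker);
    thread::spawn(move || {
        t.insert(1, "conn");
    })
    .join()
    .unwrap();
    let t = Arc::clone(&tracker);
    thread::spawn(move || {
        t.remove(1);
    })
    .join()
    .unwrap();
    tracker.len()
}
```

Because an entry's shard is derived from its ID rather than from the inserting thread, a removal issued from any thread finds it, and `sharded_cross_thread_removal()` leaves nothing behind. A real implementation would presumably keep an LRU list inside each shard and tune the shard count to the number of worker threads.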