opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[Searchable Snapshot] Revisit using other open source LRUCache implementation for File System caching #6225

Open aabukhalil opened 1 year ago

aabukhalil commented 1 year ago

Coming from #4964 and based on concerns raised from https://github.com/opensearch-project/OpenSearch/pull/5641#pullrequestreview-1285817231

More context is in https://github.com/opensearch-project/OpenSearch/issues/4964#issuecomment-1421689586, https://github.com/opensearch-project/OpenSearch/issues/4964#issuecomment-1421722470 and https://github.com/opensearch-project/OpenSearch/pull/5641#discussion_r1099518731

So far we have evaluated using the Guava/Caffeine caches (https://github.com/opensearch-project/OpenSearch/issues/4964#issuecomment-1421722470), but they don't support custom eviction logic (preventing eviction of entries that are in active use) and they are not open for extension. That's why https://github.com/opensearch-project/OpenSearch/pull/5641 contains a modified version of the Guava/Caffeine cache. A new suggestion on how to use the Caffeine cache came up here, so we need to re-evaluate using Guava/Caffeine.

Apache JCS was suggested in https://github.com/opensearch-project/OpenSearch/pull/5641 and it looks promising because it is more open for extension rather than modification, but I still have not been able to find a straightforward path or a perfectly fitting implementation for our cache usage.

aabukhalil commented 1 year ago

When evaluating which cache implementation to use, or whether to stick with the in-house implementation, we need to think about near-future use cases (within one year). I can think of:

aabukhalil commented 1 year ago

This is exactly how we can utilize the Caffeine cache: https://github.com/opensearch-project/OpenSearch/pull/5641#discussion_r1102032394

andrross commented 1 year ago

I put together a quick commit on my fork that shows how we can replace the hand-rolled LinkedDeque with a LinkedHashMap.
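For readers unfamiliar with the trick being referenced, a minimal sketch of replacing a hand-rolled recency list with `LinkedHashMap` might look like the following (this is an illustrative example, not the code from the linked fork; the class name `LruChunkMap` is invented for the sketch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an LRU map built on LinkedHashMap's access-order mode,
// which maintains the least- to most-recently-used ordering that a
// hand-rolled LinkedDeque would otherwise have to provide.
class LruChunkMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruChunkMap(int maxEntries) {
        // accessOrder = true: iteration order becomes least- to
        // most-recently accessed instead of insertion order.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once capacity is exceeded.
        // Note: the file cache's "never evict entries with active
        // references" rule cannot be expressed here alone; that check
        // would live in the cache layer wrapping this map.
        return size() > maxEntries;
    }
}
```

The appeal over a custom deque is that `LinkedHashMap` already keeps the hash table and the recency list in sync on every `get` and `put`, removing a whole class of bookkeeping bugs.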

reta commented 1 year ago

> I put together a quick commit on my fork that shows how we can replace the hand-rolled LinkedDeque with a LinkedHashMap.

Have you run the benchmarks (https://github.com/opensearch-project/OpenSearch/pull/6610) against the LinkedHashMap alternative? Just curious what the diff would be.

andrross commented 1 year ago

> Have you run the benchmarks (#6610) against the LinkedHashMap alternative? Just curious what the diff would be.

@reta I put a PR together here with the benchmark results: #6968

andrross commented 1 year ago

Just want to provide some updates and more context here:

The overall functionality provided by this cache feature is that remote index data is pulled down from the object store in chunks (8MiB by default) and written to local disk, and searches are served using the chunks from the local disk. The total size of the disk available for use as the "cache" is fixed and defined by a node setting. When that total size is exhausted, unused chunks are deleted from the disk to make room for new chunks.
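The eviction rule described above can be sketched in a few lines. This is a simplified illustration of the described behavior only, not the actual OpenSearch implementation; the class and field names are invented for the example:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch: chunks are written to disk until the fixed budget is
// exhausted, then unused chunks are deleted to make room. The 8 MiB chunk
// size matches the default mentioned above; everything else is assumed.
class ChunkBudget {
    static final long CHUNK_BYTES = 8L * 1024 * 1024; // 8 MiB default chunk

    final long capacityBytes; // fixed disk budget from the node setting
    long usedBytes;
    // Chunks with no active readers, oldest first; only these may be deleted.
    final Deque<String> unusedChunks = new ArrayDeque<>();

    ChunkBudget(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    // Returns true if the new chunk fits after deleting unused chunks
    // as needed; false if every resident chunk is still in active use.
    boolean admit(String chunkId) {
        while (usedBytes + CHUNK_BYTES > capacityBytes && !unusedChunks.isEmpty()) {
            unusedChunks.removeFirst(); // delete an unused chunk from disk
            usedBytes -= CHUNK_BYTES;
        }
        if (usedBytes + CHUNK_BYTES > capacityBytes) {
            return false;
        }
        usedBytes += CHUNK_BYTES;
        return true;
    }
}
```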

This is implemented (after #6968) by using two Maps:

All operations on the cache are guarded by a simple ReentrantLock that prevents concurrent access. To provide additional concurrency, a segmented approach is taken: Runtime.getRuntime().availableProcessors() independent caches are created, and each is given disk cache size / Runtime.getRuntime().availableProcessors() as its capacity.

In my opinion, the segmented approach adds complexity and is still vulnerable to hot keys. I think the main improvement we can make here is to get rid of the segmented cache and go to a single implementation with finer-grained locking. Best-effort semantics on the disk usage and eviction are likely okay, as over- or under-using the cache by a small percentage is probably acceptable. The current implementation is best-effort anyway as the size is enforced at each segment, and not the overall sum of the segments.

reta commented 1 year ago

> In my opinion, the segmented approach adds complexity and is still vulnerable to hot keys. I think the main improvement we can make here is to get rid of the segmented cache and go to a single implementation with finer-grained locking.

The complexity is certainly there, and hot keys could significantly reduce its effectiveness; +1 to looking for a simpler and more efficient way to (re)implement this.