opensearch-project / custom-codecs

OpenSearch custom Lucene codecs for providing different on-disk index encodings (e.g., compression).
Apache License 2.0

[FEATURE] Hybrid Compression with Zstandard Codecs #149

Open sarthakaggarwal97 opened 1 month ago

sarthakaggarwal97 commented 1 month ago

Is your feature request related to a problem?

Coming here from https://github.com/opensearch-project/OpenSearch/issues/11605#issuecomment-2046521026

We are observing good benefits from enabling hybrid compression during merges with the default codec. That said, I believe we would see more benefit with Zstandard, since it has a much better compression ratio than the default codec, so the trade-offs in disk throughput / IOPS should be reduced.

Since we would not need to introduce a new codec to implement hybrid compression with Zstandard, I propose opt-in settings that allow users to manage the thresholds for hybrid compression.

What solution would you like?

Hybrid compression for Zstandard, where recently added stored fields are left uncompressed, which enables faster indexing and faster lookups of recent documents, and the stored fields are then compressed when segments are merged.
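
To make the idea concrete, here is a minimal sketch of the decision such a feature might make. It is only an illustration under stated assumptions: the class `HybridCompressionPolicy`, the `Mode` enum, the `modeFor` method, and the single size-based threshold are all hypothetical names and choices, not existing OpenSearch or custom-codecs APIs, and the actual thresholds would be whatever the opt-in settings end up exposing.

```java
// Hypothetical sketch: none of these names exist in OpenSearch or custom-codecs.
final class HybridCompressionPolicy {

    /** How the stored fields of a segment would be written. */
    enum Mode { UNCOMPRESSED, ZSTD }

    // Opt-in switch; when false, stored fields are always compressed.
    private final boolean enabled;

    // Assumed threshold: merged segments at or above this size get compressed.
    private final long mergeThresholdBytes;

    HybridCompressionPolicy(boolean enabled, long mergeThresholdBytes) {
        if (mergeThresholdBytes < 0) {
            throw new IllegalArgumentException("mergeThresholdBytes must be >= 0");
        }
        this.enabled = enabled;
        this.mergeThresholdBytes = mergeThresholdBytes;
    }

    /**
     * Picks the stored-fields mode for a segment about to be written.
     *
     * @param isMerge          true if the segment is produced by a merge,
     *                         false if it comes from a fresh flush
     * @param segmentSizeBytes estimated size of the segment being written
     */
    Mode modeFor(boolean isMerge, long segmentSizeBytes) {
        if (!enabled) {
            return Mode.ZSTD; // feature not opted in: compress everything
        }
        if (!isMerge) {
            return Mode.UNCOMPRESSED; // recent documents stay uncompressed
        }
        // On merge, compress only once the segment is large enough.
        return segmentSizeBytes >= mergeThresholdBytes ? Mode.ZSTD : Mode.UNCOMPRESSED;
    }
}
```

With a threshold of, say, 1024 bytes, `modeFor(false, 4096)` would return `UNCOMPRESSED` (fresh flush), while `modeFor(true, 4096)` would return `ZSTD` (merge above the threshold). Keeping the decision in one small policy object is just one possible design; it would let the thresholds be changed through settings without introducing a new codec.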

Do you have any additional context?

Detailed Description: https://github.com/opensearch-project/OpenSearch/issues/13110

dblock commented 1 week ago

Catch All Triage - 1 2 3 4 5 6