opensearch-project / custom-codecs

OpenSearch custom lucene codecs for providing different on-disk index encoding (e.g., compression).
Apache License 2.0

[RFC] Hardware-accelerated Compression #130

Open mulerm opened 8 months ago

mulerm commented 8 months ago

PR https://github.com/opensearch-project/custom-codecs/pull/122 introduced hardware-accelerated compression, leveraging Intel(R) QuickAssist Technology (QAT), for the DEFLATE and LZ4 compression algorithms. The implementation uses the Qat-Java library to interact with the QAT hardware.

The PR also introduced two additional values for index.codec: qat_deflate and qat_lz4. Both codecs are compatible with their corresponding software counterparts, best_compression and default respectively, but do not override them (at least for the time being).

The new setting index.codec.qatmode defines two modes of execution. The hardware mode uses QAT exclusively, while the auto mode may fall back to a software implementation when hardware resources are not available.
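To make the two modes concrete, here is a minimal sketch of the selection logic they imply, together with an index settings body that uses the new codec. The function and class names are hypothetical and not part of custom-codecs; the literal mode strings and the choice to raise an error when hardware mode finds no device are assumptions based on the description above, not the actual implementation.

```python
from enum import Enum


class QatMode(Enum):
    # Assumed string values for index.codec.qatmode, taken from the mode names above.
    HARDWARE = "hardware"
    AUTO = "auto"


def choose_backend(mode: QatMode, hardware_available: bool) -> str:
    """Return which implementation would compress a block: "qat" or "software"."""
    if hardware_available:
        return "qat"
    if mode is QatMode.AUTO:
        # auto mode: fall back to the software implementation.
        return "software"
    # hardware mode: QAT only. Failing here is an assumption of this sketch.
    raise RuntimeError("hardware mode requested but no QAT device is available")


def index_settings(codec: str, mode: QatMode) -> dict:
    """Build a create-index request body selecting a QAT codec and mode."""
    return {"settings": {"index.codec": codec, "index.codec.qatmode": mode.value}}


body = index_settings("qat_deflate", QatMode.AUTO)
```

Under the alternative discussed next, the same fallback decision would sit behind best_compression and default, so users would not need to pick a separate codec name.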

Another approach that could be taken is to override best_compression and default such that, in systems where the hardware is available, hardware acceleration is used.

The purpose of this RFC is to initiate a discussion on the pros and cons of this last approach.

@reta @sarthakaggarwal97 @wbeckler @andrross

akashsha1 commented 8 months ago

@backslasht , @dblock as well

dblock commented 5 months ago

Thanks for opening this RFC. Catch All Triage - 1 2 3 4 5 6

dblock commented 5 months ago

Did #122 implement what's in this RFC? (Can it be closed?)

andrross commented 5 months ago

> Did #122 implement what's in this RFC? (Can it be closed?)

@dblock I don't think this can be closed. Hardware-accelerated compression has been implemented, but as distinct codecs, which means the indexes cannot be used on hardware that doesn't support the acceleration. However, the codecs themselves are technically compatible with the existing best_compression and default codecs. I think there is a valid feature request to change the behavior so that the hardware-accelerated version is used when running on hardware that supports it, and the software implementation is used as a fallback otherwise. The big caveat is that we have to guarantee that the different implementations are in fact 100% interoperable.
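One way to think about the interoperability guarantee is as a cross-product round-trip test: every encoder's output must decode correctly with every decoder. The sketch below only illustrates the shape of such a check. It uses Python's zlib at two compression levels as stand-ins for the "hardware" and "software" DEFLATE encoders, and two zlib decoding paths as stand-ins for the decoders. A real test would need to run the actual QAT and software codecs against each other; all function names here are hypothetical.

```python
import zlib


def encoder_fast(data: bytes) -> bytes:
    # Stand-in for one implementation (e.g. hardware): DEFLATE at level 1.
    return zlib.compress(data, 1)


def encoder_small(data: bytes) -> bytes:
    # Stand-in for the other implementation (e.g. software): DEFLATE at level 9.
    return zlib.compress(data, 9)


def decoder_oneshot(blob: bytes) -> bytes:
    return zlib.decompress(blob)


def decoder_streaming(blob: bytes) -> bytes:
    d = zlib.decompressobj()
    return d.decompress(blob) + d.flush()


def interoperable(encoders, decoders, samples) -> bool:
    """True only if every encoder/decoder pair round-trips every sample."""
    for sample in samples:
        for encode in encoders:
            blob = encode(sample)
            for decode in decoders:
                if decode(blob) != sample:
                    return False
    return True


samples = [b"", b"a" * 1000, bytes(range(256)) * 4]
ok = interoperable(
    [encoder_fast, encoder_small],
    [decoder_oneshot, decoder_streaming],
    samples,
)
```

The compressed bytes from the two encoders may differ; what matters for index compatibility is that either decoder recovers the original data from either output.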