Open yyang52 opened 11 months ago
Thanks for the nice description and motivation for the change @yyang52 !
@zzhao0 Do we have QAT hardware internally? If so, we can try turning it on internally and see how much it improves write efficiency.
@yyang52 How to check from command line if QAT is supported on a host?
Depends on the drivers. For in-tree drivers,
dmesg | grep qat
lspci | grep 494
For out of tree drivers,
/usr/local/bin/adf_ctl status
Description
Currently, Velox implements its own Dwrf writer, which supports various compression codecs. ZSTD is the default compression method and is widely used in real workloads. Although ZSTD already performs well, there is still room for improvement. Intel® QuickAssist Technology (Intel® QAT) can accelerate ZSTD compression by offloading part of the work to hardware.
Intel QAT ZSTD plugin:
Intel® QuickAssist Technology (Intel® QAT) provides cryptographic and compression acceleration capabilities to improve performance and efficiency across the data center. The QAT sequence producer offloads the generation of block-level sequences to QAT hardware for ZSTD compression levels L1-L12.
QAT ZSTD Plugin is a plugin for Zstandard (ZSTD) that accelerates compression with QAT. Starting with version 1.5.4, ZSTD provides a block-level sequence producer API that lets users register a custom sequence producer, which libzstd then invokes to process each block.
Design & Analysis
We wrote writer benchmarks and tests to identify hotspots and assess the optimization potential.
Based on our previous tests, when writing Parquet the top hotspot is ZSTD_compress, which takes around 31% of total CPU time. Writing Dwrf should behave similarly (we are conducting more tests on that), so accelerating ZSTD should provide a performance benefit for end-to-end workloads.