Intel® QuickAssist Technology ZSTD Plugin (QAT ZSTD Plugin) is a plugin for Zstandard (ZSTD) that accelerates compression with Intel® QAT. ZSTD is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. Starting with version 1.5.4, ZSTD provides a block-level sequence producer API that lets users register a custom sequence producer, which libzstd invokes to process each block. The produced list of sequences (literals and matches) is then post-processed by libzstd to produce valid compressed blocks.
Intel® QuickAssist Technology (Intel® QAT) provides cryptographic and compression acceleration capabilities used to improve performance and efficiency across the data center. The QAT sequence producer offloads the production of block-level sequences for compression levels L1-L12 to Intel® QAT to improve performance.
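To make the idea of a list of sequences (literals and matches) more concrete, the following is a minimal sketch of how such a list describes a block of data. It is not code from libzstd or from this plugin: `DemoSequence` is a simplified local stand-in that only mirrors the `offset`, `litLength` and `matchLength` fields of `ZSTD_Sequence`, and `demo_apply_sequences` is a hypothetical helper that rebuilds the original bytes from the literals and the sequences.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for illustration only (not the real ZSTD_Sequence). */
typedef struct {
    unsigned offset;      /* distance back from the current position to the match */
    unsigned litLength;   /* literal bytes to copy before the match */
    unsigned matchLength; /* bytes to copy from earlier output; 0 means literals only */
} DemoSequence;

/* Rebuild the original bytes from a literal buffer plus a list of sequences.
 * Returns the number of bytes written, or 0 if the input is inconsistent
 * or the destination buffer is too small. */
size_t demo_apply_sequences(const DemoSequence *seqs, size_t nbSeqs,
                            const unsigned char *literals, size_t litSize,
                            unsigned char *dst, size_t dstCapacity)
{
    size_t litPos = 0;
    size_t out = 0;
    for (size_t i = 0; i < nbSeqs; i++) {
        const DemoSequence s = seqs[i];

        /* 1. Copy the literals that precede the match. */
        if (s.litLength > litSize - litPos) return 0;
        if (s.litLength > dstCapacity - out) return 0;
        memcpy(dst + out, literals + litPos, s.litLength);
        litPos += s.litLength;
        out += s.litLength;

        /* 2. Copy the match from data that was already produced. */
        if (s.matchLength == 0) continue;
        if (s.offset == 0 || s.offset > out) return 0;
        if (s.matchLength > dstCapacity - out) return 0;
        /* Byte-by-byte copy so that overlapping matches repeat correctly. */
        for (unsigned j = 0; j < s.matchLength; j++, out++)
            dst[out] = dst[out - s.offset];
    }
    return out;
}
```

For example, the literals `abcd!` combined with the two sequences `{4, 4, 4}` and `{0, 1, 0}` describe the nine bytes `abcdabcd!`: four literals, a four-byte match at distance four, then one trailing literal. A sequence producer does the reverse job, finding such matches in the input; in this plugin that search is what gets offloaded to Intel® QAT.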
The Licensing of the files within this project is split as follows:
Intel® QuickAssist Technology ZSTD Plugin - BSD License. Please see the LICENSE file contained in the top level folder. Further details can be found in the file headers of the relevant files.
Intel® 4xxx (Intel® QuickAssist Technology Gen 4)
ZSTD* library of version 1.5.4+
Intel® QAT Driver for Linux* Hardware v2.0 or Intel® QuickAssist Technology Library (QATlib) of version 22.07.0+
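The ZSTD* sequence producer only works with ZSTD* compression APIs that respect advanced parameters, such as ZSTD_compress2 and ZSTD_compressStream2.
Multi-threading within a single compression is not supported: compression fails if ZSTD_c_nbWorkers > 0 and an external sequence producer is registered. Each thread must have its own context (CCtx). For more details about the ZSTD* sequence producer, please refer to zstd.h.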
Users can choose the Intel® QAT Driver for Linux* Hardware v2.0 (out-of-tree) or the Intel® QuickAssist Technology Library (QATlib) (in-tree) according to their requirements.
If using the out-of-tree driver, the user needs to set the ICP_ROOT environment variable:
ICP_ROOT: the root directory of the QAT driver source tree
Download the driver from Intel® QAT Driver for Linux* Hardware v2.0 and follow the guidance in Intel® QuickAssist Technology Software for Linux* - Getting Started Guide.
If installing the Intel® QAT 2.0 driver for use in a virtual environment, please refer to Using Intel® Virtualization Technology (Intel® VT) with Intel® QuickAssist Technology.
After installing the QAT driver, please refer to Intel® QuickAssist Technology Software for Linux* - Programmer's Guide to update the QAT configuration file according to your requirements.
QAT ZSTD Plugin needs a [SHIM] section by default. There are two ways to change:
After updating the configuration files, please restart QAT.
service qat_service restart
QATlib has been upstreamed to some platforms, such as RedHat and SUSE. Users can also install QATlib from source code according to qatlib/INSTALL.
Before building, set the QATlib environment variables to prevent compilation errors.
export LIBRARY_PATH=/usr/local/lib
export LD_LIBRARY_PATH=/usr/local/lib
Shared Virtual Memory (SVM) allows an application buffer to be submitted directly, which avoids the memcpy cycle cost, cache thrashing, and extra memory bandwidth of copying data into a dedicated buffer. The SVM feature enables passing virtual addresses to the QAT hardware for processing acceleration requests.
The QAT sequence producer library runs in the SVM environment by default.
To enable SVM, please refer to Using Intel® Virtualization Technology (Intel® VT) with Intel® QuickAssist Technology to update the BIOS and Intel® QuickAssist Technology Software for Linux* - Programmer's Guide to update driver configuration.
make
If the ZSTD 1.5.4 library is not installed, specify the path to the ZSTD lib source root with the compile variable "ZSTDLIB".
make ZSTDLIB=[PATH TO ZSTD LIB SOURCE]
If SVM is not enabled, memory passed to Intel® QuickAssist Technology hardware must be DMA enabled.
Intel provides a User Space DMA-able Memory (USDM) component (a kernel driver and a corresponding user space library) which allocates and frees DMA-able memory mapped to user space, and performs virtual to physical address translation on memory allocated by this library. Please refer to Intel® QuickAssist Technology Software for Linux* - Programmer's Guide.
QAT ZSTD Plugin will automatically switch to USDM mode when SVM is not enabled.
make test
./test/test [TEST FILENAME]
The benchmark is a tool used to perform QAT sequence producer performance tests. It supports the following options:
-t# Set maximum threads [1 - 128] (default: 1)
-l# Set iteration loops [1 - 1000000] (default: 1)
-c# Set chunk size (default: 32K)
-E# Auto/enable/disable searchForExternalRepcodes (0: auto; 1: enable; 2: disable; default: auto)
-L# Set compression level [1 - 12] (default: 1)
-m# Benchmark mode, 0: software compression; 1: QAT compression (default: 1)
To get better performance, increase the number of threads with -t. The number of dc instances provided by Intel® QAT must be increased along with the number of test threads; this is done by modifying NumberDcInstances in /etc/4xxx_devx.conf. Note that the number of test threads should not exceed the number of dc instances, so that each test thread can obtain a dc instance.
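The relationship between test threads and dc instances can be sketched as a small check. This is an illustration only, not code from the benchmark: `demo_usable_threads` is a hypothetical helper, and the bounds 1 and 128 are taken from the documented range of the -t option.

```c
/* Hypothetical helper: pick a thread count that respects both the documented
 * -t range [1 - 128] and the number of configured dc instances, so that every
 * thread can obtain its own dc instance. Returns 0 if no instance is available. */
unsigned demo_usable_threads(unsigned requested, unsigned dcInstances)
{
    if (requested < 1) requested = 1;     /* documented minimum for -t */
    if (requested > 128) requested = 128; /* documented maximum for -t */
    return requested < dcInstances ? requested : dcInstances;
}
```

With NumberDcInstances = 64, a request for 64 threads is kept as is, while a request for 128 threads would be reduced to 64 under this rule.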
For more Intel® QAT configuration information, please refer to Intel® QuickAssist Technology Software for Linux* - Programmer's Guide.
An example usage of the benchmark tool with the Silesia compression corpus, a standard corpus for lossless data compression:
./benchmark -m1 -l100 -c64K -t64 -E2 Silesia
This example used the following Intel® QAT configuration file:
# QAT configuration file /etc/4xxx_devx.conf
##############################################
# User Process Instance Section
##############################################
[SHIM]
NumberCyInstances = 0
NumberDcInstances = 64
NumProcesses = 1
LimitDevAccess = 0
# Data Compression - User instance #0
Dc0Name = "Dc0"
Dc0IsPolled = 1
# List of core affinities
Dc0CoreAffinity = 0
# Data Compression - User instance #1
Dc1Name = "Dc1"
Dc1IsPolled = 1
# List of core affinities
Dc1CoreAffinity = 1
...
# Data Compression - User instance #63
Dc63Name = "Dc63"
Dc63IsPolled = 1
# List of core affinities
Dc63CoreAffinity = 63
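Writing 64 nearly identical instance entries by hand is error-prone, so they can be generated instead. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the entry keys are zero-based (Dc0 for user instance #0 up to Dc63 for user instance #63), every instance is polled, and each instance is pinned to the core with the same number, as in the example configuration. Adjust the core affinity to the actual machine topology.

```c
#include <stdio.h>

/* Hypothetical helper: format the entries for one polled dc instance into buf.
 * Returns the value of snprintf (the length needed, excluding the final NUL). */
int demo_format_dc_instance(char *buf, size_t cap, unsigned idx)
{
    return snprintf(buf, cap,
                    "# Data Compression - User instance #%u\n"
                    "Dc%uName = \"Dc%u\"\n"
                    "Dc%uIsPolled = 1\n"
                    "# List of core affinities\n"
                    "Dc%uCoreAffinity = %u\n",
                    idx, idx, idx, idx, idx, idx);
}

/* Hypothetical helper: write the entries for instances 0 .. count-1 to a stream. */
void demo_print_dc_instances(FILE *out, unsigned count)
{
    char buf[256];
    for (unsigned i = 0; i < count; i++) {
        int n = demo_format_dc_instance(buf, sizeof buf, i);
        if (n > 0 && (size_t)n < sizeof buf)
            fputs(buf, out);
    }
}
```

Calling `demo_print_dc_instances(stdout, 64)` prints the per-instance entries, which can then be pasted below the [SHIM] header lines. NumberDcInstances must be set to the same count.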
zstd
Integrating the QAT sequence producer into the zstd command can speed up its compression. The following sample code shows how to enable the QAT sequence producer by modifying FIO_compressZstdFrame in zstd/programs/fileio.c. This requires including qatseqprod.h in fileio.c and adding -lqatseqprod to the Makefile.
Start the QAT device and register qatSequenceProducer before starting the compression job.
/* Start the QAT device; this can be done at any time
   before the compression job starts */
QZSTD_startQatDevice();
/* Create the sequence producer state for the QAT sequence producer */
void *sequenceProducerState = QZSTD_createSeqProdState();
/* Register qatSequenceProducer */
ZSTD_registerSequenceProducer(
    ress.cctx,
    sequenceProducerState,
    qatSequenceProducer
);
/* Enable fallback to the built-in zstd sequence producer
   if the QAT sequence producer fails */
ZSTD_CCtx_setParameter(ress.cctx, ZSTD_c_enableSeqProducerFallback, 1);
Stop the QAT device after the compression job.
/* Free the sequence producer state */
QZSTD_freeSeqProdState(sequenceProducerState);
/* Call QZSTD_stopQatDevice once QAT is no longer
   needed, or before the process exits */
QZSTD_stopQatDevice();
Then recompile zstd with the flag -lqatseqprod. Currently, the QAT sequence producer supports only single-threaded compression, so please run zstd with --single-thread.
Note: some zstd parameters do not support the sequence producer. For more zstd usage information, please refer to the zstd manual.
./zstd --single-thread [TEST FILENAME]
Initialization
Start and initialize the QAT device.
Create the sequence producer state for the QAT sequence producer, then call ZSTD_registerSequenceProducer to register it in the application source code.
ZSTD_CCtx* const zc = ZSTD_createCCtx();
/* Start the QAT device; this can be done at any time
   before the compression job starts */
QZSTD_startQatDevice();
/* Create the sequence producer state for the QAT sequence producer */
void *sequenceProducerState = QZSTD_createSeqProdState();
/* Register qatSequenceProducer */
ZSTD_registerSequenceProducer(
    zc,
    sequenceProducerState,
    qatSequenceProducer
);
/* Enable fallback to the built-in zstd sequence producer
   if the QAT sequence producer fails */
ZSTD_CCtx_setParameter(zc, ZSTD_c_enableSeqProducerFallback, 1);
Compression API
The application's calls to the ZSTD* compression API do not need to change. Keep calling ZSTD_compress2, ZSTD_compressStream2, or ZSTD_compressStream to compress.
/* Compress */
ZSTD_compress2(zc, dstBuffer, dstBufferSize, srcBuffer, srcBufferSize);
Free resources and shut down the QAT device
/* Free the sequence producer state */
QZSTD_freeSeqProdState(sequenceProducerState);
/* Call QZSTD_stopQatDevice once QAT is no longer
   needed, or before the process exits */
QZSTD_stopQatDevice();
Then link against libzstd and libqatseqprod as the test program does. See the demo in the test/test.c file.
Intel® disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel® representative to obtain the latest forecast, schedule, specifications and roadmaps.
The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.
Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.