Open Wang-Robot opened 1 month ago
Thanks for raising the issue, @Wang-Robot. I am looking into this and will get back to you soon.
Thank you, looking forward to your latest progress @venkatesh6911
There was a discussion on this, and the feedback was that for 8K and 16K packet sizes the code path in the QAT firmware is generic rather than optimized. That is why we see the underperformance at those sizes. This has nothing to do with QAT Engine, though.
Thanks. Our understanding was that the larger the packet length, the better the performance. Is there any special optimization for the 4K packet length? Also, what does "QAT firmware" refer to here? Is it qatlib?
When using QAT acceleration, why does the aes-128-cbc-hmac-sha256 algorithm perform better with a packet length of 4K than with 8K or even 16K?
2. Run results (runs bound to 4 cores and to 8 cores respectively; you can see that the 4K throughput is higher than the 8K throughput). The default data lengths are 16:64:256:1024:8192:16384; the 4096 case was run separately.
[root@emer QAT_Engine-1.6.0]# taskset -c 1-4 openssl speed -elapsed -evp aes-128-cbc-hmac-sha256 -async_jobs 48 -multi 4 --engine qatengine -bytes 4096 ... ... ... ... Got: +H:4096 from 0 AES-128-CBC-HMAC-SHA256 4850803.74k
[root@emer QAT_Engine-1.6.0]# taskset -c 1-8 openssl speed -elapsed -evp aes-128-cbc-hmac-sha256 -async_jobs 48 -multi 8 --engine qatengine ... ... ... ... Got: +H:16:64:256:1024:8192:16384 from 7 AES-128-CBC-HMAC-SHA256 43087.70k 177992.30k 736275.11k 2915154.60k 4146301.61k 5514980.01k
[root@emer QAT_Engine-1.6.0]# taskset -c 1-8 openssl speed -elapsed -evp aes-128-cbc-hmac-sha256 -async_jobs 48 -multi 8 --engine qatengine --bytes 4096 ... ... ... ... Got: +H:4096 from 0 AES-128-CBC-HMAC-SHA256 6092023.13k
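To make the comparison easier to read, here is a minimal sketch that converts the 8-process figures above into GB/s and operations per second. It assumes the `k` suffix means thousands of bytes per second, as `openssl speed` reports, and that each operation processes exactly one block of the given size. The names `results_k` and `ops_per_second` are made up for this illustration.

```python
# Throughput figures copied from the two 8-process runs above.
# Assumption: the "k" suffix means 1000 bytes per second.
results_k = {
    16: 43087.70,
    64: 177992.30,
    256: 736275.11,
    1024: 2915154.60,
    4096: 6092023.13,   # from the separate 4096-byte run
    8192: 4146301.61,
    16384: 5514980.01,
}


def ops_per_second(block_size, throughput_k):
    """Operations per second, assuming one block of block_size bytes per operation."""
    return throughput_k * 1000 / block_size


for size, tput_k in sorted(results_k.items()):
    gb_per_s = tput_k * 1000 / 1e9
    print(f"{size:>6} B  {gb_per_s:6.2f} GB/s  {ops_per_second(size, tput_k):12.0f} ops/s")
```

Under these assumptions, 4096 bytes gives the highest byte throughput of the measured sizes, and 8192 falls below both 4096 and 16384, which is the dip being asked about.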