Xeon server with 2 sockets, INTEL(R) XEON(R) GOLD 6554S
QAT physical devices: 2 sockets * 4 devices per socket (8 in total)
The system has 128 QAT VFs, each with 2 CyInstances, and LimitDevAccess is set to 0. In total, 512 instances should be generated, either sym or asym. A segmentation fault occurs when validating with openssl engine -c -t -v qatengine.
After debugging, the root cause appears to be that QAT_MAX_CRYPTO_INSTANCES is predefined as 256. In our case it needs to be 512 to accommodate all instances.
The following logging was added to qat_hw_init.c, and the output and gdb session below were captured with it:
    for (instNum = 0; instNum < qat_num_instances; instNum++) {
        /* Retrieve CpaInstanceInfo2 structure for that instance */
        /* Debug logging: %lu matches the unsigned long casts, %zu matches sizeof */
        printf("addr ptr of pInstanceInfo2: %lu \n", (unsigned long)&qat_instance_details[instNum].qat_instance_info);
        printf("sizeof CpaInstanceInfo2: %zu \n", sizeof(CpaInstanceInfo2));
        printf("addr ptr of qat_instance_handles: %lu\n", (unsigned long)&qat_instance_handles);
        status = cpaCyInstanceGetInfo2(qat_instance_handles[instNum],
                                       &qat_instance_details[instNum].qat_instance_info);
before cpaCySetAddressTranslation; instNum: 255; qat_instance_handle: 0x5555562be9d0
cpaCySetAddressTranslation() - : Called with params (0x5555562be9d0, 0x7ffff60112a0)
cpaCyStartInstance() - : Called with params (0x5555562be9d0)
cpaCyInstanceGetInfo2() - : Called with params (0x5555562be9d0, 0x7fffffffd5e0)
addr ptr of pInstanceInfo2: 140737323112320
sizeof CpaInstanceInfo2: 932
addr ptr of qat_instance_handles: 140737323112376
cpaCyInstanceGetInfo2() - : Called with params (0x5555562d1d10, 0x7ffff6269780)
Hardware watchpoint 1: qat_instance_handles
Old value = (CpaInstanceHandle *) 0x555556d60150
New value = (CpaInstanceHandle *) 0x0
0x00007ffff64d0163 in __memset_avx2_unaligned_erms () from /lib64/libc.so.6
(gdb) where
#0 0x00007ffff64d0163 in __memset_avx2_unaligned_erms () from /lib64/libc.so.6
#1 0x00007ffff5c8c43a in osalMemSet (ptr=0x7ffff6269780 <qat_instance_mutex>, filler=0 '\000', count=932) at /root/QAT20/quickassist/utilities/osal/src/linux/user_space/OsalServices.c:285
#2 0x00007ffff5c78a59 in cpaCyInstanceGetInfo2 (instanceHandle_in=0x5555562d1d10, pInstanceInfo2=0x7ffff6269780 <qat_instance_mutex>) at /root/QAT20/quickassist/lookaside/access_layer/src/common/ctrl/sal_crypto.c:3064
#3 0x00007ffff6011d50 in qat_hw_init (e=e@entry=0x55555582f2b0) at qat_hw_init.c:642
#4 0x00007ffff600eff0 in qat_engine_init (e=0x55555582f2b0) at e_qat.c:607
#5 0x00007ffff75564fd in engine_unlocked_init () from /lib64/libcrypto.so.1.1
#6 0x00007ffff7556658 in ENGINE_init () from /lib64/libcrypto.so.1.1
#7 0x000055555559dd29 in engine_main ()
#8 0x00005555555a3244 in do_cmd ()
#9 0x000055555558bf59 in main ()
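When instNum reaches 256, the address passed as pInstanceInfo2 is 140737323112320 (0x7ffff6269780), which gdb resolves to qat_instance_mutex rather than to an entry of qat_instance_details. This is consistent with qat_instance_details holding only QAT_MAX_CRYPTO_INSTANCES (256) entries, so index 256 is one past the end. cpaCyInstanceGetInfo2 then memsets 932 bytes starting at that address. qat_instance_handles is located at 140737323112376, only 56 bytes further on, so it falls inside the zeroed range and the pointer is overwritten with NULL, as the watchpoint shows (0x555556d60150 to 0x0).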
After changing QAT_MAX_CRYPTO_INSTANCES to 512, the error no longer occurs.