Open tokers opened 6 years ago
Hi Alex,
I'm not familiar with LuaJIT, but my understanding from a quick bit of reading is that the nginx module for OpenResty uses LuaJIT co-routines to run asynchronously and avoid blocking. Internally, LuaJIT uses longjmp within the co-routine implementation to yield. The trouble is that OpenSSL, when running asynchronously using the QAT Engine, also uses its own co-routines for each connection. These co-routines (referred to as fibres) allocate their own stack (from the heap), switch to that stack when running, and also use longjmp to switch out of the co-routine back to the standard stack. Perhaps there is a bad interaction between the two co-routine implementations, for example something related to both using longjmp, with stack state getting confused or corrupted. That is purely a guess. Unfortunately, I'm not aware of anyone here running QuickAssist with nginx and also using LuaJIT, so I may be of limited help. I'll let you know if I find out anything further.
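For readers unfamiliar with the fibre pattern described above, here is a minimal sketch of a co-routine that runs on its own heap-allocated stack and yields back to the standard stack. It is not OpenSSL's or the QAT Engine's code. It assumes a POSIX system with `<ucontext.h>` and uses `swapcontext` for every switch rather than longjmp, to keep the example short. The names `run_fibre_demo`, `fibre_body` and `FIBRE_STACK_SIZE` are hypothetical.

```c
#include <stdlib.h>
#include <ucontext.h>

/* Hypothetical stack size, chosen only for this demo. */
#define FIBRE_STACK_SIZE (64 * 1024)

static ucontext_t main_ctx;      /* saved state of the standard stack */
static ucontext_t fibre_ctx;     /* saved state of the fibre */
static volatile int steps_done;  /* progress marker set by the fibre */

/* Fibre body: do some work, yield, then finish when resumed. */
static void fibre_body(void)
{
    steps_done = 1;
    /* Yield: save the fibre state and switch back to the standard stack. */
    swapcontext(&fibre_ctx, &main_ctx);
    steps_done = 2;
    /* Returning here resumes the context named in uc_link (main_ctx). */
}

/* Runs one fibre through a yield and a resume.
 * Returns 0 if both switches behaved as expected, -1 otherwise. */
int run_fibre_demo(void)
{
    int after_yield, after_finish;
    void *stack = malloc(FIBRE_STACK_SIZE);  /* fibre stack lives on the heap */

    if (stack == NULL)
        return -1;

    steps_done = 0;
    if (getcontext(&fibre_ctx) == -1) {
        free(stack);
        return -1;
    }
    fibre_ctx.uc_stack.ss_sp = stack;
    fibre_ctx.uc_stack.ss_size = FIBRE_STACK_SIZE;
    fibre_ctx.uc_link = &main_ctx;           /* where to go when the fibre ends */
    makecontext(&fibre_ctx, fibre_body, 0);

    /* Start the fibre; control returns here when it yields. */
    if (swapcontext(&main_ctx, &fibre_ctx) == -1) {
        free(stack);
        return -1;
    }
    after_yield = steps_done;

    /* Resume the fibre; control returns here when it finishes. */
    if (swapcontext(&main_ctx, &fibre_ctx) == -1) {
        free(stack);
        return -1;
    }
    after_finish = steps_done;

    free(stack);
    return (after_yield == 1 && after_finish == 2) ? 0 : -1;
}
```

Calling `run_fibre_demo()` from a `main` function exercises one yield and one resume. The point of the sketch is only that each fibre has a private stack and saved register state; a second co-routine mechanism switching stacks in the same thread has to leave that state intact.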
Steve.
@stevelinsell Thanks for the response. I don't think the co-routines are the cause, since the problem disappeared when I disabled LuaJIT's Just-In-Time mode. I will raise this issue with the OpenResty community for help. Anyway, thanks again!
@stevelinsell I want to try a different QAT driver version, and I found two qat1.7 drivers:
Are there any other qat1.7 drivers?
Hi Alex,
I believe that so far only three versions of the QAT1.7 driver have been made available on 01.org. Starting with the oldest:
Only L.1.0.3-00042 and L.4.2.0-00022 remain available for download.
Kind Regards,
Steve.
Hello!
qat driver configure:
qat engine configure:
lsmod (irrelevant entries removed):
In addition, we disabled the boot option intel_iommu.
We are using the QAT dh895xcc series card for our own OpenResty/Nginx service. When we send traffic to Nginx with the QAT service enabled, it crashes frequently. Some of the backtraces look like:
The first backtrace shows that we are attempting to free a pointer whose address is invalid (0x1). The second backtrace, according to my own analysis, shows a crash while LuaJIT is restoring the stack snapshot (falling back to the interpreter). I have also emailed the LuaJIT community about this issue. By the way, when I disable the JIT compiler, this type of segmentation fault disappears.
After I disable the QAT service, our service works well. I don't know whether the QAT service causes some memory corruption.
Does anyone have any ideas about this issue?
Regards Alex Zhang