jmbnyc closed this issue 1 month ago
sljit stack traces will normally contain those ?? frames if the crash is in the generated code.
If you have a core dump of the crash, then a backtrace and a disassembly around the crash point (x/16i $pc-32), together with the expression that crashed it (especially if it is reproducible), will help.
We are using version 42.
I assume you mean 10.42. If your application is threaded, you should probably upgrade to 10.44. Also see #435.
gdb usually does not support backtraces for jit code. If the issue can be reproduced easily, then it is better to put a breakpoint before the jit code is executed: https://github.com/PCRE2Project/pcre2/blob/master/src/pcre2_jit_match.c#L91
You can use the ignore or condition gdb commands to stop at the right time, then get a backtrace. If you don't know how many times the breakpoint needs to be ignored, you can set a huge number, such as 1000000, and use info breakpoints to get the number of ignores before the issue. That number minus 1 is a good value for the next ignore.
Does this mean anything to anyone who has been kind enough to respond?
(gdb) x/16i $pc-32
0x2e5c8c5 <pcre2_jit_match_8+496>: and $0x48,%al
0x2e5c8c7 <pcre2_jit_match_8+498>: mov -0x8(%rbp),%eax
0x2e5c8ca <pcre2_jit_match_8+501>: mov 0x18(%rax),%rax
0x2e5c8ce <pcre2_jit_match_8+505>: mov %rax,-0xa0(%rbp)
0x2e5c8d5 <pcre2_jit_match_8+512>: mov -0x38(%rbp),%rdx
0x2e5c8d9 <pcre2_jit_match_8+516>: lea -0xa0(%rbp),%rax
0x2e5c8e0 <pcre2_jit_match_8+523>: mov %rax,%rdi
0x2e5c8e3 <pcre2_jit_match_8+526>: callq *%rdx
=> 0x2e5c8e5 <pcre2_jit_match_8+528>: mov %eax,-0x10(%rbp)
0x2e5c8e8 <pcre2_jit_match_8+531>: jmp 0x2e5c903 <pcre2_jit_match_8+558>
0x2e5c8ea <pcre2_jit_match_8+533>: mov -0x38(%rbp),%rdx
0x2e5c8ee <pcre2_jit_match_8+537>: lea -0xa0(%rbp),%rax
0x2e5c8f5 <pcre2_jit_match_8+544>: mov %rdx,%rsi
0x2e5c8f8 <pcre2_jit_match_8+547>: mov %rax,%rdi
0x2e5c8fb <pcre2_jit_match_8+550>: callq 0x2e5c652
The crash is not in jit code; it is in pcre2_jit_match_8. The callq *%rdx is an indirect call whose target is loaded by mov -0x38(%rbp),%rdx. You should check whether rbp contains a valid stack location. Maybe the call does not restore it properly. It would be good to know what is called there.
Probably this is the location: https://github.com/PCRE2Project/pcre2/blob/master/src/pcre2_jit_match.c#L171
I need to correct myself. If you use pcre2_match and pcre2_jit_stack_assign, then you need a separate match context. If you use pcre2_jit_match(), you don't need it.
zherczeg, thanks for your response. I reached the same conclusion and determined that the cause was concurrent calls to pcre2_jit_stack_assign with the same match context and different jit stack memory. As I mentioned in another post, once I read the code, it was obvious that the match context must be thread local (in my code). Net/net, my thread local for matching now contains match data, match context, and jit stack. Each regex pattern is matched using a thread-local object, so the jit stack assign can be done during thread-local init.
I appreciate the help here, as it allowed me to debug and figure this out. As I mentioned, the docs did not make it completely clear that match data, match context, and jit stack all need to be thread local to allow concurrent matching against a pattern (represented by a pcre2_code object). I probably should have read the code first, because it makes very clear what is required to get thread-safe concurrent matching.
My team and I are still working to confirm, but we are seeing a crash inside pcre2_jit_match. Unfortunately, gdb is not very helpful because we get a huge set of stacks with ??. We might be able to do better if we pull the pcre code into our main code base instead of loading it as a library. However, we are wondering how we can debug this. Do you have any suggestions on how we can narrow down the issue we might be encountering? We are using version 42.