php / php-src

The PHP Interpreter
https://www.php.net
Other
37.93k stars 7.72k forks source link

Reproducible crash in PHP 8.4 with JIT and Excimer #15502

Closed tstarling closed 2 weeks ago

tstarling commented 3 weeks ago

Description

php8.4 maintenance/run.php install --dbname=mw_installtest --dbserver=localhost --dbuser=user --dbpass=pass --pass=$(md5sum README.md) installtest Admin

Create the file phptest.php:

<?php
$excimer = new ExcimerProfiler;
$excimer->setPeriod( 0.001 );
$excimer->setEventType( EXCIMER_REAL );
$excimer->start();

require 'index.php';

Run the script repeatedly with the tracing JIT:

SCRIPT_NAME=index.php REMOTE_ADDR=1.2.3.4 /usr/lib/cgi-bin/php8.4 -T1000 -dextension=excimer.so -dopcache.jit=on phptest.php > /dev/null

It crashes with:

zend_mm_heap corrupted
Aborted (core dumped)

Building with AddressSanitizer, it always crashes in AutoLoader::find() due to an incorrect reference count on $prefix leading to it being freed while still in use.

In gdb, use set print thread-events off since Excimer is creating 1000 threads per second.

This is not an excimer bug. I'm pretty sure. Excimer is just a way to randomly set EG(vm_interrupt), which exercises some otherwise unusual exits in the JITed code of this function. It still crashes with the whole interrupt handler disabled:

diff --git a/excimer_timer.c b/excimer_timer.c
index 83bf42a..36e4289 100644
--- a/excimer_timer.c
+++ b/excimer_timer.c
@@ -239,6 +239,7 @@ static void excimer_timer_handle(union sigval sv)

 static void excimer_timer_interrupt(zend_execute_data *execute_data)
 {
+       return;
        zend_long id;
        zval *zp_count;
        HashTable *event_counts;

I mean, it is Excimer-related, and I as an Excimer maintainer am trying to fix it. But I'm getting deep into the JIT internals and that's probably where the bug is.

The bug is new in 8.4, I could not reproduce it in 8.3.

PHP Version

8.4.0alpha4

Operating System

Ubuntu 22.04

cmb69 commented 3 weeks ago

The bug is new in 8.4, I could not reproduce it in 8.3.

Might be caused by the new IR JIT implementation.

tstarling commented 3 weeks ago

In slightly different conditions to the reproduction procedure above, the error always occurs with return address 0x48000d29, which is the zif_strrpos() call in TRACE-1$AutoLoader::find$181. Always that first trace out of hundreds.

The string's refcount is not off by one, it's garbage, like a billion, and yet the string has the right value in it.

AddressSanitizer shows a use after free, but the backtrace for the allocation is distant and random. Often it fails with an assertion while trying to print the backtrace.

iluuu1994 commented 3 weeks ago

/cc @dstogov

tstarling commented 3 weeks ago

I think I've got it pretty well isolated.

I reduced the test code by making a tight loop of Autoloader::find() calls instead of running the whole of MediaWiki.

The bug does not seem to happen when opcache.jit_max_side_traces=0.

There are three relevant traces:

Code with line numbers for reference:

181:            while ( ( $pos = strrpos( $prefix, '\\' ) ) !== false ) {
182:                // Check to see if this namespace prefix is in the map
183:                $prefix = substr( $className, 0, $pos + 1 );
184:                if ( isset( self::$psr4Namespaces[$prefix] ) ) {
185:                    $relativeClass = substr( $className, $pos + 1 );
186:                    // Build the expected filename, and see if it exists
187:                    $file = self::$psr4Namespaces[$prefix] .
188:                        '/' .
189:                        strtr( $relativeClass, '\\', '/' ) .
190:                        '.php';
191:                    if ( is_file( $file ) ) {
192:                        $filename = $file;
193:                        break;
194:                    }
195:                }
196:
197:                // Remove trailing separator for next iteration
198:                $prefix = rtrim( $prefix, '\\' );
199:            }

The bug occurs when EG(vm_interrupt) is set leading to TRACE-125 exit 3. A debug message is not emitted in that case, but I fixed that so that we see:

     TRACE 125 exit 3 AutoLoader::find() /srv/mw/core/includes/AutoLoader.php:198

The interrupt handler stub is then called, but the opline is wrong. It returns to the VM, but execution resumes at the ZEND_ASSIGN op on line 198, not at the rtrim() call. Both operands of the ZEND_ASSIGN at this point refer to the same string ($prefix). The ZEND_ASSIGN handler decrements the reference count of $prefix, causing it to be freed. Then the loop continues, re-entering TRACE-1 and crashing in strrpos().

tstarling commented 3 weeks ago

I think the zend_jit_check_timeout() call in zend_jit_trace_link_to_root() is at fault. Commenting it out seems to fix the issue. It passes NULL for opline but presumably should pass whatever the right answer is.

dstogov commented 2 weeks ago

@tstarling Can you please check, if the following patch fixes the problem.

diff --git a/ext/opcache/jit/zend_jit_ir.c b/ext/opcache/jit/zend_jit_ir.c
index bb0caf477e5..b31c66016dd 100644
--- a/ext/opcache/jit/zend_jit_ir.c
+++ b/ext/opcache/jit/zend_jit_ir.c
@@ -16729,7 +16729,7 @@ static int zend_jit_trace_link_to_root(zend_jit_ctx *jit, zend_jit_trace_info *t
    link_addr = (const void*)((const char*)t->code_start + zend_jit_trace_prologue_size);

    if (timeout_exit_addr) {
-       zend_jit_check_timeout(jit, NULL, timeout_exit_addr);
+       zend_jit_check_timeout(jit, t->opline, timeout_exit_addr);
    }
    ir_IJMP(ir_CONST_ADDR(link_addr));
dstogov commented 2 weeks ago

No, my patch is wrong. The opline passed to zend_jit_check_timeout() doesn't matter when exit_addr is passed.

The target opline should be set by deoptimization process. The corresponding deoptimization info is created here https://github.com/php/php-src/blob/master/ext/opcache/jit/zend_jit_trace.c#L7162 and these lines exactly the same in PHP_8_3.

dstogov commented 2 weeks ago
TRACE 125 exit 3 AutoLoader::find() /srv/mw/core/includes/AutoLoader.php:198

The interrupt handler stub is then called, but the opline is wrong. It returns to the VM, but execution resumes at the ZEND_ASSIGN op on line 198, not at the rtrim() call. Both operands of the ZEND_ASSIGN at this point refer to the same string ($prefix). The ZEND_ASSIGN handler decrements the reference count of $prefix, causing it to be freed. Then the loop continues, re-entering TRACE-1 and crashing in strrpos().

Can you check the deoptimization info for TRACE-125 exit 3 is correct? (it should be printed after TRACE-125 disassemble with opcache.jit_debug=0x1ff001) e.g exit_3: XXXX/... where XXX is the target opline number. It should be the same as the first opline of the root trace.

dstogov commented 2 weeks ago

@bwoebi @morrisonlevi may be you have any ideas about the reason of failure.

tstarling commented 2 weeks ago

Can you check the deoptimization info for TRACE-125 exit 3 is correct?

It's hard to say what is correct. The exit points to the assignment op, but maybe the assignment was already done by the JIT.

Today I'm actually following the reproduction procedure instead of using my main MW test install. So some things have moved around a bit. The source line numbers have changed by 2 compared to what I previously pasted. You can see the source file at https://github.com/wikimedia/mediawiki/blob/1.42.1/includes/AutoLoader.php . The trace numbers and exit numbers have changed.

I'm using this PHP: https://github.com/tstarling/php-src/tree/autoload-debug . It is git master from last week, with a few patches, most notably a patch to show a JIT exit message even when vm_interrupt is set.

Today the crash usually happens at trace 26 exit 9.

The op array dump:

AutoLoader::find:
     ; (lines=67, args=1, vars=8, tmps=4)
     ; (after optimizer)
     ; /var/tmp/mediawiki-1.42.1/includes/AutoLoader.php:165-212
0000 CV0($className) = RECV 1
0001 BIND_GLOBAL CV1($wgAutoloadLocalClasses) string("wgAutoloadLocalClasses")
0002 BIND_GLOBAL CV2($wgAutoloadClasses) string("wgAutoloadClasses")
0003 T9 = FETCH_DIM_IS CV1($wgAutoloadLocalClasses) CV0($className)
0004 T8 = COALESCE T9 0013
0005 T9 = FETCH_STATIC_PROP_IS string("classFiles") (self) (exception)
0006 T10 = FETCH_DIM_IS T9 CV0($className)
0007 T9 = COALESCE T10 0012
0008 T11 = FETCH_DIM_IS CV2($wgAutoloadClasses) CV0($className)
0009 T10 = COALESCE T11 0011
0010 T10 = QM_ASSIGN bool(false)
0011 T9 = QM_ASSIGN T10
0012 T8 = QM_ASSIGN T9
0013 CV3($filename) = QM_ASSIGN T8
0014 JMPNZ CV3($filename) 0057
0015 T9 = FRAMELESS_ICALL_2(strpos) CV0($className) string("\\")
0016 T8 = TYPE_CHECK TYPE [null, true, long, double, string, array, object, resource] T9
0017 JMPZ T8 0055
0018 CV4($prefix) = QM_ASSIGN CV0($className)
0019 JMP 0049
0020 T9 = ADD CV5($pos) int(1)
0021 T8 = FRAMELESS_ICALL_3(substr) CV0($className) int(0)
0022 OP_DATA T9
0023 ASSIGN CV4($prefix) T8
0024 T9 = FETCH_STATIC_PROP_IS string("psr4Namespaces") (self) (exception)
0025 T8 = ISSET_ISEMPTY_DIM_OBJ (isset) T9 CV4($prefix)
0026 JMPZ T8 0044
0027 T9 = ADD CV5($pos) int(1)
0028 T8 = FRAMELESS_ICALL_2(substr) CV0($className) T9
0029 ASSIGN CV6($relativeClass) T8
0030 T8 = FETCH_STATIC_PROP_R string("psr4Namespaces") (self) (exception)
0031 T9 = FETCH_DIM_R T8 CV4($prefix)
0032 T8 = CONCAT T9 string("/")
0033 T10 = FRAMELESS_ICALL_3(strtr) CV6($relativeClass) string("\\")
0034 OP_DATA string("/")
0035 T9 = FAST_CONCAT T8 T10
0036 T8 = FAST_CONCAT T9 string(".php")
0037 ASSIGN CV7($file) T8
0038 INIT_FCALL 1 96 string("is_file")
0039 SEND_VAR CV7($file) 1
0040 V8 = DO_ICALL
0041 JMPZ V8 0044
0042 ASSIGN CV3($filename) CV7($file)
0043 JMP 0055
0044 INIT_FCALL 2 112 string("rtrim")
0045 SEND_VAR CV4($prefix) 1
0046 SEND_VAL string("\\") 2
0047 V8 = DO_ICALL
0048 ASSIGN CV4($prefix) V8
0049 INIT_FCALL 2 112 string("strrpos")
0050 SEND_VAR CV4($prefix) 1
0051 SEND_VAL string("\\") 2
0052 CV5($pos) = DO_ICALL
0053 T8 = TYPE_CHECK TYPE [null, true, long, double, string, array, object, resource] CV5($pos)
0054 JMPNZ T8 0020
0055 JMPNZ CV3($filename) 0057
0056 RETURN null
0057 T9 = FETCH_DIM_R CV3($filename) int(0)
0058 T8 = IS_NOT_IDENTICAL T9 string("/")
0059 JMPZ T8 0065
0060 T9 = FETCH_DIM_R CV3($filename) int(1)
0061 T8 = IS_NOT_IDENTICAL T9 string(":")
0062 JMPZ T8 0065
0063 T8 = CONCAT string("/var/tmp/mediawiki-1.42.1/includes/../") CV3($filename)
0064 ASSIGN CV3($filename) T8
0065 VERIFY_RETURN_TYPE CV3($filename)
0066 RETURN CV3($filename)
LIVE RANGES:
     8: 0022 - 0023 (tmp/var)
     8: 0033 - 0035 (tmp/var)
     10: 0034 - 0035 (tmp/var)

The side trace:

---- TRACE 26 start (side trace 12/6) AutoLoader::find() /var/tmp/mediawiki-1.42.1/includes/AutoLoader.php:183
0027 T9 = ADD CV5($pos) int(1) ; op1(int)
0028 T8 = FRAMELESS_ICALL_2(substr) CV0($className) T9 ; op1(string) op2(int)
0029 ASSIGN CV6($relativeClass) T8 ; op1(undef) op2(string)
0030 T8 = FETCH_STATIC_PROP_R string("psr4Namespaces") (self) (exception)
0031 T9 = FETCH_DIM_R T8 CV4($prefix) ; op1(array) op2(string) val(string)
0032 T8 = CONCAT T9 string("/") ; op1(string)
0033 T10 = FRAMELESS_ICALL_3(strtr) CV6($relativeClass) string("\\") ; op1(string)
0034 ;OP_DATA string("/")
0035 T9 = FAST_CONCAT T8 T10 ; op1(string) op2(string)
0036 T8 = FAST_CONCAT T9 string(".php") ; op1(string)
0037 ASSIGN CV7($file) T8 ; op1(undef) op2(string)
0038 INIT_FCALL 1 96 string("is_file")
     >init is_file
0039 SEND_VAR CV7($file) 1 ; op1(string)
0040 V8 = DO_ICALL
     >call is_file
0041 JMPZ V8 0044 ; op1(bool)
0044 INIT_FCALL 2 112 string("rtrim")
     >init rtrim
0045 SEND_VAR CV4($prefix) 1 ; op1(string)
0046 SEND_VAL string("\\") 2
0047 V8 = DO_ICALL
     >call rtrim
0048 ASSIGN CV4($prefix) V8 ; op1(string) op2(string)
---- TRACE 26 stop (link to 12)
---- TRACE 26 TSSA start (side trace 12/6) AutoLoader::find() /var/tmp/mediawiki-1.42.1/includes/AutoLoader.php:183
     ;#0.CV0($className) [rc1, rcn, any]
     ;#4.CV4($prefix) [rc1, rcn, string]
     ;#5.CV5($pos) [long]
     ;#6.CV6($relativeClass) NOVAL [undef, rc1, rcn, string]
     ;#7.CV7($file) NOVAL [undef, rc1, rcn, string]
0027 #12.T9 [long, double] = ADD #5.CV5($pos) [long] int(1) ; op1(int)
0028 #14.T8 [rc1, rcn, string] = FRAMELESS_ICALL_2(substr) #0.CV0($className) [rc1, rcn, any] -> #13.CV0($className) NOVAL [rc1, rcn, any] #12.T9 [long, double] ; op1(string) op2(int)
0029 ASSIGN #6.CV6($relativeClass) NOVAL [undef, rc1, rcn, string] -> #15.CV6($relativeClass) [rc1, rcn, string] #14.T8 [rc1, rcn, string] ; op1(undef) op2(string)
0030 #16.T8 [!rcn, array of [any, ref]] = FETCH_STATIC_PROP_R string("psr4Namespaces") (self) (exception)
0031 #17.T9 [!rcn, string] = FETCH_DIM_R #16.T8 [!rcn, array of [any, ref]] #4.CV4($prefix) [rc1, rcn, string] ; op1(array) op2(string) val(string)
0032 #18.T8 [rc1, rcn, string] = CONCAT #17.T9 [!rcn, string] string("/") ; op1(string)
0033 #20.T10 [rc1, rcn, string] = FRAMELESS_ICALL_3(strtr) #15.CV6($relativeClass) [rc1, rcn, string] -> #19.CV6($relativeClass) NOVAL [rc1, rcn, string] string("\\") ; op1(string)
0034 ;OP_DATA string("/")
0035 #21.T9 [rc1, rcn, string] = FAST_CONCAT #18.T8 [rc1, rcn, string] #20.T10 [rc1, rcn, string] ; op1(string) op2(string)
0036 #22.T8 [rc1, rcn, string] = FAST_CONCAT #21.T9 [rc1, rcn, string] string(".php") ; op1(string)
0037 ASSIGN #7.CV7($file) NOVAL [undef, rc1, rcn, string] -> #23.CV7($file) [rc1, rcn, string] #22.T8 [rc1, rcn, string] ; op1(undef) op2(string)
0038 INIT_FCALL 1 96 string("is_file")
     >init is_file
0039 SEND_VAR #23.CV7($file) [rc1, rcn, string] -> #24.CV7($file) [rc1, rcn, string] 1 ; op1(string)
0040 #25.V8 [!false] = DO_ICALL
     >call is_file
0041 JMPZ #25.V8 [!false] 0044 ; op1(bool)
0044 INIT_FCALL 2 112 string("rtrim")
     >init rtrim
0045 SEND_VAR #4.CV4($prefix) [rc1, rcn, string] -> #26.CV4($prefix) NOVAL [rc1, rcn, string] 1 ; op1(string)
0046 SEND_VAL string("\\") 2
0047 #27.V8 [rc1, rcn, string] = DO_ICALL
     >call rtrim
0048 ASSIGN #26.CV4($prefix) NOVAL [rc1, rcn, string] -> #28.CV4($prefix) [rc1, rcn, string] #27.V8 [rc1, rcn, string] ; op1(string) op2(string)
---- TRACE 26 TSSA stop (link to 12)

The disassembly from the rtrim call to exit 9, annotated with my tracing observations, especially CV4($prefix) refcount:

    48004c8e:   movabsq $zif_rtrim, %rax
    48004c98:   callq *%rax
    48004c9a:   movabsq $EG(current_execute_data), %rax
    48004ca4:   movq %r14, (%rax)
    48004ca7:   cmpb $0, 0x59(%r15)
    48004cac:   je .L12
    48004cae:   movq 0x50(%r15), %rdi
    48004cb2:   subl $1, (%rdi)                            # CV4 refcount 2->1
    48004cb5:   jne .L12
    48004cb7:   movq $0x408ec860, (%r14)
    48004cbe:   movabsq $rc_dtor_func, %rax
    48004cc8:   callq *%rax
.L12:
    48004cca:   movabsq $EG(vm_stack_top), %rax
    48004cd4:   movq %r15, (%rax)
    48004cd7:   movabsq $EG(exception), %rax
    48004ce1:   movq (%rax), %rax
    48004ce4:   testq %rax, %rax
    48004ce7:   jne JIT$$icall_throw
    48004ced:   movabsq $EG(vm_interrupt), %rax
    48004cf7:   movb (%rax), %al
    48004cfa:   testb %al, %al
    48004cfd:   jne jit$$trace_exit_8                      # not taken
    48004d03:   cmpb $0, 0x99(%r14)
    48004d0b:   jne .L14                                   # taken
    48004d0d:   movq 0xd0(%r14), %rax
    48004d14:   movq %rax, 0x90(%r14)
    48004d1b:   movl 0xd8(%r14), %eax
    48004d22:   movl %eax, 0x98(%r14)
.L13:
    48004d29:   movabsq $EG(vm_interrupt), %rax
    48004d33:   movb (%rax), %al
    48004d36:   testb %al, %al
    48004d39:   je TRACE-12$AutoLoader::find$179+0x8
    48004d3f:   jmp jit$$trace_exit_9
.L14:
    48004d44:   movq 0x90(%r14), %rdi
    48004d4b:   movq 0xd0(%r14), %rax
    48004d52:   movq %rax, 0x90(%r14)
    48004d59:   movl 0xd8(%r14), %eax
    48004d60:   movl %eax, 0x98(%r14)
    48004d67:   subl $1, (%rdi)                            # CV4 refcount 1->0
    48004d6a:   jne .L13                                   # not taken
    48004d6c:   movq $0x408ec880, (%r14)
    48004d73:   movabsq $rc_dtor_func, %rax
    48004d7d:   callq *%rax                                # CV4 refcount 0->asan garbage
    48004d7f:   jmp .L13

So maybe the stack is corrupted before the exit is even taken.

dstogov commented 2 weeks ago

I run your mediamiki scenario with PHP compiled with address sanitiser and USE_ZEND_ALLOC=0. This leaded to the following error report.

==1071624==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000d6a6a0 at pc 0x000001dd1b81 bp 0x7ffedf716b40 sp 0x7ffedf716b38
READ of size 8 at 0x604000d6a6a0 thread T0
    #0 0x1dd1b80 in zif_strrpos /home/dmitry/php/php-master/ext/standard/string.c:2050
    #1 0x7f5d5160288d  (/anon_hugepage (deleted)+0x800288d)
    #2 0x26646c3 in zend_call_function /home/dmitry/php/php-master/Zend/zend_execute_API.c:994
    #3 0x2666000 in zend_call_known_function /home/dmitry/php/php-master/Zend/zend_execute_API.c:1088
    #4 0x1a68989 in spl_perform_autoload /home/dmitry/php/php-master/ext/spl/php_spl.c:442
    #5 0x2668074 in zend_lookup_class_ex /home/dmitry/php/php-master/Zend/zend_execute_API.c:1258
    #6 0x266a84f in zend_fetch_class_by_name /home/dmitry/php/php-master/Zend/zend_execute_API.c:1777
    #7 0x2720b5c in ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER /home/dmitry/php/php-master/Zend/zend_vm_execute.h:7255
    #8 0x29853b1 in execute_ex /home/dmitry/php/php-master/Zend/zend_vm_execute.h:59190
    #9 0x299dcb3 in zend_execute /home/dmitry/php/php-master/Zend/zend_vm_execute.h:63873
    #10 0x2bf0ced in zend_execute_script /home/dmitry/php/php-master/Zend/zend.c:1913
    #11 0x219fb6a in php_execute_script_ex /home/dmitry/php/php-master/main/main.c:2576
    #12 0x21a0094 in php_execute_script /home/dmitry/php/php-master/main/main.c:2616
    #13 0x2c10d36 in main /home/dmitry/php/php-master/sapi/cgi/cgi_main.c:2544
    #14 0x7f5da7646149 in __libc_start_call_main (/usr/lib64/../lib64/libc.so.6+0x28149) (BuildId: 0d710e9d9dc10c500b8119c85da75004183618e2)
    #15 0x7f5da764620a in __libc_start_main_impl (/usr/lib64/../lib64/libc.so.6+0x2820a) (BuildId: 0d710e9d9dc10c500b8119c85da75004183618e2)
    #16 0x608b74 in _start (/home/dmitry/php/php-master/CGI-DEBUG-64/sapi/cgi/php-cgi+0x608b74) (BuildId: 66892c938ad497f5385e41fa102bd276da4771f6)

0x604000d6a6a0 is located 16 bytes inside of 40-byte region [0x604000d6a690,0x604000d6a6b8)
freed by thread T0 here:
    #0 0x7f5da8cd7fc8 in __interceptor_free.part.0 (/usr/lib64/../lib64/libasan.so.8+0xd7fc8) (BuildId: 2e1c50524ff1a2e7e73c4565b46f3f51892353ea)
    #1 0x24cb50f in __zend_free /home/dmitry/php/php-master/Zend/zend_alloc.c:3308
    #2 0x24c6950 in _efree /home/dmitry/php/php-master/Zend/zend_alloc.c:2747
    #3 0x2bc93ce in zend_string_destroy /home/dmitry/php/php-master/Zend/zend_variables.c:67
    #4 0x2bc90b9 in rc_dtor_func /home/dmitry/php/php-master/Zend/zend_variables.c:57
    #5 0x26742a8 in zend_assign_to_variable /home/dmitry/php/php-master/Zend/zend_execute.h:178
    #6 0x293f385 in ZEND_ASSIGN_SPEC_CV_TMP_RETVAL_UNUSED_HANDLER /home/dmitry/php/php-master/Zend/zend_vm_execute.h:49099
    #7 0x299ae95 in execute_ex /home/dmitry/php/php-master/Zend/zend_vm_execute.h:63317
    #8 0x26646c3 in zend_call_function /home/dmitry/php/php-master/Zend/zend_execute_API.c:994
    #9 0x2666000 in zend_call_known_function /home/dmitry/php/php-master/Zend/zend_execute_API.c:1088
    #10 0x1a68989 in spl_perform_autoload /home/dmitry/php/php-master/ext/spl/php_spl.c:442
    #11 0x2668074 in zend_lookup_class_ex /home/dmitry/php/php-master/Zend/zend_execute_API.c:1258
    #12 0x266a84f in zend_fetch_class_by_name /home/dmitry/php/php-master/Zend/zend_execute_API.c:1777
    #13 0x2720b5c in ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER /home/dmitry/php/php-master/Zend/zend_vm_execute.h:7255
    #14 0x29853b1 in execute_ex /home/dmitry/php/php-master/Zend/zend_vm_execute.h:59190
    #15 0x299dcb3 in zend_execute /home/dmitry/php/php-master/Zend/zend_vm_execute.h:63873
    #16 0x2bf0ced in zend_execute_script /home/dmitry/php/php-master/Zend/zend.c:1913
    #17 0x219fb6a in php_execute_script_ex /home/dmitry/php/php-master/main/main.c:2576
    #18 0x21a0094 in php_execute_script /home/dmitry/php/php-master/main/main.c:2616
    #19 0x2c10d36 in main /home/dmitry/php/php-master/sapi/cgi/cgi_main.c:2544
    #20 0x7f5da7646149 in __libc_start_call_main (/usr/lib64/../lib64/libc.so.6+0x28149) (BuildId: 0d710e9d9dc10c500b8119c85da75004183618e2)
    #21 0x7f5da764620a in __libc_start_main_impl (/usr/lib64/../lib64/libc.so.6+0x2820a) (BuildId: 0d710e9d9dc10c500b8119c85da75004183618e2)
    #22 0x608b74 in _start (/home/dmitry/php/php-master/CGI-DEBUG-64/sapi/cgi/php-cgi+0x608b74) (BuildId: 66892c938ad497f5385e41fa102bd276da4771f6)

previously allocated by thread T0 here:
    #0 0x7f5da8cd92ff in malloc (/usr/lib64/../lib64/libasan.so.8+0xd92ff) (BuildId: 2e1c50524ff1a2e7e73c4565b46f3f51892353ea)
    #1 0x24cb3c2 in __zend_malloc /home/dmitry/php/php-master/Zend/zend_alloc.c:3280
    #2 0x24c680c in _emalloc /home/dmitry/php/php-master/Zend/zend_alloc.c:2737
    #3 0x1da97a1 in zend_string_alloc /home/dmitry/php/php-master/Zend/zend_string.h:176
    #4 0x1da9c1d in zend_string_init /home/dmitry/php/php-master/Zend/zend_string.h:198
    #5 0x1db5624 in php_trim_int /home/dmitry/php/php-master/ext/standard/string.c:618
    #6 0x1db5d23 in php_do_trim /home/dmitry/php/php-master/ext/standard/string.c:649
    #7 0x1db701c in zif_rtrim /home/dmitry/php/php-master/ext/standard/string.c:691
    #8 0x7f5d51606b5d  (/anon_hugepage (deleted)+0x8006b5d)
    #9 0x26646c3 in zend_call_function /home/dmitry/php/php-master/Zend/zend_execute_API.c:994
    #10 0x2666000 in zend_call_known_function /home/dmitry/php/php-master/Zend/zend_execute_API.c:1088
    #11 0x1a68989 in spl_perform_autoload /home/dmitry/php/php-master/ext/spl/php_spl.c:442
    #12 0x2668074 in zend_lookup_class_ex /home/dmitry/php/php-master/Zend/zend_execute_API.c:1258
    #13 0x266a84f in zend_fetch_class_by_name /home/dmitry/php/php-master/Zend/zend_execute_API.c:1777
    #14 0x2720b5c in ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER /home/dmitry/php/php-master/Zend/zend_vm_execute.h:7255
    #15 0x29853b1 in execute_ex /home/dmitry/php/php-master/Zend/zend_vm_execute.h:59190
    #16 0x299dcb3 in zend_execute /home/dmitry/php/php-master/Zend/zend_vm_execute.h:63873
    #17 0x2bf0ced in zend_execute_script /home/dmitry/php/php-master/Zend/zend.c:1913
    #18 0x219fb6a in php_execute_script_ex /home/dmitry/php/php-master/main/main.c:2576
    #19 0x21a0094 in php_execute_script /home/dmitry/php/php-master/main/main.c:2616
    #20 0x2c10d36 in main /home/dmitry/php/php-master/sapi/cgi/cgi_main.c:2544
    #21 0x7f5da7646149 in __libc_start_call_main (/usr/lib64/../lib64/libc.so.6+0x28149) (BuildId: 0d710e9d9dc10c500b8119c85da75004183618e2)
    #22 0x7f5da764620a in __libc_start_main_impl (/usr/lib64/../lib64/libc.so.6+0x2820a) (BuildId: 0d710e9d9dc10c500b8119c85da75004183618e2)
    #23 0x608b74 in _start (/home/dmitry/php/php-master/CGI-DEBUG-64/sapi/cgi/php-cgi+0x608b74) (BuildId: 66892c938ad497f5385e41fa102bd276da4771f6)

As the problem occurs only with Excimer and racing JIT, I guess it may be caused by some incorrect deoptimization triggered by vm_interrupt that missed some reference-counting . (This is not necessary. May be interrupts just lead to different traces and disclose some other hidden bug).

I know how to track reference-counting of some variable in a deterministic application. This is not simple but possible. Unfortunately, Excimer make the behavir non-deterministic (on each run I see different traces).

I'll think how to analyse the problem deeper and/or simplify the reproduction case.

dstogov commented 2 weeks ago

Now I understood the problem. Actually, you found it your self and tried to explain it to me, but I didn't understand. Sorry.

Deoptimization info used for interrupt handler in "link_to_root" point caused to pass execution to invalid opline. We should use the first opline of the root trace.

The fix is simple. It need to be backported to PHP-8.2. I'll do it later.

diff --git a/ext/opcache/jit/zend_jit_trace.c b/ext/opcache/jit/zend_jit_trace.c
index c7cd330970d..69187ca3814 100644
--- a/ext/opcache/jit/zend_jit_trace.c
+++ b/ext/opcache/jit/zend_jit_trace.c
@@ -7159,7 +7159,7 @@ static const void *zend_jit_trace(zend_jit_trace_rec *trace_buffer, uint32_t par
                    for (i = 0; i < op_array->last_var + op_array->T; i++) {
                        SET_STACK_TYPE(stack, i, IS_UNKNOWN, 1);
                    }
-                   exit_point = zend_jit_trace_get_exit_point(opline, ZEND_JIT_EXIT_TO_VM);
+                   exit_point = zend_jit_trace_get_exit_point(zend_jit_traces[t->link].opline, ZEND_JIT_EXIT_TO_VM);
                    timeout_exit_addr = zend_jit_trace_get_exit_addr(exit_point);
                    if (!timeout_exit_addr) {
                        goto jit_failure;

Please test this also.

dstogov commented 2 weeks ago

Just to show how this was analysed see the dump (jit_debug=0x1ff001) with "<--" comments:


---- TRACE 35 TSSA start (side trace 12/6) AutoLoader::find() /home/dmitry/php/mediawiki-1.42.1/includes/AutoLoader.php:188
     ;#0.CV0($className) [rc1, rcn, any]
     ;#4.CV4($prefix) [rc1, rcn, string]
     ;#5.CV5($pos) [long]
     ;#6.CV6($relativeClass) NOVAL [undef, rc1, rcn, string]
     ;#7.CV7($file) NOVAL [undef, rc1, rcn, string]
0027 #12.T9 [long, double] = ADD #5.CV5($pos) [long] int(1) ; op1(int)
0028 #14.T8 [rc1, rcn, string] = FRAMELESS_ICALL_2(substr) #0.CV0($className) [rc1, rcn, any] -> #13.CV0($className) NOVAL [rc1, rcn, any] #12.T9 [long, double] ; op1(string) op2(int)
0029 ASSIGN #6.CV6($relativeClass) NOVAL [undef, rc1, rcn, string] -> #15.CV6($relativeClass) [rc1, rcn, string] #14.T8 [rc1, rcn, string] ; op1(undef) op2(string)
0030 #16.T8 [!rcn, array of [any, ref]] = FETCH_STATIC_PROP_R string("psr4Namespaces") (self) (exception)
0031 #17.T9 [!rcn, string] = FETCH_DIM_R #16.T8 [!rcn, array of [any, ref]] #4.CV4($prefix) [rc1, rcn, string] ; op1(array) op2(string) val(string)
0032 #18.T8 [rc1, rcn, string] = CONCAT #17.T9 [!rcn, string] string("/") ; op1(string)
0033 #20.T10 [rc1, rcn, string] = FRAMELESS_ICALL_3(strtr) #15.CV6($relativeClass) [rc1, rcn, string] -> #19.CV6($relativeClass) NOVAL [rc1, rcn, string] string("\\") ; op1(string)
0034 ;OP_DATA string("/")
0035 #21.T9 [rc1, rcn, string] = FAST_CONCAT #18.T8 [rc1, rcn, string] #20.T10 [rc1, rcn, string] ; op1(string) op2(string)
0036 #22.T8 [rc1, rcn, string] = FAST_CONCAT #21.T9 [rc1, rcn, string] string(".php") ; op1(string)
0037 ASSIGN #7.CV7($file) NOVAL [undef, rc1, rcn, string] -> #23.CV7($file) [rc1, rcn, string] #22.T8 [rc1, rcn, string] ; op1(undef) op2(string)
0038 INIT_FCALL 1 96 string("is_file")
     >init is_file
0039 SEND_VAR #23.CV7($file) [rc1, rcn, string] -> #24.CV7($file) [rc1, rcn, string] 1 ; op1(string)
0040 #25.V8 [!false] = DO_ICALL
     >call is_file
0041 JMPZ #25.V8 [!false] 0044 ; op1(bool)
0044 INIT_FCALL 2 112 string("rtrim")
     >init rtrim
0045 SEND_VAR #4.CV4($prefix) [rc1, rcn, string] -> #26.CV4($prefix) NOVAL [rc1, rcn, string] 1 ; op1(string)
0046 SEND_VAL string("\\") 2
0047 #27.V8 [rc1, rcn, string] = DO_ICALL
     >call rtrim
0048 ASSIGN #26.CV4($prefix) NOVAL [rc1, rcn, string] -> #28.CV4($prefix) [rc1, rcn, string] #27.V8 [rc1, rcn, string] ; op1(string) op2(string)
---- TRACE 35 TSSA stop (link to 12)
TRACE-35$AutoLoader::find$188:
    ...
    7f973e406915:   movq $zif_rtrim, %rax
    7f973e40691c:   callq *%rax
    7f973e40691e:   movq %r14, EG(current_execute_data)
    7f973e406926:   cmpb $0, 0x59(%r15)
    7f973e40692b:   je .L12
    7f973e40692d:   movq 0x50(%r15), %rdi
    7f973e406931:   subl $1, (%rdi)
    7f973e406934:   jne .L12
    7f973e406936:   leaq -0x71d3955(%rip), %rax
    7f973e40693d:   movq %rax, (%r14)
    7f973e406940:   movq $rc_dtor_func, %rax
    7f973e406947:   callq *%rax
.L12:
    7f973e406949:   movq %r15, EG(vm_stack_top)
    7f973e406951:   cmpq $0, EG(exception)
    7f973e40695a:   jne JIT$$icall_throw
    7f973e406960:   cmpb $0, EG(vm_interrupt)
    7f973e406968:   jne jit$$trace_exit_8
    7f973e40696e:   cmpb $0, 0x99(%r14)                               <-- this is JIT for 0048 ASSIGN
    7f973e406976:   jne .L14
    7f973e406978:   movq 0xd0(%r14), %rax
    7f973e40697f:   movq %rax, 0x90(%r14)
    7f973e406986:   movl 0xd8(%r14), %eax
    7f973e40698d:   movl %eax, 0x98(%r14)
.L13:
    7f973e406994:   cmpb $0, EG(vm_interrupt)
    7f973e40699c:   je TRACE-12$AutoLoader::find$184+0x4
    7f973e4069a2:   jmp jit$$trace_exit_9                            <-- something goes wrong here ???
    ...
---- TRACE 35 exit info
     exit_0: 0031/0000/7 CV4($prefix):string CV5($pos):int CV6($relativeClass):string
     exit_1: 0031/0007/9/VM CV4($prefix):string CV5($pos):int CV6($relativeClass):string X8:array
     exit_2: 0032/0016/10/FREE_OP1 CV4($prefix):string CV5($pos):int CV6($relativeClass):string X8:array X9:unknown(zval_copy(rax))
     exit_3: 0032/0026/10/FREE_OP1 CV4($prefix):string CV5($pos):int CV6($relativeClass):string X8:array X9:unknown(zval_copy(rax))
     exit_4: 0038/0036/11/VM CV4($prefix):string CV5($pos):int CV6($relativeClass):string CV7($file):string X8:string X9:string X10:string
     exit_5: 0041/0047/11/VM CV4($prefix):string CV5($pos):int CV6($relativeClass):string CV7($file):string X8:string X9:string X10:string
     exit_6: 0041/0058/11 CV4($prefix):string CV5($pos):int CV6($relativeClass):string CV7($file):string X9:string X10:string
     exit_7: 0044/0069/11/VM CV4($prefix):string CV5($pos):int CV6($relativeClass):string CV7($file):string X8:bool X9:string X10:string
     exit_8: 0048/0080/11/VM CV4($prefix):string CV5($pos):int CV6($relativeClass):string CV7($file):string X8:bool X9:string X10:string
     exit_9: 0048/----/0/VM                                               <-- 0048 is wrong, must be 0049 !!!
---- TRACE 35 compiled
tstarling commented 2 weeks ago

I tested it, fix confirmed. Thanks @dstogov.