Open vityank opened 2 years ago
My investigation so far (after expanding the EX and CACHED_PTR macros) points to a use-after-free-like issue, with (execute_data)->run_time_cache storing an address that is no longer valid:
(gdb) print ((execute_data)->run_time_cache)
$30 = (void **) 0x7f5ba20237c0
(gdb) print ((void**)((char*)((execute_data)->run_time_cache) + (opline->extended_value)))
$31 = (void **) 0x7f5ba20237c0
(gdb) print ((void**)((char*)((execute_data)->run_time_cache) + (opline->extended_value)))[0]
Cannot access memory at address 0x7f5ba20237c0
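For readers who have not expanded these macros themselves, here is a minimal sketch of what the printed expression computes: a byte offset into the run-time cache, read as a pointer-sized slot. The `sketch_*` types and function are hypothetical stand-ins that carry only the two fields used in the gdb session; they are not the engine's real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: only the field used by the gdb expression. */
typedef struct {
    void **run_time_cache;        /* start of the cache slots for this call */
} sketch_execute_data;

/* Hypothetical stand-in: only the field used by the gdb expression. */
typedef struct {
    unsigned int extended_value;  /* byte offset of the slot used by this opcode */
} sketch_op;

/* Same shape as the expression printed in gdb:
 * ((void**)((char*)run_time_cache + extended_value))[0] */
static void *sketch_cached_ptr(const sketch_execute_data *ex, const sketch_op *opline)
{
    assert(ex->run_time_cache != NULL);
    return ((void **)((char *)ex->run_time_cache + opline->extended_value))[0];
}
```

If run_time_cache itself holds a stale or foreign address, the final `[0]` dereference is the first point at which the invalid memory is touched, which would be consistent with the "Cannot access memory" output above.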
Other findings:
The problem does not occur if opcache.consistency_checks is set to any value other than 0.
The opcache.preferred_memory_model setting has no effect.
Setting opcache.protect_memory to 1 makes the crash reproducible on the CLI as well, though in a different place: ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER.
I managed to make a small test case that reproduces part of it with opcache.protect_memory=1. Maybe it will reveal the cause of the instability in the optimizer-generated code.
@cmb69, I hope you'll be able to reproduce the crash with it without extra php.ini changes (except for the one mentioned). It looks like only 8.0.x is affected (8.1 is not).
The just-released 8.0.25 is still affected (as expected from the changelog and the silence here). This seems to have fallen under the radar, so @cmb69 or maybe @nikic, I'd be glad if you could find some time to look at this one (the reproduce script is in the previous message).
I can reproduce with https://github.com/php/php-src/files/9763190/testConstOptimizerBug.zip and
USE_ZEND_ALLOC=1 gdb -args php -d zend_extension=opcache -d opcache.protect_memory=1 -d opcache.enable_cli=1 -d opcache.enable=1 testConstOptimizerBug.php
in the latest php 8.0.
The commit bd98d84e573914c7b7560dea06632e7e7b57ffb5 ("Reorder conditions and always mark methods in SHM as ZEND_ACC_IMMUTABLE") possibly seems relevant to why this is fixed in php 8.1, but that's a big guess; there are many other things it could be, and I'm only a bit familiar with this code.
I'm not familiar with the policy here, but backporting the patch might affect extensions (performance monitoring tools, debuggers/zend_extensions replacing the interpreter such as xdebug) that depend on internal implementation details of the php compiler and op arrays.
If backporting the patch causes (or exposes) new bugs, there won't be another bug fix release to fix those.
(gdb) run
Starting program: /path/to/php-8.0.26-debug-install/bin/php -d zend_extension=opcache -d opcache.protect_memory=1 -d opcache.enable_cli=1 -d opcache.enable=1 testConstOptimizerBug.php
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x0000555555974338 in init_func_run_time_cache_i (op_array=0x408bf010) at /path/to/php-src/Zend/zend_execute.c:3678
3678 ZEND_MAP_PTR_SET(op_array->run_time_cache, run_time_cache);
(gdb) bt
#0 0x0000555555974338 in init_func_run_time_cache_i (op_array=0x408bf010) at /path/to/php-src/Zend/zend_execute.c:3678
#1 0x000055555597435a in init_func_run_time_cache (op_array=0x408bf010) at /path/to/php-src/Zend/zend_execute.c:3684
#2 0x0000555555987328 in ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER () at /path/to/php-src/Zend/zend_vm_execute.h:6671
#3 0x00005555559ed18c in execute_ex (ex=0x7ffff7a14020) at /path/to/php-src/Zend/zend_vm_execute.h:55756
#4 0x00005555559f1b7b in zend_execute (op_array=0x7ffff7a5d3c0, return_value=0x0) at /path/to/php-src/Zend/zend_vm_execute.h:59523
#5 0x0000555555940502 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /path/to/php-src/Zend/zend.c:1694
#6 0x00005555558a13d0 in php_execute_script (primary_file=0x7fffffffc790) at /path/to/php-src/main/main.c:2545
#7 0x0000555555a32cb0 in do_cli (argc=10, argv=0x555556221b20) at /path/to/php-src/sapi/cli/php_cli.c:949
#8 0x0000555555a33d0c in main (argc=10, argv=0x555556221b20) at /path/to/php-src/sapi/cli/php_cli.c:1337
(gdb) print op_array->run_time_cache__ptr
$5 = (void ***) 0x408bf0f8
(gdb) set {int}0x408bf0f8=0
Cannot access memory at address 0x408bf0f8
(gdb) print *(op_array->run_time_cache__ptr)
$6 = (void **) 0x0
https://www.php.net/supported-versions.php
php 8.0 has active bug fix support until 26 Nov 2022 (in 17 days), and I'd personally consider this a bug fix rather than a security fix (the end user wrote the code that used late static binding, not an attacker).
php 8.0 has active bug fix support until 26 Nov 2022 (in 17 days)
Right. While PHP 8.0.26 will be a regular bug fix release, 8.0.26RC1 is supposed to be tagged today, so it is likely too late for this issue to be fixed. I suggest closing the ticket as WONTFIX; those who are affected by this issue had better update to PHP 8.1.
https://www.npopov.com/2021/10/13/How-opcache-works.html#map-pointers
I also see that in php 8.1, this example is pointing into immutable memory through an offset ((map_ptr & 1) == 1), and in php 8.0, it was a pointer into read-only memory (I hadn't checked whether that is shared memory or a corrupted pointer, though I suspect shared memory, with it only crashing with opcache.protect_memory=1).
For mutable memory: map_ptr & 1 == 0

    map pointer ----> indirection pointer -----> static variables
                      (arena allocated)

For immutable memory: map_ptr & 1 == 1

    map base pointer:  slot 0
                       slot 1
    + map offset:      slot 2 -----> static variables
                       slot 3
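The two cases in the diagram can be sketched in C as a tagged value that is either a real indirection pointer (low bit 0) or an offset into a per-process slot table (low bit 1). This is a simplified illustration, not the engine's macros: every `sketch_*` name is hypothetical, and the encoding "byte offset of the slot, plus 1" is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process slot table standing in for the "map base pointer". */
typedef struct {
    void **slots;
    size_t count;
} sketch_map_table;

/* Low bit set: the value is an offset into the per-process table, not an address. */
static int sketch_is_offset(uintptr_t map_ptr)
{
    return (map_ptr & 1) == 1;
}

/* Assumed encoding: byte offset of the slot, with the low bit set as a tag. */
static uintptr_t sketch_encode_slot(size_t index)
{
    return (uintptr_t)(index * sizeof(void *)) | 1;
}

/* Inverse of sketch_encode_slot: strip the tag and divide by the slot size. */
static size_t sketch_slot_index(uintptr_t map_ptr)
{
    return (size_t)((map_ptr - 1) / sizeof(void *));
}

/* Resolve a map pointer to the cell that holds the actual per-process value. */
static void **sketch_resolve(const sketch_map_table *table, uintptr_t map_ptr)
{
    if (sketch_is_offset(map_ptr)) {
        size_t index = sketch_slot_index(map_ptr);
        assert(index < table->count);
        return &table->slots[index];   /* immutable case: base + offset */
    }
    return (void **)map_ptr;           /* mutable case: already an indirection pointer */
}
```

Under this assumed encoding and 8-byte pointers, a value such as 0x791 would decode to slot 242 of the per-process table, whereas an even value such as 0x408bf0f8 would be treated as a direct address. The point of the offset form is that the shared structure never has to store a process-local address.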
(gdb) print op_array->run_time_cache__ptr
$1 = (void ***) 0x791
(gdb) bt
#0 init_func_run_time_cache_i (op_array=0x408be830) at /path/to/php-src/Zend/zend_execute.c:3948
#1 0x0000555555d264c1 in init_func_run_time_cache (op_array=0x408be830) at /path/to/php-src/Zend/zend_execute.c:3956
#2 0x0000555555d39946 in ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER () at /path/to/php-src/Zend/zend_vm_execute.h:6846
#3 0x0000555555da1cca in execute_ex (ex=0x5555571353c0) at /path/to/php-src/Zend/zend_vm_execute.h:56351
#4 0x0000555555da66db in zend_execute (op_array=0x5555570d9270, return_value=0x0) at /path/to/php-src/Zend/zend_vm_execute.h:60123
#5 0x0000555555cef0c3 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /path/to/php-src/Zend/zend.c:1813
#6 0x0000555555c4bda1 in php_execute_script (primary_file=0x7fffffffc780) at /path/to/php-src/main/main.c:2539
#7 0x0000555555e63f34 in do_cli (argc=10, argv=0x555556e69e60) at /path/to/php-src/sapi/cli/php_cli.c:965
#8 0x0000555555e6503c in main (argc=10, argv=0x555556e69e60) at /path/to/php-src/sapi/cli/php_cli.c:1367
Right. While PHP 8.0.26 will be a regular bug fix release, 8.0.26RC1 is supposed to be tagged today, so it is likely too late for this issue to be fixed. I suggest closing the ticket as WONTFIX; those who are affected by this issue had better update to PHP 8.1.
I forgot about the RC builds. Agreed, even if a fix was ready by then I don't expect that reviewers would be confident enough to approve it
I wasn't sure of this from the original ticket - was that a segfault that happened some of the time or all of the time?
I checked and found that the crash (with opcache.protect_memory=1) also occurs in php 7.4 (tested with 7.4.31-dev), which stopped receiving bug fix support 11 months ago - https://www.php.net/supported-versions.php (same stack trace of init_func_run_time_cache and ZEND_INIT_STATIC_METHOD_CALL_SPEC_CONST_CONST_HANDLER in gdb)
gdb -args php --no-php-ini -d zend_extension=opcache.so -d opcache.protect_memory=1 -d opcache.enable_cli=1 -d opcache.enable=1 ~/Downloads/php-issue-9396-segfault/testConstOptimizerBug.php
(the crash with protect_memory=1 is something I suspect is a symptom of a race condition that would possibly cause memory corruption in php 7.4 and 8.0 for late static binding)
Looking at zend_persist_class_method, it just seems wrong. For the class in php 8.0, it has ZCG(is_immutable_class) == 0 (from late static binding?).
I added debugging code, and it's initializing run_time_cache__ptr to a pointer within the shared memory arena, rather than to the offset corresponding to that pointer (ZEND_MAP_PTR_INIT(op_array->run_time_cache, ZCG(arena_mem));).
Changing to an offset might fix that, but I'm not clear on why the non-immutable class case was using arena_mem in the first place.
if (ZCG(is_immutable_class)) {
    op_array->fn_flags |= ZEND_ACC_IMMUTABLE;
    ZEND_MAP_PTR_NEW(op_array->run_time_cache);
    fprintf(stderr, "imm run_time_cache=%p\n", op_array->run_time_cache__ptr);
    if (op_array->static_variables) {
        ZEND_MAP_PTR_NEW(op_array->static_variables_ptr);
    }
} else {
    ZEND_MAP_PTR_INIT(op_array->run_time_cache, ZCG(arena_mem));
    fprintf(stderr, "arena_mem=%p run_time_cache=%p\n", ZCG(arena_mem), op_array->run_time_cache__ptr);
    ZCG(arena_mem) = (void*)(((char*)ZCG(arena_mem)) + ZEND_ALIGNED_SIZE(sizeof(void*)));
    ZEND_MAP_PTR_SET(op_array->run_time_cache, NULL);
}
I have to wonder if something is causing the arena_mem to point into shared memory (zend_shared_alloc) instead of per-request (emalloc'ed) memory in php 8.0. E.g. the per-request case would be the emalloc'ed arena from zend_arena_alloc:
ZCG(arena_mem) = zend_arena_alloc(&CG(arena), persistent_script->arena_size);
Possibly it has something to do with inheritance, since zend_accel_inheritance_cache_add changes ZCG(mem) to point into shared memory rather than per-request memory. The inheritance cache was changed in php 8.1, so I don't know whether it no longer happens in all cases or just in this specific case.
ext/opcache/ZendAccelerator.c
ZCG(mem) = zend_shared_alloc(memory_used + 64);
ext/opcache/zend_persist.c
script->arena_mem = ZCG(arena_mem) = ZCG(mem);
Anyway, my best guess as to why this crashes (in 7.4 and 8.0) is that the inheritance code causes the memory to be shared memory instead of per-request memory when this is compiled:
When process A of fpm uses the run_time_cache__ptr, it writes *(op_array->run_time_cache__ptr) = ADDRESS_IN_LOCAL_MEMORY_OF_PROCESS_A - this is what opcache.protect_memory=1 is properly catching
When process B of fpm uses the run_time_cache__ptr, it appears as if op_array->run_time_cache__ptr was initialized because it's non-null. But it's improperly set up because op_array->run_time_cache__ptr was an address in shared memory, which shouldn't happen but did due to this bug. So it tries to access memory that would be valid in process A's memory but is either pointing to the wrong data (and has undefined behavior) or preferably crashes quickly before it can misbehave
@TysonAndre Oh, great breakdown here on the issue internals and possible causes.
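The process A / process B scenario can be imitated in a single program. The sketch below is only a simulation under stated assumptions: two hypothetical "workers" each own a private buffer that stands in for per-request memory, and one `void *` cell stands in for the slot that ended up in shared memory. None of the `sketch_*` names exist in php-src.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One simulated FPM worker with its own private cache buffer. */
typedef struct {
    int id;
    void *private_cache[4];   /* stands in for per-request (emalloc'ed) memory */
} sketch_worker;

/* Returns 1 if p points into the private buffer of worker w. */
static int sketch_owns(const sketch_worker *w, const void *p)
{
    uintptr_t lo = (uintptr_t)w->private_cache;
    uintptr_t hi = lo + sizeof(w->private_cache);
    uintptr_t addr = (uintptr_t)p;
    return addr >= lo && addr < hi;
}

/* Lazy initialization: only the first caller that sees NULL fills the slot. */
static void *sketch_get_cache(sketch_worker *w, void **slot)
{
    assert(slot != NULL);
    if (*slot == NULL) {
        *slot = w->private_cache;   /* publishes a worker-local address */
    }
    return *slot;
}
```

With one slot shared by both workers, worker A initializes it and worker B then receives A's address because the slot is already non-null. With one slot per worker, each worker gets its own buffer. In real FPM the workers are separate processes, so A's address is meaningless in B, which is where the wrong data or the segfault would come from.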
I wasn't sure of this from the original ticket - was that a segfault that happened some of the time or all of the time?
When running with our standard production configuration with OPcache (of course, without opcache.protect_memory=1) and PHP-FPM, the crash starts by at most the n-th request, where n is the number of PHP-FPM child processes, and then happens constantly on every following request, which probably matches your conclusion here:
When process A of fpm uses the run_time_cache__ptr, it writes *(op_array->run_time_cache__ptr) = ADDRESS_IN_LOCAL_MEMORY_OF_PROCESS_A - this is what opcache.protect_memory=1 is properly catching
I checked and found that the crash (with opcache.protect_memory=1) also occurs in php 7.4 (tested with 7.4.31-dev)
This is indeed a very interesting find. I didn't test it with our PHP 7.4 binaries (which I stopped updating once we chose 8.0 as the upgrade target from our mainline 7.3), and I was almost sure it was a PHP 8.0 regression... Maybe analyzing the optimizer differences between 7.3 and 7.4 will shed some light on it, help find the original change that broke it, and possibly allow fixing it without changing allocation targets from SHM to heap or making similar large and undesirable changes to the engine.
Thanks.
cmb69
those who are affected by this issue should better update to PHP 8.1.
If only it were that easy... PHP version upgrades are a long, complicated process in a business. We have been migrating from 7.3 to 8.0 for a whole year now, with several types of servers running just fine on it. However, during the migration of one server that uses a wider part of our codebase (including inherited classes with late static binding), we started to see lots of failures and tons of php-fpm process segfaults in dmesg, resulting, of course, in an immediate revert to the 'stable-and-proven' PHP 7.3 on this machine and a halt of the migration project until further notice. It got through internal dev QA because most parts there are pretested from the CLI, which has OPcache deactivated...
Anyway, I quite understand the release process and that PHP 8.0 reaches the end of its bug fix support quite soon. As this issue lies deep in the Zend engine, optimizer and FPM internals, I would not be able to fix it myself or backport a fix from newer versions (as PHP 8.1 is already not affected).
I am still having WordPress sites hang on segfault on PHP-fpm. This occurs after the site has been running for a while. I have run the sites on PHP 8.0, 8.1, 8.2 and 8.3. None stay up but 8.2 throws the error sooner than 8.3.
Description
PHP-FPM crashes when OPcache is enabled (even if I disable all of the low 16 bits of the optimization flags in opcache.optimization_level) and some pattern of late static binding is involved.
Unfortunately, I can't create a minimal working test case (I tried my best). The only thing I know is that a pattern like this causes it in the end (note the constant that gets its initial value via 'self' and is later used with LSB via 'static'):
Segmentation fault info:
PHP Version
PHP 8.0.22/8.0.23/8.0.24
The 8.1 tree does not seem to be affected (tested on 8.1.11).
Operating System
CentOS 7