php / php-src

JIT segmentation fault in PHP 8.1 #7817

Open cappadaan opened 2 years ago

cappadaan commented 2 years ago

Description

PHP 8.1.0 and 8.1.1 segfault at random. Downgrading to 8.0 resolves the issue.

--- core dump ---

BFD: Warning: coredump-php-fpm.30267 is truncated: expected core file size >= 5413076992, found: 35983360.
[New LWP 30267]
[New LWP 1887]
[New LWP 1886]
[New LWP 1888]
Cannot access memory at address 0x7f277dbb3128
Cannot access memory at address 0x7f277dbb3120
Failed to read a valid object file image from memory.
Core was generated by `php-fpm: pool xxxxxx '.
Program terminated with signal 11, Segmentation fault.
#0  0x000055bbf04c0f25 in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER () at /usr/src/debug/php-8.1.1/Zend/zend_vm_execute.h:10137
10137   ce = CACHED_PTR(opline->op2.num);
(gdb) bt
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x7ffdc940e3a8:
(gdb) bt
#0  0x000055bbf04c0f25 in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER () at /usr/src/debug/php-8.1.1/Zend/zend_vm_execute.h:10137
Cannot access memory at address 0x7ffdc940e3a8
(gdb) frame 0
#0  0x000055bbf04c0f25 in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER () at /usr/src/debug/php-8.1.1/Zend/zend_vm_execute.h:10137
10137   ce = CACHED_PTR(opline->op2.num);
(gdb) info frame
Stack level 0, frame at 0x7ffdc940e3b0:
 rip = 0x55bbf04c0f25 in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER (/usr/src/debug/php-8.1.1/Zend/zend_vm_execute.h:10137); saved rip Cannot access memory at address 0x7ffdc940e3a8

This is the only information available in the core dump.

PHP Version

PHP 8.1.0 + 8.1.1

Operating System

CentOS 7

etu commented 2 years ago

I'm also adding myself to the list for notifications. Yesterday one of our microservices stopped functioning properly, and when I read the FPM logs they were spammed with traces like this over and over again:

[servicename-production-xxxxxxxxxx-yyyyy php] [15-Sep-2022 14:11:29] WARNING: [pool www] child 2575 exited on signal 11 (SIGSEGV - core dumped) after 0.378914 seconds from start
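For anyone who wants to quantify how often workers die, log lines of this shape can be tallied with a short script. This is only a sketch in Python: the regular expression is an assumption based on the `child N exited on signal ...` lines quoted in this thread, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Pattern for php-fpm master log lines such as:
#   WARNING: [pool www] child 2575 exited on signal 11 (SIGSEGV - core dumped)
#   after 0.378914 seconds from start
# The "- core dumped" suffix is optional, so "(SIGSEGV)" matches as well.
CHILD_EXIT = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] child (?P<pid>\d+) exited on signal "
    r"(?P<signo>\d+) \((?P<signame>SIG[A-Z0-9]+)[^)]*\) "
    r"after (?P<uptime>[\d.]+) seconds from start"
)


def parse_child_exits(lines):
    """Yield (pool, pid, signal name, uptime in seconds) per crashed worker."""
    for line in lines:
        m = CHILD_EXIT.search(line)
        if m:
            yield m["pool"], int(m["pid"]), m["signame"], float(m["uptime"])


def count_by_signal(lines):
    """Count worker exits per signal name."""
    return Counter(sig for _, _, sig, _ in parse_child_exits(lines))


# Sample lines in the format seen in this thread.
sample = [
    "[15-Sep-2022 14:11:29] WARNING: [pool www] child 2575 exited on signal 11 "
    "(SIGSEGV - core dumped) after 0.378914 seconds from start",
    "[06-Oct-2022 13:44:59] WARNING: [pool www] child 9 exited on signal 6 "
    "(SIGABRT - core dumped) after 216.739847 seconds from start",
    "[06-Oct-2022 13:44:59] NOTICE: [pool www] child 22 started",
]

if __name__ == "__main__":
    print(count_by_signal(sample))
```

Feeding it the lines of a real log file (for example `count_by_signal(open(path))`, with `path` pointing at your FPM log) gives a quick view of whether the exits are SIGSEGV, SIGABRT, or both.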

So I restarted the deployment and it's been "fine" since. However, it's not the first time we've observed this, and when and in which service it happens seems to be very random for us.

To add something to the conversation, I have logged into the new deployment to pull some details. It runs the same container image as the one that failed, just restarted.

PHP Version:

~ $ php -v
PHP 8.1.10 (cli) (built: Sep  1 2022 21:43:31) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.10, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.10, Copyright (c), by Zend Technologies
    with ddtrace v0.78.0, Copyright Datadog, by Datadog
    with ddappsec v0.4.0, Copyright Datadog, by Datadog

~ $ php-fpm -v
PHP 8.1.10 (fpm-fcgi) (built: Sep  1 2022 21:43:35)
Copyright (c) The PHP Group
Zend Engine v4.1.10, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.10, Copyright (c), by Zend Technologies
    with ddtrace v0.78.0, Copyright Datadog, by Datadog
    with ddappsec v0.4.0, Copyright Datadog, by Datadog

The container is built on php:8.1-fpm-alpine3.16 from Docker Hub, with some extensions and PHP configuration options added. Those with a sharp eye may notice the Datadog extensions. They work fine when PHP is working fine, but when these crashes happen we get nothing about them in Datadog at all.

However, my monitoring shows that only one of the pods had the issue; the others were running just fine.

~ $ php-fpm -i | grep -i opcache | grep jit
opcache.jit => tracing => tracing
opcache.jit_bisect_limit => 0 => 0
opcache.jit_blacklist_root_trace => 16 => 16
opcache.jit_blacklist_side_trace => 8 => 8
opcache.jit_buffer_size => 50M => 50M
opcache.jit_debug => 0 => 0
opcache.jit_hot_func => 127 => 127
opcache.jit_hot_loop => 64 => 64
opcache.jit_hot_return => 8 => 8
opcache.jit_hot_side_exit => 8 => 8
opcache.jit_max_exit_counters => 8192 => 8192
opcache.jit_max_loop_unrolls => 8 => 8
opcache.jit_max_polymorphic_calls => 2 => 2
opcache.jit_max_recursive_calls => 2 => 2
opcache.jit_max_recursive_returns => 2 => 2
opcache.jit_max_root_traces => 1024 => 1024
opcache.jit_max_side_traces => 128 => 128
opcache.jit_prof_threshold => 0.005 => 0.005

theCalcaholic commented 2 years ago

Opcache segfaults still happen in 8.1.10. I'm running a Nextcloud server which is affected.

It happens, for example, when I run php /var/www/nextcloud/occ status, despite having set opcache.jit and opcache.jit_buffer_size to 0 (this avoided some of the crashes, but not all of them).

PHP Version:

$ php --version
PHP 8.1.10 (cli) (built: Sep 14 2022 10:31:35) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.10, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.10, Copyright (c), by Zend Technologies

EDIT: After cleaning the opcache file cache (rm -rf /path/to/opcache/*), the segfaults went away!

javer commented 2 years ago

I have a closed-source project and a reliable way to reproduce the crash.

The project is built using Symfony 6.1, Doctrine, NelmioApiDocBundle and so on. In case it matters, it has a lot of API endpoints and a lot of models with a lot of attributes.

The issue can be reproduced in dev environment with the following script:

#!/bin/sh
while true; do
  rm -rf var/cache/dev
  for i in `seq 1 2`; do
    curl http://127.0.0.1:8080/generate_api_doc >/dev/null || exit
  done
done

The crash reliably occurs within about 1 minute. The key to reproducing it is to completely remove the application cache and regenerate it between requests. The new cache contains the same filenames with different content; in fact, the files differ only by the cache generation timestamp inside them.

My observations while trying to reduce the project size to narrow down where the error occurs:

I've seen the following assert errors during tests:

Frequent places where crash happens:

PHP 8.1.10 with the default configuration (i.e. no php.ini at all) except:

zend_extension=opcache.so
memory_limit = 256M
date.timezone = UTC
opcache.enable = 1
opcache.enable_cli = 1
opcache.memory_consumption = 256
opcache.max_accelerated_files = 50000
opcache.jit_buffer_size = 256M

Let me know how I can help you without sharing the source code of the project.

25 detailed stacktraces with Dockerfile to build the exact version of PHP can be found here: https://gist.github.com/javer/0baae3d5f6113faf47ccd0d875a5e9a3

oleg-st commented 2 years ago

@javer Could you turn off the inheritance cache and try to do your tests again?

You will need to comment out these two lines: https://github.com/php/php-src/blob/PHP-8.1/ext/opcache/ZendAccelerator.c#L3342

zend_inheritance_cache_get = zend_accel_inheritance_cache_get;
zend_inheritance_cache_add = zend_accel_inheritance_cache_add;

javer commented 2 years ago

@oleg-st Unfortunately it didn't help; PHP crashed on the 15th request:

Program received signal SIGSEGV, Segmentation fault.
0x0000aaaa7e09ab28 in ?? ()
(gdb) bt
#0  0x0000aaaa7e09ab28 in ?? ()
#1  0x0000ffff8b26d108 in ?? ()
#2  0x0000ffff9b813020 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

(gdb) p (char*)executor_globals.current_execute_data.func.op_array.filename.val
$1 = 0xffff7e084a08 "/var/www/vendor/symfony/config/Definition/ArrayNode.php"

(gdb) p (char*)executor_globals.current_execute_data.func.op_array.function_name.val
$2 = 0xffff7e0957a8 "finalizeValue"

(gdb) p executor_globals.current_execute_data.func.op_array.line_start
$3 = 211

(gdb) p executor_globals.current_execute_data.opline.lineno
$4 = 245

(gdb) p executor_globals.current_execute_data.opline - executor_globals.current_execute_data.func.op_array.opcodes
$5 = 89

(gdb) disassemble 0x0000aaaa7e09ab28-100,0x0000aaaa7e09ab28+100
Dump of assembler code from 0xaaaa7e09aac4 to 0xaaaa7e09ab8c:
   0x0000aaaa7e09aac4:  Cannot access memory at address 0xaaaa7e09aac4

(gdb) disassemble 0x0000ffff8b26d108-100,0x0000ffff8b26d108+100
Dump of assembler code from 0xffff8b26d0a4 to 0xffff8b26d16c:
   0x0000ffff8b26d0a4:  cmp w15, #0xa
   0x0000ffff8b26d0a8:  b.ne    0xffff8b26d0bc  // b.any
   0x0000ffff8b26d0ac:  ldrb    w15, [x0, #17]
   0x0000ffff8b26d0b0:  tst w15, #0x2
   0x0000ffff8b26d0b4:  b.eq    0xffff8b26ce38  // b.none
   0x0000ffff8b26d0b8:  ldr x0, [x0, #8]
   0x0000ffff8b26d0bc:  ldr w15, [x0, #4]
   0x0000ffff8b26d0c0:  mov w16, #0xfc10                    // #64528
   0x0000ffff8b26d0c4:  movk    w16, #0xffff, lsl #16
   0x0000ffff8b26d0c8:  tst w15, w16
   0x0000ffff8b26d0cc:  b.ne    0xffff8b26ce38  // b.any
   0x0000ffff8b26d0d0:  mov x15, #0xc1c                     // #3100
   0x0000ffff8b26d0d4:  movk    x15, #0xbfce, lsl #16
   0x0000ffff8b26d0d8:  movk    x15, #0xaaaa, lsl #32
   0x0000ffff8b26d0dc:  blr x15
   0x0000ffff8b26d0e0:  b   0xffff8b26ce38
   0x0000ffff8b26d0e4:  ldr x0, [x27, #144]
   0x0000ffff8b26d0e8:  ldr w15, [x0]
   0x0000ffff8b26d0ec:  subs    w15, w15, #0x1
   0x0000ffff8b26d0f0:  str w15, [x0]
   0x0000ffff8b26d0f4:  b.ne    0xffff8b26d10c  // b.any
   0x0000ffff8b26d0f8:  mov x15, #0xe70c                    // #59148
   0x0000ffff8b26d0fc:  movk    x15, #0xbfbf, lsl #16
   0x0000ffff8b26d100:  movk    x15, #0xaaaa, lsl #32
   0x0000ffff8b26d104:  blr x15
=> 0x0000ffff8b26d108:  b   0xffff8b26ce44
   0x0000ffff8b26d10c:  ldrb    w15, [x27, #152]
   0x0000ffff8b26d110:  cmp w15, #0xa
   0x0000ffff8b26d114:  b.ne    0xffff8b26d128  // b.any
   0x0000ffff8b26d118:  ldrb    w15, [x0, #17]
   0x0000ffff8b26d11c:  tst w15, #0x2
   0x0000ffff8b26d120:  b.eq    0xffff8b26ce44  // b.none
   0x0000ffff8b26d124:  ldr x0, [x0, #8]
   0x0000ffff8b26d128:  ldr w15, [x0, #4]
   0x0000ffff8b26d12c:  mov w16, #0xfc10                    // #64528
   0x0000ffff8b26d130:  movk    w16, #0xffff, lsl #16
   0x0000ffff8b26d134:  tst w15, w16
   0x0000ffff8b26d138:  b.ne    0xffff8b26ce44  // b.any
   0x0000ffff8b26d13c:  mov x15, #0xc1c                     // #3100
   0x0000ffff8b26d140:  movk    x15, #0xbfce, lsl #16
   0x0000ffff8b26d144:  movk    x15, #0xaaaa, lsl #32
   0x0000ffff8b26d148:  blr x15
   0x0000ffff8b26d14c:  b   0xffff8b26ce44
   0x0000ffff8b26d150:  adrp    x8, 0xffff9b742000 <zend_jit_leave_top_func_helper>
   0x0000ffff8b26d154:  add x8, x8, #0xf0
   0x0000ffff8b26d158:  blr x8
   0x0000ffff8b26d15c:  b   0xffff8b26cec4
   0x0000ffff8b26d160:  b   0xffff8b164f00
   0x0000ffff8b26d164:  b   0xffff8b164f04
   0x0000ffff8b26d168:  b   0xffff8b164f08
End of assembler dump.

(gdb) zbacktrace 
[0xffff9b813d70] Symfony\Component\Config\Definition\ArrayNode->finalizeValue(array(6)[0xffff9b813dc0]) /var/www/vendor/symfony/config/Definition/ArrayNode.php:245 
[0xffff9b813c80] Symfony\Component\Config\Definition\BaseNode->finalize(array(3)[0xffff9b813cd0]) /var/www/vendor/symfony/config/Definition/BaseNode.php:391 
[0xffff9b813bb0] Symfony\Component\Config\Definition\ArrayNode->finalizeValue(array(39)[0xffff9b813c00]) /var/www/vendor/symfony/config/Definition/ArrayNode.php:245 
[0xffff9b813ac0] Symfony\Component\Config\Definition\BaseNode->finalize(array(18)[0xffff9b813b10]) /var/www/vendor/symfony/config/Definition/BaseNode.php:391 
[0xffff9b813a10] Symfony\Component\Config\Definition\Processor->process(object[0xffff9b813a60], array(15)[0xffff9b813a70]) /var/www/vendor/symfony/config/Definition/Processor.php:36 
[0xffff9b813990] Symfony\Component\Config\Definition\Processor->processConfiguration(object[0xffff9b8139e0], array(15)[0xffff9b8139f0]) /var/www/vendor/symfony/config/Definition/Processor.php:46
[0xffff9b8138e0] Symfony\Component\DependencyInjection\Extension\Extension->processConfiguration(object[0xffff9b813930], array(15)[0xffff9b813940]) /var/www/vendor/symfony/dependency-injection/Extension/Extension.php:109
[0xffff9b813800] Symfony\Bundle\FrameworkBundle\DependencyInjection\FrameworkExtension->load(array(15)[0xffff9b813850], object[0xffff9b813860]) /var/www/vendor/symfony/framework-bundle/DependencyInjection/FrameworkExtension.php:304
[0xffff9b813690] Symfony\Component\DependencyInjection\Compiler\MergeExtensionConfigurationPass->process(object[0xffff9b8136e0]) /var/www/vendor/symfony/dependency-injection/Compiler/MergeExtensionConfigurationPass.php:76
[0xffff9b8135f0] Symfony\Component\HttpKernel\DependencyInjection\MergeExtensionConfigurationPass->process(object[0xffff9b813640]) /var/www/vendor/symfony/http-kernel/DependencyInjection/MergeExtensionConfigurationPass.php:42
[0xffff9b8134f0] Symfony\Component\DependencyInjection\Compiler\Compiler->compile(object[0xffff9b813540]) /var/www/vendor/symfony/dependency-injection/Compiler/Compiler.php:73 
[0xffff9b813410] Symfony\Component\DependencyInjection\ContainerBuilder->compile() /var/www/vendor/symfony/dependency-injection/ContainerBuilder.php:716 
[0xffff9b813250] Symfony\Component\HttpKernel\Kernel->initializeContainer() /var/www/vendor/symfony/http-kernel/Kernel.php:538 
[0xffff9b8131b0] Symfony\Component\HttpKernel\Kernel->preBoot() /var/www/vendor/symfony/http-kernel/Kernel.php:767 
[0xffff9b8130f0] Symfony\Component\HttpKernel\Kernel->handle(object[0xffff9b813140]) /var/www/vendor/symfony/http-kernel/Kernel.php:190 
[0xffff9b813020] (main) /var/www/public/index.php:32 

dstogov commented 2 years ago

@javer thanks for the help. @oleg-st thanks for checking whether this is related to the inheritance cache (I suspected it too).

Unfortunately, I can't find the reason for the crash yet. The backtraces are very different. Can you please check whether the failures occur after an opcache restart? (See the *_restarts counters returned by opcache_get_status(false).) Also, please check whether you have any fatal PHP errors (especially memory overflow or execution timeout).
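One way to run this check is to capture the status before and after a batch of requests and compare the restart counters. Below is a minimal Python sketch. It assumes the snapshots were exported from PHP as JSON (for example via `json_encode(opcache_get_status(false))`, an export step that is not part of this thread) and decoded into dicts. The counter names are those reported under `opcache_statistics`; the sample values and the function name are illustrative only.

```python
# Restart counters reported under "opcache_statistics" by opcache_get_status().
RESTART_COUNTERS = ("oom_restarts", "hash_restarts", "manual_restarts")


def restarts_between(before, after):
    """Return {counter: increase} for every restart counter that grew.

    `before` and `after` are dicts shaped like decoded opcache_get_status()
    output. An empty result means no restart happened between the snapshots.
    """
    old = before.get("opcache_statistics", {})
    new = after.get("opcache_statistics", {})
    deltas = {}
    for name in RESTART_COUNTERS:
        delta = new.get(name, 0) - old.get(name, 0)
        if delta > 0:
            deltas[name] = delta
    return deltas


# Illustrative snapshots (made-up values, not taken from a real run).
before = {"opcache_statistics": {"oom_restarts": 0, "hash_restarts": 0, "manual_restarts": 0}}
after = {"opcache_statistics": {"oom_restarts": 1, "hash_restarts": 0, "manual_restarts": 0}}

if __name__ == "__main__":
    print(restarts_between(before, after))
```

If a crash consistently follows a snapshot pair for which this returns a non-empty dict, that points at the restart path rather than at ordinary request handling.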

I tried your reproduction method (deleting the cache) on the symfony_demo app with PHP-8.1 HEAD. I didn't see any crashes or assertions. The failures probably occur only with some specific PHP code.

It would be great if you could check the PHP-8.1 branch HEAD as well.

javer commented 2 years ago

@dstogov Thank you, it seems your assumption is right: the crash (iteration 15 in the table) occurs right after an opcache restart due to OOM. I've also noticed a very high and constantly increasing memory_usage.wasted_memory, and a very low interned_strings_usage.free_memory.

The table shows the opcache stats at the start of each iteration. The crash occurs on the 15th iteration, and iterations preceded by a cache clear are marked with (cc).

Stats \ Iteration | 1 (cc) | 2 | 3 (cc) | 4 | 13 (cc) | 14 | 15 (cc)
memory_usage: used_memory | 11625256 | 136408088 | 150915672 | 152271400 | 199839816 | 199158272 | 11625256
memory_usage: free_memory | 256810200 | 132027368 | 117519784 | 111336320 | 12890392 | 6706928 | 256810200
memory_usage: wasted_memory | 0 | 0 | 0 | 4827736 | 55705248 | 62570256 | 0
memory_usage: current_wasted_percentage | 0.0 | 0.0 | 0.0 | 1.798471 | 20.751821 | 23.309236 | 0.0
interned_strings_usage: used_memory | 448536 | 6290984 | 6290984 | 6290984 | 6290984 | 6290984 | 448536
interned_strings_usage: free_memory | 5842456 | 8 | 8 | 8 | 8 | 8 | 5842456
interned_strings_usage: number_of_strings | 9406 | 65023 | 65023 | 65023 | 65023 | 65023 | 9406
opcache_statistics: num_cached_scripts | 1 | 12247 | 12487 | 12487 | 12495 | 12495 | 1
opcache_statistics: num_cached_keys | 2 | 17384 | 17688 | 17688 | 17696 | 17696 | 2
opcache_statistics: hits | 0 | 1 | 3602 | 10103 | 49330 | 55831 | 0
opcache_statistics: misses | 1 | 12508 | 12999 | 24985 | 84986 | 96972 | 1
opcache_statistics: opcache_hit_rate | 0.0 | 0.007994 | 21.697488 | 28.793319 | 36.726823 | 36.537895 | 0.0
opcache_statistics: last_restart_time | 0 | 0 | 0 | 0 | 0 | 0 | 1664385191
opcache_statistics: oom_restarts | 0 | 0 | 0 | 0 | 0 | 0 | 1

Full dumps of opcache_get_status() before each iteration can be found here: https://gist.github.com/javer/f2a0cc84dc548785ff9a0f2949d461b1
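As a sanity check on the memory_usage rows, the wasted percentage can be recomputed from the byte counters. The sketch below (Python) assumes the percentage is wasted memory divided by the sum of used, free and wasted memory. That assumption is consistent with the table: in every iteration the three counters add up to 268435456 bytes, i.e. the 256 MB configured via opcache.memory_consumption. The function name is hypothetical.

```python
# Byte counts copied from the memory_usage rows, keyed by iteration number.
SNAPSHOTS = {
    4: {"used_memory": 152271400, "free_memory": 111336320, "wasted_memory": 4827736},
    13: {"used_memory": 199839816, "free_memory": 12890392, "wasted_memory": 55705248},
    14: {"used_memory": 199158272, "free_memory": 6706928, "wasted_memory": 62570256},
}


def wasted_percentage(usage):
    """Wasted share of the whole shared-memory segment, in percent.

    Assumption: the segment size is used + free + wasted.
    """
    total = usage["used_memory"] + usage["free_memory"] + usage["wasted_memory"]
    return 100.0 * usage["wasted_memory"] / total


if __name__ == "__main__":
    for iteration, usage in SNAPSHOTS.items():
        print(iteration, f"{wasted_percentage(usage):.6f}")
```

The results agree with the reported current_wasted_percentage values up to the last printed digit. They also show that wasted memory was already above 20% two iterations before the OOM restart.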

I've also checked PHP-8.1 HEAD, and the crash still occurs there.

mirzazeyrek commented 2 years ago

Is it possible to share a code example for those who cannot reproduce the issue?

javer commented 2 years ago

I cannot share the whole project because it's a closed-source commercial project, but it seems I have finally created a small reproducer: https://github.com/javer/php-issue-7817

Please note that it's very flaky and is not guaranteed to crash, because it's a small, artificial subset of the real project. On my machine (ARM64) it crashes in about 5 minutes, after ~2030 requests, when following the instructions from the readme. Unfortunately it doesn't crash when run natively on the same machine, and it also doesn't crash in Docker on AMD64. If it doesn't crash on your machine, try at another time of day: it crashes for me in the morning and in the evening, but not in the middle of the day, and I'm not joking.

Please do not run anything related to composer. I intentionally committed the vendor folder, because almost any change in the source code makes the crash disappear. So just follow the readme and you should get the crash.

dstogov commented 2 years ago

@javer I wasn't able to reproduce the crash on AMD64 (Docker and native). Attempting to reproduce this on an ARM emulator would take ages :(

dstogov commented 2 years ago

@javer does your app crash on both ARM and Intel, or only on ARM?

javer commented 2 years ago

@dstogov It crashes on Intel as well, on production servers. It has crashed several times right after a deploy, but that might also be connected with cache warmup, because not all of the cache can be warmed up during the deploy process. ARM is my development machine (a MacBook with an M1 Pro). Sometimes it crashes right after starting phpunit with an empty cache, and neither a cache clear nor composer install resolves this; I just have to wait a while. It looks like the crash is somehow connected with the memory addresses the OS gives to the process; maybe there is an overflow during memory address calculation in edge cases.

With the whole app, PHP crashes very often on my development machine, but unfortunately I cannot share the source code. I can give you any information from gdb, or patch anything in the PHP source and try again; just tell me what to do :) I understand that it's not very productive, but that's what we have.

I've also noticed that changing opcache.memory_consumption slightly up or down, or even adding or removing any class, can make the crash disappear. So it looks like an edge case where some data should fit into the available memory but actually goes out of bounds, or something like that.

dstogov commented 2 years ago

@javer I think it makes sense to try calling opcache_reset() after a few requests instead of waiting for a memory overflow. We recently found and fixed a problem this way (see 3a46f9fd1d03438a41d80251310184182e3b9b27).
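A request driver for this experiment could look roughly like the Python sketch below. Everything about the endpoints is an assumption: `app_url` stands for any page of the application, and `reset_url` for a hypothetical script that does nothing but call opcache_reset(). The `fetch` parameter is injectable so the loop can be exercised without a server.

```python
import urllib.request


def http_get(url, timeout=10.0):
    """Fetch a URL and return the HTTP status code (raises on failure)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def run_with_periodic_reset(app_url, reset_url, total_requests, reset_every,
                            fetch=http_get):
    """Request app_url repeatedly, hitting reset_url after every
    `reset_every` application requests.

    Returns the ordered list of URLs requested. Any exception raised by
    `fetch` (e.g. a connection reset when a worker crashes) propagates, so
    the run stops at the first failing request.
    """
    requested = []
    for n in range(1, total_requests + 1):
        fetch(app_url)
        requested.append(app_url)
        if n % reset_every == 0:
            # Hypothetical endpoint whose only job is to call opcache_reset().
            fetch(reset_url)
            requested.append(reset_url)
    return requested
```

For example, `run_with_periodic_reset("http://127.0.0.1:8080/generate_api_doc", "http://127.0.0.1:8080/reset_opcache.php", 200, 5)` would reset the cache after every fifth request; the second URL is made up and would have to be provided by the application under test.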

Unfortunately, I can't work out the reason for this crash yet. It might be caused by some uninitialized data or dangling pointers that survive an opcache reset. Locally I tried overwriting the freed SHM with different values, but without success (no failures).

Maybe compiling PHP with -fsanitize=undefined,address,alignment could reveal some problems.

Gwemox commented 2 years ago

After enabling JIT (PHP 8.1.11), if I do an opcache_reset(), I get a SIGSEGV or SIGABRT error on every request. The same problem occurs after a few hours of execution without opcache_reset.

I am running a Symfony project with Api-Platform. Disabling JIT solves the problem.

PHP-FPM logs:

[06-Oct-2022 13:41:21] NOTICE: PHP message: PHP Warning:  Can't preload already declared class ReflectionEnum in /var/www/app/vendor/laminas/laminas-code/polyfill/ReflectionEnumPolyfill.php on line 8
[06-Oct-2022 13:41:21] NOTICE: PHP message: PHP Warning:  Can't preload unlinked class Vich\UploaderBundle\Form\Type\VichFileType: Unknown parent Symfony\Component\Form\AbstractType in /var/www/app/vendor/vich/uploader-bundle/src/Form/Type/VichFileType.php on line 27
[06-Oct-2022 13:41:21] NOTICE: PHP message: PHP Warning:  Can't preload unlinked class Vich\UploaderBundle\Form\Type\VichImageType: Unknown parent Vich\UploaderBundle\Form\Type\VichFileType in /var/www/app/vendor/vich/uploader-bundle/src/Form/Type/VichImageType.php on line 21
[06-Oct-2022 13:41:21] NOTICE: PHP message: PHP Warning:  Can't preload unlinked class Vich\UploaderBundle\Form\Type\VichFileType: Unknown parent Symfony\Component\Form\AbstractType in /var/www/app/vendor/vich/uploader-bundle/src/Form/Type/VichFileType.php on line 27
[06-Oct-2022 13:41:22] NOTICE: fpm is running, pid 1
[06-Oct-2022 13:41:22] NOTICE: ready to handle connections
10.0.26.8 -  06/Oct/2022:13:44:41 +0000 "GET /index.php" 200
10.0.26.8 -  06/Oct/2022:13:44:50 +0000 "GET /index.php" 200
[Thu Oct  6 13:44:50 2022]  Script:  '-'
/usr/src/php/Zend/zend_string.h(150) :  Freeing 0x00007fbb3d261660 (56 bytes), script=-
=== Total 1 memory leaks detected ===
10.0.26.8 -  06/Oct/2022:13:44:53 +0000 "GET /debug_cache_clear.php" 200
Assertion failed: 0 (ext/opcache/jit/zend_jit_trace.c: zend_jit_find_trace: 259)
[06-Oct-2022 13:44:59] WARNING: [pool www] child 9 exited on signal 6 (SIGABRT - core dumped) after 216.739847 seconds from start
[06-Oct-2022 13:44:59] NOTICE: [pool www] child 22 started
Assertion failed: 0 (ext/opcache/jit/zend_jit_trace.c: zend_jit_find_trace: 259)
[06-Oct-2022 13:45:52] WARNING: [pool www] child 8 exited on signal 6 (SIGABRT - core dumped) after 269.261886 seconds from start
[06-Oct-2022 13:45:52] NOTICE: [pool www] child 37 started
[06-Oct-2022 13:46:21] WARNING: [pool www] child 22 exited on signal 11 (SIGSEGV - core dumped) after 82.127272 seconds from start
[06-Oct-2022 13:46:21] NOTICE: [pool www] child 52 started

SIGSEGV:

/var/www/app # gdb /usr/local/sbin/php-fpm /var/crash/core-php-fpm.22
GNU gdb (GDB) 11.2
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-alpine-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/sbin/php-fpm...

warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
[New LWP 22]
Core was generated by `php-fpm: po'.
--Type <RET> for more, q to quit, c to continue without paging--
Program terminated with signal SIGSEGV, Segmentation fault.
#0  zend_jit_var_may_alias (op_array=0x415e7d60, ssa=0x437cb7b0, var=0) at ext/opcache/jit/zend_jit_trace.c:396
396     ext/opcache/jit/zend_jit_trace.c: No such file or directory.
(gdb) 

SIGABRT:

/var/www/app # gdb /usr/local/sbin/php-fpm /var/crash/core-php-fpm.9
GNU gdb (GDB) 11.2
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-alpine-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/sbin/php-fpm...

warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
[New LWP 9]
Core was generated by `php-fpm: po'.
--Type <RET> for more, q to quit, c to continue without paging--
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fbb3de6a3fa in setjmp () from /lib/ld-musl-x86_64.so.1
(gdb) 

nickdnk commented 2 years ago

I have run into some very similar issues here, on PHP 8.1.9. I did not have any of these problems on 8.0.x. This is on ARM on Amazon Linux.

I'm getting random errors like these from /var/log/php-fpm.log

[06-Oct-2022 11:32:27] WARNING: [pool www] child 17180 exited on signal 11 (SIGSEGV) after 83771.246014 seconds from start
[06-Oct-2022 11:32:27] NOTICE: [pool www] child 1075 started
[06-Oct-2022 11:32:42] WARNING: [pool www] child 7144 exited on signal 11 (SIGSEGV) after 199312.802667 seconds from start
[06-Oct-2022 11:32:42] NOTICE: [pool www] child 1078 started
[06-Oct-2022 11:35:14] WARNING: [pool www] child 1075 exited on signal 11 (SIGSEGV) after 166.362124 seconds from start
[06-Oct-2022 11:35:14] NOTICE: [pool www] child 1110 started
[06-Oct-2022 11:36:12] WARNING: [pool www] child 7414 exited on signal 11 (SIGSEGV) after 198365.723652 seconds from start
[06-Oct-2022 11:36:12] NOTICE: [pool www] child 1118 started
[06-Oct-2022 13:39:56] WARNING: [pool www] child 1118 exited on signal 11 (SIGSEGV) after 7424.239427 seconds from start
[06-Oct-2022 13:39:56] NOTICE: [pool www] child 2516 started
[06-Oct-2022 13:40:13] WARNING: [pool www] child 1110 exited on signal 11 (SIGSEGV) after 7499.643656 seconds from start
[06-Oct-2022 13:40:13] NOTICE: [pool www] child 2576 started
[06-Oct-2022 13:41:24] WARNING: [pool www] child 1078 exited on signal 11 (SIGSEGV) after 7722.033125 seconds from start
[06-Oct-2022 13:41:24] NOTICE: [pool www] child 2586 started
[06-Oct-2022 13:47:01] WARNING: [pool www] child 7145 exited on signal 11 (SIGSEGV) after 207371.678102 seconds from start
[06-Oct-2022 13:47:01] NOTICE: [pool www] child 2727 started
[06-Oct-2022 13:47:28] WARNING: [pool www] child 7148 exited on signal 11 (SIGSEGV) after 207399.170871 seconds from start

Paired with errors like this from /var/log/nginx/error.log:

2022/10/06 11:32:27 [error] 7179#7179: *57166 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: ip.ip.ip.ip, server: domain.tld, request: "GET /api/only-this-path-affected HTTP/1.1", upstream: "fastcgi://unix:/run/php-fpm/www.sock:", host: "domain.tld", referrer: "https://domain.tld"

(sanitized for privacy)
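To see which request paths are hit, the nginx error log can be grouped by the `request:` field. The Python sketch below is tied to the message format quoted above (`recv() failed (104: Connection reset by peer)`); the function name is hypothetical, and the sample line is the sanitized one from this comment.

```python
import re
from collections import Counter

# Matches nginx error-log entries where the FastCGI upstream dropped the
# connection, and captures the request method and path.
UPSTREAM_RESET = re.compile(
    r"recv\(\) failed \(104: Connection reset by peer\)"
    r'.*?request: "(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"'
)


def affected_paths(lines):
    """Count upstream connection resets per request path (query string dropped)."""
    counts = Counter()
    for line in lines:
        m = UPSTREAM_RESET.search(line)
        if m:
            counts[m["path"].split("?", 1)[0]] += 1
    return counts


sample_line = (
    '2022/10/06 11:32:27 [error] 7179#7179: *57166 recv() failed '
    '(104: Connection reset by peer) while reading response header from '
    'upstream, client: ip.ip.ip.ip, server: domain.tld, request: '
    '"GET /api/only-this-path-affected HTTP/1.1", upstream: '
    '"fastcgi://unix:/run/php-fpm/www.sock:", host: "domain.tld", '
    'referrer: "https://domain.tld"'
)

if __name__ == "__main__":
    print(affected_paths([sample_line]))
```

Running it over the full error log shows whether the resets really cluster on a single path or are spread across the API.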

The weird thing is that for this case, it's only this particular path that's affected - all other paths in the API seem to work fine, and I cannot reliably reproduce this issue at all. Sometimes it seems that all paths are affected, however. It's totally random and probably related to opcache. I run it with these parameters dumped from phpinfo():

opcache.blacklist_filename | no value | no value
opcache.consistency_checks | 0 | 0
opcache.dups_fix | Off | Off
opcache.enable | On | On
opcache.enable_cli | Off | Off
opcache.enable_file_override | Off | Off
opcache.error_log | no value | no value
opcache.file_cache | no value | no value
opcache.file_cache_consistency_checks | On | On
opcache.file_cache_only | Off | Off
opcache.file_update_protection | 2 | 2
opcache.force_restart_timeout | 180 | 180
opcache.huge_code_pages | Off | Off
opcache.interned_strings_buffer | 8 | 8
opcache.jit | tracing | tracing
opcache.jit_bisect_limit | 0 | 0
opcache.jit_blacklist_root_trace | 16 | 16
opcache.jit_blacklist_side_trace | 8 | 8
opcache.jit_buffer_size | 128M | 128M
opcache.jit_debug | 0 | 0
opcache.jit_hot_func | 127 | 127
opcache.jit_hot_loop | 64 | 64
opcache.jit_hot_return | 8 | 8
opcache.jit_hot_side_exit | 8 | 8
opcache.jit_max_exit_counters | 8192 | 8192
opcache.jit_max_loop_unrolls | 8 | 8
opcache.jit_max_polymorphic_calls | 2 | 2
opcache.jit_max_recursive_calls | 2 | 2
opcache.jit_max_recursive_returns | 2 | 2
opcache.jit_max_root_traces | 1024 | 1024
opcache.jit_max_side_traces | 128 | 128
opcache.jit_prof_threshold | 0.005 | 0.005
opcache.lockfile_path | /tmp | /tmp
opcache.log_verbosity_level | 1 | 1
opcache.max_accelerated_files | 10000 | 10000
opcache.max_file_size | 0 | 0
opcache.max_wasted_percentage | 5 | 5
opcache.memory_consumption | 128 | 128
opcache.opt_debug_level | 0 | 0
opcache.optimization_level | 0x7FFEBFFF | 0x7FFEBFFF
opcache.preferred_memory_model | no value | no value
opcache.preload | no value | no value
opcache.preload_user | no value | no value
opcache.protect_memory | Off | Off
opcache.record_warnings | Off | Off
opcache.restrict_api | no value | no value
opcache.revalidate_freq | 2 | 2
opcache.revalidate_path | Off | Off
opcache.save_comments | On | On
opcache.use_cwd | On | On
opcache.validate_permission | Off | Off
opcache.validate_root | Off | Off
opcache.validate_timestamps | Off | Off

Output from php-fpm -tt looks like this:

[06-Oct-2022 14:23:53] NOTICE: [global]
[06-Oct-2022 14:23:53] NOTICE:  pid = undefined
[06-Oct-2022 14:23:53] NOTICE:  error_log = //var/log/php-fpm.log
[06-Oct-2022 14:23:53] NOTICE:  syslog.ident = php-fpm
[06-Oct-2022 14:23:53] NOTICE:  syslog.facility = 24
[06-Oct-2022 14:23:53] NOTICE:  log_buffering = yes
[06-Oct-2022 14:23:53] NOTICE:  log_level = unknown value
[06-Oct-2022 14:23:53] NOTICE:  log_limit = 1024
[06-Oct-2022 14:23:53] NOTICE:  emergency_restart_interval = 60s
[06-Oct-2022 14:23:53] NOTICE:  emergency_restart_threshold = 5
[06-Oct-2022 14:23:53] NOTICE:  process_control_timeout = 0s
[06-Oct-2022 14:23:53] NOTICE:  process.max = 0
[06-Oct-2022 14:23:53] NOTICE:  process.priority = undefined
[06-Oct-2022 14:23:53] NOTICE:  daemonize = yes
[06-Oct-2022 14:23:53] NOTICE:  rlimit_files = 0
[06-Oct-2022 14:23:53] NOTICE:  rlimit_core = 0
[06-Oct-2022 14:23:53] NOTICE:  events.mechanism = epoll
[06-Oct-2022 14:23:53] NOTICE:  
[06-Oct-2022 14:23:53] NOTICE: [www]
[06-Oct-2022 14:23:53] NOTICE:  prefix = undefined
[06-Oct-2022 14:23:53] NOTICE:  user = webapp
[06-Oct-2022 14:23:53] NOTICE:  group = webapp
[06-Oct-2022 14:23:53] NOTICE:  listen = /run/php-fpm/www.sock
[06-Oct-2022 14:23:53] NOTICE:  listen.backlog = 511
[06-Oct-2022 14:23:53] NOTICE:  listen.acl_users = apache,nginx
[06-Oct-2022 14:23:53] NOTICE:  listen.acl_groups = undefined
[06-Oct-2022 14:23:53] NOTICE:  listen.owner = undefined
[06-Oct-2022 14:23:53] NOTICE:  listen.group = undefined
[06-Oct-2022 14:23:53] NOTICE:  listen.mode = undefined
[06-Oct-2022 14:23:53] NOTICE:  listen.allowed_clients = 127.0.0.1
[06-Oct-2022 14:23:53] NOTICE:  process.priority = undefined
[06-Oct-2022 14:23:53] NOTICE:  process.dumpable = no
[06-Oct-2022 14:23:53] NOTICE:  pm = dynamic
[06-Oct-2022 14:23:53] NOTICE:  pm.max_children = 50
[06-Oct-2022 14:23:53] NOTICE:  pm.start_servers = 5
[06-Oct-2022 14:23:53] NOTICE:  pm.min_spare_servers = 5
[06-Oct-2022 14:23:53] NOTICE:  pm.max_spare_servers = 35
[06-Oct-2022 14:23:53] NOTICE:  pm.max_spawn_rate = 32
[06-Oct-2022 14:23:53] NOTICE:  pm.process_idle_timeout = 10
[06-Oct-2022 14:23:53] NOTICE:  pm.max_requests = 0
[06-Oct-2022 14:23:53] NOTICE:  pm.status_path = undefined
[06-Oct-2022 14:23:53] NOTICE:  pm.status_listen = undefined
[06-Oct-2022 14:23:53] NOTICE:  ping.path = undefined
[06-Oct-2022 14:23:53] NOTICE:  ping.response = undefined
[06-Oct-2022 14:23:53] NOTICE:  access.log = undefined
[06-Oct-2022 14:23:53] NOTICE:  access.format = undefined
[06-Oct-2022 14:23:53] NOTICE:  slowlog = /var/log/php-fpm/www-slow.log
[06-Oct-2022 14:23:53] NOTICE:  request_slowlog_timeout = 0s
[06-Oct-2022 14:23:53] NOTICE:  request_slowlog_trace_depth = 20
[06-Oct-2022 14:23:53] NOTICE:  request_terminate_timeout = 0s
[06-Oct-2022 14:23:53] NOTICE:  request_terminate_timeout_track_finished = no
[06-Oct-2022 14:23:53] NOTICE:  rlimit_files = 29278
[06-Oct-2022 14:23:53] NOTICE:  rlimit_core = 0
[06-Oct-2022 14:23:53] NOTICE:  chroot = undefined
[06-Oct-2022 14:23:53] NOTICE:  chdir = undefined
[06-Oct-2022 14:23:53] NOTICE:  catch_workers_output = no
[06-Oct-2022 14:23:53] NOTICE:  decorate_workers_output = yes
[06-Oct-2022 14:23:53] NOTICE:  clear_env = no
[06-Oct-2022 14:23:53] NOTICE:  security.limit_extensions = .php .phar
[06-Oct-2022 14:23:53] NOTICE:  php_admin_value[log_errors] = 1
[06-Oct-2022 14:23:53] NOTICE:  php_admin_value[error_log] = /var/log/php-fpm/www-error.log
[06-Oct-2022 14:23:53] NOTICE:  php_admin_value[memory_limit] = 128M
[06-Oct-2022 14:23:53] NOTICE:  
[06-Oct-2022 14:23:53] NOTICE: configuration file //etc/php-fpm.conf test is successful

I have the APCu extension installed, version 5.1.22. I don't know how to provide more debugging info than this, unfortunately. I will try disabling JIT to see if that helps.

javer commented 2 years ago

@dstogov I've emailed you access to an AWS EC2 instance (ARM CPU, Debian 11) where the crash reproduces. See the email for details. I hope it helps.

dstogov commented 2 years ago

Thanks @javer

This helped to find and fix a JIT-related problem (see c5364b851a9e23af1d3be49651993540424c5db2). The fixed bug is ARM-specific, so most probably we have another, hard-to-reproduce problem.

Gwemox commented 2 years ago

I can reproduce the SIGSEGV problem after an opcache_reset() in a Docker container on AMD64. However, this is non-public code. I will try to make a public version.

Gwemox commented 2 years ago

@dstogov I can reproduce this with my example : https://github.com/Gwemox/php_issue_7817 on AMD64

web_1    | [12-Oct-2022 17:42:31] NOTICE: PHP message: PHP Warning:  Can't preload already declared class ReflectionEnum in /var/www/app/vendor/laminas/laminas-code/polyfill/ReflectionEnumPolyfill.php on line 8
web_1    | [12-Oct-2022 17:42:32] NOTICE: fpm is running, pid 1
web_1    | [12-Oct-2022 17:42:32] NOTICE: ready to handle connections
web_1    | 127.0.0.1 -  12/Oct/2022:17:42:32 +0000 "GET /index.php" 404
web_1    | 127.0.0.1 -  12/Oct/2022:17:42:35 +0000 "GET /debug_cache_clear.php" 200
web_1    | Assertion failed: ((execute_data)->opline) >= ((execute_data)->func)->op_array.opcodes && ((execute_data)->opline) < ((execute_data)->func)->op_array.opcodes + ((execute_data)->func)->op_array.last (./jit/zend_jit_trace.c: zend_jit_trace_exit: 8067)
web_1    | [12-Oct-2022 17:42:36] WARNING: [pool www] child 38 exited on signal 6 (SIGABRT - core dumped) after 4.697921 seconds from start
web_1    | [12-Oct-2022 17:42:36] NOTICE: [pool www] child 40 started

mirzazeyrek commented 2 years ago

https://github.com/php/php-src/issues/9746

iluuu1994 commented 2 years ago

Doesn't seem related. The CI failure is caused by 625f1649639c2b9a9d76e4d42f88c264ddb8447d which is only merged into PHP 8.2+.

dstogov commented 2 years ago

@dstogov I can reproduce this with my example : https://github.com/Gwemox/php_issue_7817 on AMD64

Thanks @Gwemox and @javer. The problem is fixed via 61e563ca4070de1569ecaa261c88721eba17f4c5. It was related to incorrect handling of dynamic preloaded functions during opcache restart. They have been handled differently since PHP-8.1, and the corresponding JIT reset code wasn't updated accordingly.

dominikhalvonik commented 2 years ago

Hi @dstogov will this issue be included in release 8.1.12? Thank you

dstogov commented 2 years ago

Hi @dstogov will this issue be included in release 8.1.12? Thank you

I don't think so. 8.1.12 was already branched from the main PHP-8.1 branch.

cmb69 commented 2 years ago

The fix should go into PHP 8.1.13.

PowerKiKi commented 2 years ago

Given the criticality of the issue, and the simplicity of the patch, can't this exceptionally be backported to 8.1.12 ?

nickdnk commented 2 years ago

Given the criticality of the issue, and the simplicity of the patch, can't this exceptionally be backported to 8.1.12 ?

AWS is notoriously slow to adopt new versions of PHP for Elastic Beanstalk, so getting this in as soon as possible would be good, as JIT simply cannot be deployed with these bugs.

cmb69 commented 2 years ago

Given the criticality of the issue, and the simplicity of the patch, can't this exceptionally be backported to 8.1.12 ?

That would need to be decided by the release managers; @ramsey, @adoy, @krakjoe, what do you think?

dstogov commented 2 years ago

Given the criticality of the issue, and the simplicity of the patch, can't this exceptionally be backported to 8.1.12 ?

That would need to be decided by the release managers; @ramsey, @adoy, @krakjoe, what do you think?

The fix is not dangerous. It changes behaviour only for tracing JIT + preloading + opcache_restart(). But I believe this is not the last JIT-related bug.

cmb69 commented 2 years ago

The fix is not dangerous.

Okay, but shipping it with 8.1.12 would still require the fix to be ported to the PHP-8.1.12 branch (which is usually done by RMs).

adoy commented 2 years ago

That would need to be decided by the release managers; @ramsey, @adoy, @krakjoe, what do you think?

@patrickallaert

PowerKiKi commented 2 years ago

I submitted PR #9764 so we can at the very least see if CI passes, and hopefully merge into PHP 8.1.12.

audaki commented 2 years ago

We got really burned by JIT failures on a production ecommerce shop and lost income. It seems like the JIT should be marked as not production-ready (no offense intended!) until all of this is ironed out. It took us some time to find out that the random SIGSEGVs in PHP-FPM were caused by the JIT. Since we disabled it, the shop has been 100% stable again.

AFAICT the fix was not merged into 8.1.12, right?

nickdnk commented 2 years ago

I just had 8.1.9 swap two parameters of a constructor function. If those had been compatible parameters, it likely wouldn't have errored, which could cause all kinds of crazy bugs. I disabled JIT and the problem went away.

gregherrell commented 1 year ago

This thread seems to have a few different issues on it. After puzzling through it as a contributor it appears my issue (from August 2022) and a few others are inheritance_cache related which is not what this thread/issue is addressing.

I looked and I don't see an issue specifically related to that. Does one need to be created? I realize I do not have a reproducible project, but I would hate to see it not get addressed going forward, as I have abandoned using opcache.

farazive commented 1 year ago

Update: Sorry. Found APM could be the cause instead.


I'm facing this issue after installing the Elastic APM Agent, similar to this thread. I'm using the image php:8.1.12-fpm-alpine3.16. I get the following error and the container just exits.

[21-Nov-2022 01:29:40 UTC] PHP Warning:  Can't preload already declared class ReflectionEnum in /var/www/html/cobra/app/vendor/laminas/laminas-code/polyfill/ReflectionEnumPolyfill.php on line 8

<br />

<b>Warning</b>:  Can't preload already declared class ReflectionEnum in <b>/var/www/html/cobra/app/vendor/laminas/laminas-code/polyfill/ReflectionEnumPolyfill.php</b> on line <b>8</b><br />

diabelb commented 1 year ago

Problem still exists in PHP 8.2.1. After setting

opcache.jit = 1205

the issue is temporarily solved.

cappadaan commented 1 year ago

We are also experiencing this issue again after upgrading to 8.2.1

dstogov commented 1 year ago

The problem can't be analysed and fixed without a reproduction case.

TomZhuPlanetart commented 1 year ago

If you are getting SIGSEGV like me, please check whether SELinux is enabled on your server. I resolved it by running this command:

setsebool httpd_execmem true

[30-Jan-2023 10:32:36] WARNING: [pool www] child 907 exited on signal 11 (SIGSEGV) after 305.509962 seconds from start
[30-Jan-2023 10:32:36] NOTICE: [pool www] child 3487 started
[30-Jan-2023 10:32:37] WARNING: [pool www] child 911 exited on signal 11 (SIGSEGV) after 306.420964 seconds from start
[30-Jan-2023 10:32:37] NOTICE: [pool www] child 3494 started
[30-Jan-2023 10:32:37] WARNING: [pool www] child 914 exited on signal 11 (SIGSEGV) after 306.738487 seconds from start
[30-Jan-2023 10:32:37] NOTICE: [pool www] child 3495 started
[30-Jan-2023 10:32:45] WARNING: [pool www] child 917 exited on signal 11 (SIGSEGV) after 314.666612 seconds from start
[30-Jan-2023 10:32:45] NOTICE: [pool www] child 3546 started
[30-Jan-2023 10:32:46] WARNING: [pool www] child 921 exited on signal 11 (SIGSEGV) after 314.944248 seconds from start
[30-Jan-2023 10:32:46] NOTICE: [pool www] child 3553 started
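
To see how often workers are dying, the `exited on signal` lines above can be tallied with a short script. This is only a minimal Python sketch: the regular expression is inferred from the php-fpm log lines quoted in this thread (it also tolerates the `- core dumped` suffix seen in earlier comments), and the function name `count_worker_crashes` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the php-fpm lines quoted in this thread, e.g.
# "WARNING: [pool www] child 907 exited on signal 11 (SIGSEGV) after 305.509962 seconds from start"
CRASH_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] child (?P<pid>\d+) exited on signal "
    r"(?P<signo>\d+) \((?P<signame>SIG[A-Z0-9]+)[^)]*\) "
    r"after (?P<uptime>[\d.]+) seconds from start"
)


def count_worker_crashes(lines):
    """Count crashed workers per (pool, signal name) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = CRASH_RE.search(line)
        if match:  # lines such as "child 3487 started" simply do not match
            counts[(match["pool"], match["signame"])] += 1
    return counts
```

The function accepts any iterable of lines, so an open log file can be passed directly. Comparing the counts before and after a configuration change (for example an SELinux boolean or an `opcache.jit` setting) gives a rough indication of whether the change helped.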

cappadaan commented 1 year ago

We resolved this by disabling cache in Twig.

Segfaults appear when we follow these steps:

  1. delete twig cache files
  2. call opcache_reset() function

vias79 commented 1 year ago

I think I ran into this bug, Ubuntu 22.04 with php-fpm 2:8.1+92ubuntu1 running LibreNMS.

Fix was to disable php-fpm JIT:

nano /etc/php/8.1/fpm/conf.d/10-opcache.ini
opcache.enable=0
opcache.enable_cli=0
service php8.1-fpm restart

dominikhalvonik commented 1 year ago

@vias79 sorry but I do not think that turning off the functionality can be considered as "fix" :)

MaxKellermann commented 1 year ago

I do not think that turning off the functionality can be considered as "fix"

Fully agree. But it's the only practical "solution".

I, too, had plenty of crashes with the JIT, and of course they disappeared after I switched off the JIT. We have now permanently disabled the JIT in our shared hosting infrastructure because we figured it's not production ready, not mature enough.

There are two big problems: first, JIT crashes are very hard to reproduce. Of course, it's natural to say "give me a reproduction case", but that ain't going to happen. I can easily trigger a JIT crash within a minute, but still I'm unable to give instructions for others to reproduce my crashes. My theory is that they occur due to data races between threads/processes in their shared memory. To get a grip on such bugs, you need to understand the JIT and the states that can occur when two or more processes/threads work on shared memory.

And that's the second problem: a JIT is always very complex, and in addition to that, PHP's JIT implementation is implemented in a very cryptic way that is probably only understood by its sole author; there are functions with thousands of code lines and countless goto statements (shudder!), and a hell lot of code duplication between the engine and the JIT; every time you fix an engine bug, somebody needs to carry over the fix to the JIT (and the JIT has two copies of everything: one x86_64 and one ARM64).

Anyway, my attempts to simplify the code, to be able to understand it, were thwarted by that very person; he vetoed my efforts.

The reasons he gave for that veto clearly indicated that the PHP 8 JIT will never be fixed; he feared that all changes to the current JIT would make integrating the new JIT that he currently develops harder. This reasoning sounds extremely backwards and harmful to me. That leaves me no room for having faith that the new JIT will be any better.

But, why bother - I ran some WordPress benchmarks and found no measurable performance improvement with the JIT. The only effect was that every PHP process consumed many megabytes of memory, but no performance improvement. It's not the kind of application that would benefit from a JIT, and few PHP applications probably are. The JIT is nice for benchmarks, but not much else.

There are other ways to improve PHP's performance with typical applications, and I have submitted lots of them already (many are part of PHP 8.2, but many got rejected), and I have many other improvements that I have not submitted, and I have many more ideas that will probably be rejected as well. Why care - our PHP fork has all these improvements - but no JIT.

meinemitternacht commented 1 year ago

@MaxKellermann I get your frustration, as our company is also impacted by some of the JIT issues, but using the 1205 flags works for us. That said, I don't think @dstogov was wrong with his response to your code refactor request. He maintains the JIT engine, and it's his call as a core maintainer whether something like that gets included or not. When the next bug comes down the pipe, he would then have to re-interpret your understanding of his original code.

Surely he realizes that the current JIT implementation is not ideal, which is why he is working on a replacement. The JIT was AFAIK understood to be experimental at best, and a "use at your own risk" option. If it doesn't work in your environment, then that is quite unfortunate, but maybe it will work again in the future.

PHP has such a diverse implementation ecosystem and making everyone happy just isn't going to happen. I hope you find a solution to the problems you outlined, and don't let these experiences discourage you from contributing to the project in the future.

Disclaimer: I am just an outsider looking in, but I hate to see such negative discourse over a hard to fix issue.

PowerKiKi commented 1 year ago

The JIT was AFAIK understood to be experimental at best, and a "use at your own risk" option

Unfortunately I don't see anything suggesting that in the official docs. Moreover it is enabled by default (at least on Debian/Ubuntu), with the problematic opcache.jit => tracing flag. I wouldn't expect something enabled by default to be experimental. I am surprised that the JIT author himself says that not only is it buggy and won't be fixed for months (because he is busy with the new version), but also that it is "often almost useless for real-life apps".

If something is experimental, proven to have complex critical issues that cannot be resolved in the short term, and even if fixed would not help most people that much, then the least I'd expect is for it to be disabled by default.

meinemitternacht commented 1 year ago

Moreover it is enabled by default (at least on Debian/Ubuntu), with the problematic opcache.jit => tracing flag.

The default for php.ini has opcache.jit_buffer_size = "0" which disables JIT. Is it different for the distro config file?
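
Whether the JIT is actually active depends on several directives together, not on `opcache.jit` alone. The following Python sketch illustrates that combination. The function names `parse_size` and `jit_effectively_enabled` are hypothetical, and the fallback defaults (`opcache.enable=1`, `opcache.jit=tracing`, `opcache.jit_buffer_size=0`) are my reading of the documented PHP 8.1 defaults, so treat them as assumptions and verify them against your own build.

```python
import re


def parse_size(value):
    """Convert a PHP shorthand size such as '512M', '64K' or '0' to bytes."""
    text = str(value).strip().strip('"')
    match = re.fullmatch(r"(\d+)\s*([KMG]?)", text, re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized size: {value!r}")
    factor = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}[match.group(2).upper()]
    return int(match.group(1)) * factor


def jit_effectively_enabled(settings):
    """settings maps ini directive names to their string values.

    Missing directives fall back to the assumed PHP 8.1 defaults.
    """
    opcache_on = str(settings.get("opcache.enable", "1")).strip().lower() in {"1", "on", "true", "yes"}
    buffer_size = parse_size(settings.get("opcache.jit_buffer_size", "0"))
    mode = str(settings.get("opcache.jit", "tracing")).strip().lower()
    # The JIT needs opcache itself, a non-zero buffer, and a mode other than off.
    return opcache_on and buffer_size > 0 and mode not in {"disable", "off", "0"}
```

With an empty dictionary (all assumed defaults) the function returns `False`, matching the observation above that a zero buffer size disables the JIT even though `opcache.jit` itself defaults to tracing. Setting `opcache.enable=0`, as suggested earlier in the thread, also yields `False`, but it turns off the whole opcode cache rather than only the JIT.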

blacktek commented 1 year ago

Hello, crashing here with

PHP 8.1.14 (cli) (built: Jan 13 2023 10:43:50) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.14, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.14, Copyright (c), by Zend Technologies

and

auto_globals_jit = On
opcache.jit_buffer_size=512M
opcache.jit=1235

NOT crashing with opcache.jit=1205

Gwemox commented 1 year ago

@blacktek can you provide an example to reproduce?

blacktek commented 1 year ago

@blacktek can you provide an example to reproduce?

Unfortunately not, but I can confirm that it also works with opcache.jit=tracing (which should be 1254, if I'm not wrong). I'm now running it like that.
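
The four-digit `opcache.jit` values mentioned in this thread (1205, 1235, 1254) follow the CRTO scheme from the OPcache configuration docs: one digit each for CPU-specific optimization, register allocation, trigger, and optimization level. Below is a minimal Python sketch for decoding them. The helper name `decode_jit` is made up, and the descriptions are paraphrased from the docs rather than taken from php-src, so check them against your PHP version.

```python
# Hypothetical helper: decode a 4-digit opcache.jit value (CRTO scheme).
# Digit meanings are paraphrased from the OPcache configuration docs.
CPU = {0: "no CPU-specific optimization", 1: "use AVX if the CPU supports it"}
REG = {
    0: "no register allocation",
    1: "block-local register allocation",
    2: "global register allocation",
}
TRIGGER = {
    0: "compile all functions on script load",
    1: "compile functions on first execution",
    2: "profile on first request, compile hottest functions afterwards",
    3: "profile on the fly, compile hot functions",
    4: "currently unused",
    5: "tracing JIT: profile on the fly, compile traces for hot code segments",
}
OPT = {
    0: "no JIT",
    1: "minimal JIT (call standard VM handlers)",
    2: "inline VM handlers",
    3: "use type inference",
    4: "use call graph",
    5: "optimize whole script",
}
# String aliases documented for opcache.jit.
ALIASES = {"tracing": "1254", "on": "1254", "function": "1205"}


def decode_jit(value):
    """Return a dict describing each CRTO digit of an opcache.jit value."""
    text = str(value).strip().lower()
    digits = ALIASES.get(text, text)
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"not a 4-digit CRTO value or known alias: {value!r}")
    c, r, t, o = (int(d) for d in digits)
    try:
        return {"cpu": CPU[c], "register": REG[r], "trigger": TRIGGER[t], "optimization": OPT[o]}
    except KeyError as exc:
        raise ValueError(f"digit out of range in {value!r}") from exc
```

Decoding the three values from this thread shows that they differ only in the last two digits: 1205 compiles everything at script load, 1235 profiles on the fly and compiles hot functions, and 1254 (the `tracing` alias) compiles hot traces. That is consistent with the reports here that crashes depend on which trigger mode is used, though it does not by itself explain them.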