php / php-src

The PHP Interpreter
https://www.php.net

JIT segmentation fault in PHP 8.1 #7817

Open cappadaan opened 2 years ago

cappadaan commented 2 years ago

Description

PHP 8.1.0 and 8.1.1 produce segfaults randomly. Downgrading to 8.0 solves the issue.

--- core dump ---

BFD: Warning: coredump-php-fpm.30267 is truncated: expected core file size >= 5413076992, found: 35983360.
[New LWP 30267]
[New LWP 1887]
[New LWP 1886]
[New LWP 1888]
Cannot access memory at address 0x7f277dbb3128
Cannot access memory at address 0x7f277dbb3120
Failed to read a valid object file image from memory.
Core was generated by `php-fpm: pool xxxxxx '.

Program terminated with signal 11, Segmentation fault.

#0  0x000055bbf04c0f25 in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER () at /usr/src/debug/php-8.1.1/Zend/zend_vm_execute.h:10137
10137       ce = CACHED_PTR(opline->op2.num);

(gdb) bt
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x7ffdc940e3a8:
(gdb) bt
#0  0x000055bbf04c0f25 in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER () at /usr/src/debug/php-8.1.1/Zend/zend_vm_execute.h:10137
Cannot access memory at address 0x7ffdc940e3a8

(gdb) frame 0
#0  0x000055bbf04c0f25 in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER () at /usr/src/debug/php-8.1.1/Zend/zend_vm_execute.h:10137
10137       ce = CACHED_PTR(opline->op2.num);

(gdb) info frame
Stack level 0, frame at 0x7ffdc940e3b0:
 rip = 0x55bbf04c0f25 in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER (/usr/src/debug/php-8.1.1/Zend/zend_vm_execute.h:10137); saved rip Cannot access memory at address 0x7ffdc940e3a8

This is the only available info in the core dump.

PHP Version

PHP 8.1.0 + 8.1.1

Operating System

CentOS 7

kohlerdominik commented 2 years ago

Hi @dstogov

The issue in my environment (Docker container) is 100% reproducible. It occurs on my local Docker setup as well as in our GCP K8s environment. It might even appear outside of Docker containers.

But our environment is hard to set up, so the best I could offer is a remote VM with docker-compose and a guide on how to reproduce it (unfortunately, the issue only happens on certain endpoints). And I guess debugging JIT issues inside an Alpine Docker container is somewhere between "not the preferred way" and "impossible"...?

Huggyduggy commented 2 years ago

We're running 8.1.2 within AWS ECS on Fargate, so new hosts are provisioned for each deployment. In roughly 4 out of 5 deployments, we see multiple SIGSEGVs within the first 60 seconds, leading to a deploy-and-kill loop that repeats a few times until we happen to get a steady server. We're currently trying our luck with JIT 1205; I'll update this comment once we have some experience with it.

(For the record, we're also running on Laravel 8.x.)

meinemitternacht commented 2 years ago

@Huggyduggy Do you completely destroy the instances each time, or do you only deploy your codebase when you release fixes and such? I was wondering if you could provide some insight as to the behavior when you update PHP files on the filesystem without restarting the PHP-FPM instance.

Huggyduggy commented 2 years ago

@meinemitternacht During deployment, a bunch of new virtual AWS EC2 servers are started and provisioned with the latest container software provided by AWS. Then containers are pulled and started automatically. Legacy servers/containers are removed from the LB and terminated. There's no real way of changing the codebase on existing containers/servers, I'm afraid.

We develop on the same PHP Docker images that we use for production releases; during development I've not yet noticed any segfaults.

zejji commented 2 years ago

Having just wasted 30 hours of my life debugging segfaults which occurred immediately after new deployments of PHP containers (running a large CakePHP 4.2 application) on Docker Swarm, I can confirm that this is definitely an issue.

We are using PHP 8.1.3 via the Bitnami PHP-FPM Docker image.

The solution to the segfaults in our case was disabling JIT and setting the JIT buffer size to zero:

opcache.jit_buffer_size=0M
opcache.jit=disable
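
For anyone who wants to double-check that this kind of workaround actually took effect, a minimal PHP sketch like the one below dumps the effective JIT settings at runtime (this is an illustration, not part of the original report; the check.php name is arbitrary, and it assumes PHP 8.x with the opcache extension). Run it through the same PHP-FPM pool, since the CLI may read different ini files.

<?php
// check.php - rough sanity check that JIT is really off (illustrative sketch only)
if (!function_exists('opcache_get_status')) {
    echo "opcache extension not loaded\n";
    exit(1);
}

// Effective ini values as this SAPI sees them
var_dump(ini_get('opcache.jit'), ini_get('opcache.jit_buffer_size'));

// PHP 8 exposes a 'jit' section in the opcache status; dump whatever is there
$status = opcache_get_status(false);
if ($status === false) {
    echo "opcache is not enabled for this SAPI\n";
    exit(1);
}
var_dump($status['jit'] ?? 'no JIT info in status');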

By way of additional background, I was able to consistently reproduce the errors when JIT was enabled by using the Locust load testing utility to hit the server with up to 100 requests a second during the deployment period. After disabling the JIT we are able to achieve zero-downtime deployment, which was not possible when JIT was enabled.

The kinds of errors we were seeing when JIT was enabled were as follows. They all occurred within the first minute after deployment:

WARNING: [pool www] child 37 exited on signal 11 (SIGSEGV - core dumped) after 3.219271 seconds from start
WARNING: [pool www] child 46 exited on signal 11 (SIGSEGV - core dumped) after 3.461960 seconds from start
WARNING: [pool www] child 150 exited on signal 11 (SIGSEGV - core dumped) after 2.930645 seconds from start
dominikhalvonik commented 2 years ago

Guys, any update on this? As @zejji said, the issue is still present in 8.1.3 FPM. Is there any progress?

meinemitternacht commented 2 years ago

I had difficulty finding a minimal test case so that @dstogov could debug the problem. Perhaps others will have more luck?

kvas-damian commented 2 years ago

We had a similar problem after upgrading to PHP 8.1.3; before that we used PHP 8.0.3 without problems.

In our case, a Laravel 8 app based on PHP-FPM is hosted in Kubernetes. The K8s cluster runs on N2D instances with 3rd Gen AMD EPYC processors. The Dockerfile that started the problem for us is the following:

FROM composer:2.1.12 AS php-composer

FROM php:8.1.3-fpm-alpine3.15

USER root

RUN apk --no-cache add --virtual .build-deps \
  build-base \
  && apk --no-cache add libpng-dev libzip-dev libjpeg-turbo-dev freetype-dev \
  && docker-php-ext-configure gd \
    --with-freetype \
    --with-jpeg \
  && docker-php-ext-install -j$(nproc) bcmath gd zip mysqli pdo_mysql sockets \
  && docker-php-ext-enable opcache \
  && echo $'zend_extension=opcache\n\
[opcache]\n\
opcache.enable=1\n\
opcache.enable_cli=1\n\
opcache.validate_timestamps=0\n\
opcache.max_accelerated_files=10000\n\
opcache.memory_consumption=128\n\
opcache.max_wasted_percentage=10\n\
opcache.interned_strings_buffer=16\n\
opcache.fast_shutdown=1\n\
opcache.jit_buffer_size=100M' > /usr/local/etc/php/conf.d/docker-php-ext-opcache.ini \
  && apk del .build-deps

# copy composer from the first stage
COPY --from=php-composer /usr/bin/composer /usr/bin
RUN composer --ansi --version --no-interaction; php -v; php -m; php -r 'var_export(gd_info()); echo PHP_EOL; var_export(opcache_get_status());'

Out of ~20 identical pods, only 2 were segfaulting, for all requests. We found the following entries in our logs:

[4539894.358435] php-fpm[3056477]: segfault at 8 ip 0000000048d82054 sp 00007ffe99621df0 error 4 in zero (deleted)[48d19000+6400000]\r\n

WARNING: [pool www] child 792 exited on signal 11 (SIGSEGV) after 15.030120 seconds from start

Changing from tracing JIT to function JIT (opcache.jit=1205) solved the problem.

I hope it helps you find the root cause.

haad commented 2 years ago

I cannot fix this before I get a way to reproduce the crash.

We have a simple way of reproducing this issue on our internal application/Kubernetes setup. We can provide more details outside GitHub.

dstogov commented 2 years ago

@haad I sent a private email

dominikhalvonik commented 2 years ago

@dstogov If you tell me what info you need, I am more than happy to give you all the information I can provide.

dstogov commented 2 years ago

@dominikhalvonik Ideally, I need to reproduce the problem in my debug environment. I could start the analysis by installing your app (git clone, composer install, your instructions to reproduce), using a VM image, or via SSH access to a test environment.

kohlerdominik commented 2 years ago

Was this issue resolved? When will the fix be released?

Gwemox commented 2 years ago

I think I have the same issue after upgrading from 8.0.16 to 8.1 (8.1.3-4) with JIT enabled (opcache.jit=1255). (PHP-FPM, Alpine, Kubernetes)

It happens randomly; after a segfault, all requests fail. If I do an opcache_reset(), everything works again.

Symfony 5.4.3 and API Platform 2.6.8; I don't use PHP-DI.

@dstogov Why is this PR resolved?

chelsEg commented 2 years ago

@Gwemox I think because it's very hard to reproduce...

But I have the same issue after upgrading to 8.1.

Changing to opcache.jit=1205 solved the problem!

meinemitternacht commented 2 years ago

I don't think it should be closed just because it is hard to reproduce, if that is indeed the reason.

meinemitternacht commented 2 years ago

If this is going to be closed, I think the default parameter for opcache.jit should be changed to function so that we lessen the impact of this bug.

iluuu1994 commented 2 years ago

I think this might've just been closed by accident.

dominikhalvonik commented 2 years ago

Hi guys, I know this might be a silly question, but which is better from a performance point of view:

php:8.0.15-fpm + JIT in config opcache.jit=1255 OR php:8.1.4-fpm + JIT in config opcache.jit=1205

The reason I am asking is to find out whether it is better to remain on PHP 8.0 with the 1255 config, or whether we can get better performance by migrating to PHP 8.1 with 1205. Any ideas?

dstogov commented 2 years ago

@dominikhalvonik JIT performance depends on the application. Tracing JIT (1255) should be faster than function JIT (1205), but you should measure the performance of your app yourself. If JIT gives less than a 10% improvement (which is very probable), I would disable it altogether. New versions of PHP usually come with new features, fixes, and performance improvements unrelated to JIT, so for some apps 8.1 with JIT disabled might be better than 8.0 with tracing JIT.
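
As a rough illustration of "measure it yourself" (a toy sketch, not from this thread; the bench.php name and the synthetic loop are arbitrary), something like the following can be run under each setting and compared. JIT mostly helps CPU-bound code, so a real application should be measured with its actual traffic, but a hot loop at least shows whether the JIT is kicking in at all:

<?php
// bench.php - toy CPU-bound loop for comparing JIT settings (illustrative sketch only)
function hot_loop(int $n): float {
    $sum = 0.0;
    for ($i = 1; $i <= $n; $i++) {
        $sum += sqrt($i) * sin($i / 1000);
    }
    return $sum;
}

$start = hrtime(true);
hot_loop(5_000_000);
printf("%.1f ms\n", (hrtime(true) - $start) / 1e6);

// Example invocations (the CLI needs opcache.enable_cli=1 and a non-zero buffer for JIT):
//   php -d opcache.enable_cli=1 -d opcache.jit_buffer_size=64M -d opcache.jit=1255 bench.php   # tracing JIT
//   php -d opcache.enable_cli=1 -d opcache.jit_buffer_size=64M -d opcache.jit=1205 bench.php   # function JIT
//   php -d opcache.enable_cli=1 -d opcache.jit=0 bench.php                                     # JIT off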

jirkace commented 2 years ago

We upgraded our cluster (36 servers for one app) over the last few days, and I don't see any performance change (request time or CPU usage) between PHP 8.0 with 1255 and PHP 8.1 with 1205... BUT when we tried to turn opcache off completely, CPU usage almost doubled, from 15 to 30 percent.

dstogov commented 2 years ago

@jirkace I didn't mean disabling opcache. Just JIT. opcache.jit=0

stevenbrookes commented 2 years ago

Just to add that I'm seeing the same behaviour on a full stack Symfony application.

PHP 8.1.4
opcache.jit=1255 FPM: SIGSEGV after 3 seconds
opcache.jit=1205 FPM: works
opcache.jit=1255 CLI: works
opcache.jit=1205 FPM: works

The application is large, so it is very hard to isolate. Happy to send any other information needed, though.

Gwemox commented 2 years ago

Has anyone seen the same issue in 8.1.5?

tanhaei commented 2 years ago

Has anyone seen the same issue in 8.1.5?

We have some problems with the 8.1.5 release. Setting opcache.jit=0 resolves the problem temporarily!

cappadaan commented 2 years ago

Has anyone seen the same issue in 8.1.5?

The issue is still there and got even worse in 8.1.5. In 8.1.4 you could use opcache.jit=1205 as a workaround, but in 8.1.5 that also gave segfaults; only opcache.jit=0 works for me now.

nepster-web commented 2 years ago

Possibly related: https://github.com/php/php-src/issues/8149

jonathantullett commented 2 years ago

Has anyone seen the same issue in 8.1.5?

Yes, we're seeing the same issue in our Symfony application. It's not immediate, though; it can often run for a few days before we start getting segfaults across the board.

Gwemox commented 2 years ago

Has anyone seen the same issue in 8.1.5?

Yes, we're seeing the same issue in our Symfony application. It's not immediate, though; it can often run for a few days before we start getting segfaults across the board.

I think the problem has been found in https://github.com/php/php-src/issues/8461 !

brunohsouza commented 2 years ago

I have been trying to solve this problem. I'm using PHP 8.1.5-fpm + Nginx + Symfony 6.

After applying the proposed solution of changing opcache.jit=tracing to opcache.jit=1205, it seems to be working.

What makes me wonder is that we have some environments using APP_ENV=dev (where the segfault is happening) and others using APP_ENV=staging (where it is not), even with the same versions of code, Docker, libraries, etc.

Does anyone know if there is any correlation between these segfaults and the APP_ENV variable, or any other env var?

meinemitternacht commented 2 years ago

I applied the changes from #8461 to our environment using 8.1.6 as a base, and we are still experiencing segfaults. However, there may be multiple issues at play, and those fixes may alleviate some of the problems in this issue.

@brunohsouza I am not sure that the environment variables are impacting the problems with JIT, but they may cause the application to trigger certain behaviors depending on the environment settings.

brunohsouza commented 2 years ago

I just found another issue that may be a clue for the problem with the APP_ENV variable: https://github.com/symfony/symfony/issues/45752.

My guess: Symfony already has pre-configured, defined environments like dev, test and prod, which are reflected in the directory structure inside /var/cache. When using JIT in tracing mode, it tries to optimize some "hot code" inside a dynamically created file under /var/cache/{env} or inside some container class. Then, once the JIT cannot find the file, it generates a segfault.

The issue above may be related, since the segfault happens when the cache folder is cleaned and doesn't happen when it is not.

Also, if I change APP_ENV, I don't see the segfault, because there's no folder with the new env name, so it will not try to open a class or file.

After changing the JIT mode to 1205 the segfaults stopped. I think that's because it compiles all functions on script load instead of profiling on the fly and compiling traces for hot code segments, as the documentation says.

trapiche-n commented 2 years ago

Hello! I have this "Segmentation fault" problem. I have PHP 8.1.6 with opcache.jit=1255. In my case it fails every time I make changes to the code of my app with an editor and save them. To make it work again I have to make another change (it can be a simple whitespace edit) to the code; with that, the application works again... until the next change! PS: I haven't tested with opcache.jit=1205 yet.

usefksa commented 2 years ago

Hello, we also have the same problem. It happens totally at random: if we run 10 servers, one of them will have the bug.

cappadaan commented 2 years ago

The issue seems fixed after updating to 8.1.7.

Our setting: opcache.jit=1255

nursoda commented 2 years ago

I still have this issue with 8.1.7 and opcache.jit=1255: after a server reboot, one or more of my Nextcloud instances constantly segfault/coredump. Setting opcache.jit from 1255 to 1205 reduces the impact, but I still see two coredumps upon server restart. After commenting out opcache.jit and opcache.jit_buffer_size (assuming that disables JIT), I have no coredumps upon reboot. If I get some on later reboots, I shall report here.

chelsEg commented 2 years ago

@cappadaan For my project, the issue is not fixed after updating to 8.1.7.

cappadaan commented 2 years ago

We have indeed experienced one segfault up till now, so it seems it's not 100% fixed.

oleg-st commented 2 years ago

One fix was made in 8.1.7 (#8461) and a second in 8.1.8 (#8591). A third related issue, #8642, has not yet been fixed.

nicrame commented 2 years ago

For me:

php81 -v
PHP 8.1.7 (cli) (built: Jun 7 2022 18:21:38) (NTS gcc x86_64)
Copyright (c) The PHP Group
Zend Engine v4.1.7, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.7, Copyright (c), by Zend Technologies

In 10-opcache.ini I have:

opcache.jit_buffer_size=0
opcache.jit=0

But it still crashes:

Jun 28 07:43:52 Love-NAS kernel: traps: php-fpm[5596] general protection fault ip:7ff464b6887c sp:7fffac4a98f0 error:0 in opcache.so[7ff464b4c000+e3000]

stissot commented 2 years ago

We had the same opcache segmentation fault this morning on a server running PHP-FPM 8.1.8 with the following configuration. It completely filled up the system memory and swap; a php-fpm restart was needed.

Copyright (c) The PHP Group
Zend Engine v4.1.8, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.8, Copyright (c), by Zend Technologies
opcache.enable=1
opcache.enable_cli=0
opcache.fast_shutdown=0
opcache.interned_strings_buffer=16
opcache.jit=1255
opcache.max_accelerated_files=50000
opcache.memory_consumption=2048
opcache.revalidate_freq=0
Jul 22 05:24:00 web1 kernel: [626019.962719] php-fpm8.1[921996]: segfault at 7f946d1004d0 ip 00007f9501112bc0 sp 00007ffe55c12b50 error 6 in opcache.so[7f95010f4000+b5000]
Jul 22 05:24:00 web1 kernel: [626019.962743] Code: 89 3c 90 83 43 1c 01 f6 46 1c 08 74 25 48 8b 46 08 f6 40 04 20 74 1b 49 8b 56 18 80 7a 18 00 74 11 8b 00 49 8b 95 e0 01 00 00 <48> 89 34 02 0f 1f 40 00 49 83 c>
Jul 22 05:24:00 web1 kernel: [626019.962888] php-fpm8.1[947286]: segfault at 7f94764384d0 ip 00007f9501112bc0 sp 00007ffe55c12b50 error 6 in opcache.so[7f95010f4000+b5000]
Jul 22 05:24:00 web1 kernel: [626019.962906] Code: 89 3c 90 83 43 1c 01 f6 46 1c 08 74 25 48 8b 46 08 f6 40 04 20 74 1b 49 8b 56 18 80 7a 18 00 74 11 8b 00 49 8b 95 e0 01 00 00 <48> 89 34 02 0f 1f 40 00 49 83 c>
everyx commented 2 years ago

Same problem here with the config below, PHP version 8.1.8:

opcache.memory_consumption = 192
opcache.interned_strings_buffer = 8
opcache.max_accelerated_files = 4000
opcache.revalidate_freq = 60
opcache.fast_shutdown = 1
opcache.enable_cli = 1
opcache.jit = 1235
opcache.jit_buffer_size = 64M
opcache.preload_user = www-data
WARNING: [pool www] child 289717 exited on signal 11 (SIGSEGV - core dumped) after 42913.627688 seconds from start
gregherrell commented 2 years ago

We have a CMS application that powers thousands of individual websites. Apart from the individual design files on each website, they all share the same centralized code base located in a folder on each web server. We run php-fpm. We upgraded our servers to PHP 8.1.9 over the last week. CentOS 7, Apache 2.4.54.

After each upgrade I delete all the old opcache files before restarting the server. Each server runs a single app pool using static mode and there are thousands of instances/websites of this application on each server.

What we are finding is that php-fpm begins to 503 on individual websites, not the entire server. Further, some of the pages on a website will still serve while others 503. The 503s correlate with the "segfault in opcache.so" errors in /var/log/messages. A restart of php-fpm clears the errors. I cannot find a pattern as to why.

Interestingly, I wondered if somehow the local Twig cache located on each website instance might be related. During one of the outages of a single website, I ran a script to delete all the Twig folders on the individual websites. This immediately resulted in 503 errors for all of the sites on the server.

Below are my settings. Note the buffer size is set to zero. Admittedly, I am confused about whether I should also set jit=disable to disable JIT entirely, or whether this is even a JIT issue.

opcache.jit => tracing => tracing
opcache.jit_bisect_limit => 0 => 0
opcache.jit_blacklist_root_trace => 16 => 16
opcache.jit_blacklist_side_trace => 8 => 8
opcache.jit_buffer_size => 0 => 0
opcache.jit_debug => 0 => 0
opcache.jit_hot_func => 127 => 127
opcache.jit_hot_loop => 64 => 64
opcache.jit_hot_return => 8 => 8
opcache.jit_hot_side_exit => 8 => 8
opcache.jit_max_exit_counters => 8192 => 8192
opcache.jit_max_loop_unrolls => 8 => 8
opcache.jit_max_polymorphic_calls => 2 => 2
opcache.jit_max_recursive_calls => 2 => 2
opcache.jit_max_recursive_returns => 2 => 2
opcache.jit_max_root_traces => 1024 => 1024
opcache.jit_max_side_traces => 128 => 128
opcache.jit_prof_threshold => 0.005 => 0.005

gregherrell commented 2 years ago

Here is a backtrace.

For me, changing the buffer to > 0 generates immediate random segfaults. I realize this is not reproducible, but perhaps it is something. I have several core dumps I can provide if need be.

Addendum: I got segfaults with JIT disabled after the server had been running for days.

#0  zend_accel_inheritance_cache_find (needs_autoload_ptr=, traits_and_interfaces=, parent=, ce=, entry=0x41e6ccf0) at /usr/src/debug/php-8.1.9/ext/opcache/ZendAccelerator.c:2254
#1  zend_accel_inheritance_cache_get () at /usr/src/debug/php-8.1.9/ext/opcache/ZendAccelerator.c:2295
#2  0x0000557e80fc366f in zend_try_early_bind () at /usr/src/debug/php-8.1.9/Zend/zend_inheritance.c:3021
#3  0x0000557e80f09d93 in zend_do_delayed_early_binding (op_array=op_array@entry=0x7fe34ea02500, first_early_binding_opline=) at /usr/src/debug/php-8.1.9/Zend/zend_compile.c:1380
#4  0x00007fe353b386d4 in zend_accel_load_script () at /usr/src/debug/php-8.1.9/ext/opcache/zend_accelerator_util_funcs.c:255
#5  0x0000557e80ef1989 in compile_filename (type=type@entry=2, filename=filename@entry=0x7fe321c1d000) at /usr/src/debug/php-8.1.9/Zend/zend_language_scanner.c:707
#6  0x0000557e80f6163a in zend_include_or_eval (inc_filename_zv=, type=2) at /usr/src/debug/php-8.1.9/Zend/zend_execute.c:4623
#7  0x0000557e80f6e95a in ZEND_INCLUDE_OR_EVAL_SPEC_CV_HANDLER () at /usr/src/debug/php-8.1.9/Zend/zend_vm_execute.h:38713
#8  0x0000557e80f95516 in execute_ex () at /usr/src/debug/php-8.1.9/Zend/zend_vm_execute.h:59122
#9  0x0000557e80f21244 in zend_call_function () at /usr/src/debug/php-8.1.9/Zend/zend_execute_API.c:908
#10 0x0000557e80f21635 in zend_call_known_function () at /usr/src/debug/php-8.1.9/Zend/zend_execute_API.c:997
#11 0x0000557e80e27030 in spl_perform_autoload (class_name=0x7fe321c1cfb8, lc_name=0x7fe321d0d7e0) at /usr/src/debug/php-8.1.9/ext/spl/php_spl.c:433
#12 0x0000557e80f2051c in zend_lookup_class_ex (name=name@entry=0x7fe321c1cfb8, key=0x7fe321d0d7e0, flags=flags@entry=512) at /usr/src/debug/php-8.1.9/Zend/zend_execute_API.c:1141
#13 0x0000557e80f21982 in zend_fetch_class_by_name () at /usr/src/debug/php-8.1.9/Zend/zend_execute_API.c:1601
#14 0x0000557e80f6ba0f in ZEND_NEW_SPEC_CONST_UNUSED_HANDLER () at /usr/src/debug/php-8.1.9/Zend/zend_vm_execute.h:10147
#15 0x0000557e80f944f4 in execute_ex () at /usr/src/debug/php-8.1.9/Zend/zend_vm_execute.h:56659
#16 0x0000557e80f9d22d in zend_execute (op_array=0x7fe34ea02200, return_value=0x0) at /usr/src/debug/php-8.1.9/Zend/zend_vm_execute.h:60123

dstogov commented 2 years ago

@gregherrell Thanks for the backtrace. Interesting: it doesn't contain any JIT code. It seems like something in the shared inheritance cache was corrupted; I have no idea how this may be related to JIT yet. Could you try running PHP with opcache.protect_memory=1 in php.ini? This should cause an immediate crash in case of an unintended write to shared memory and may give the next direction for analysis.

gregherrell commented 2 years ago

@gregherrell Thanks for the backtrace. Interesting: it doesn't contain any JIT code. It seems like something in the shared inheritance cache was corrupted; I have no idea how this may be related to JIT yet. Could you try running PHP with opcache.protect_memory=1 in php.ini? This should cause an immediate crash in case of an unintended write to shared memory and may give the next direction for analysis.

Unfortunately, this yielded a backtrace with no information. I am not certain why. I generated four core dumps; all had the same information.

Reading symbols from /usr/sbin/php-fpm...Reading symbols from /usr/lib/debug/usr/sbin/php-fpm.debug...done.
done.
[New LWP 13843]
Core was generated by `php-fpm: pool www '.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fb54a44e29c in ?? ()

(gdb) bt

#0  0x00007fb54a44e29c in ?? ()
#1  0x0000000000007fff in ?? ()
#2  0x7431bff8d9b1e4f2 in ?? ()
#3  0x0000000000000000 in ?? ()

dstogov commented 2 years ago

Great :+1:

Most probably, you caught the crash in JIT code. Can you please try to dig down a bit more? The following commands should allow you to find the PHP script/function/line where the problem occurs:

(gdb) p (char*)executor_globals.current_execute_data.func.op_array.filename.val
(gdb) p (char*)executor_globals.current_execute_data.func.op_array.function_name.val
(gdb) p executor_globals.current_execute_data.func.op_array.line_start
(gdb) p executor_globals.current_execute_data.opline.lineno
(gdb) p executor_globals.current_execute_data.opline - executor_globals.current_execute_data.func.op_array.opcodes

Maybe I'll be able to find the problem by analysing the PHP source code and the assembler code produced by the JIT:

(gdb) disassemble  0x00007fb54a44e29c-100,  0x00007fb54a44e29c+100 

You may post the info here or send it directly to dmitrystogov at gmail dot com

meinemitternacht commented 2 years ago

Maybe this is related to #8642 and it can be fixed along with that one.

MichaelSch commented 2 years ago

I'm adding myself to the list. I run several websites without issues, but on my Nextcloud instance php-fpm crashes randomly.

php -v
PHP 8.1.9 (cli) (built: Aug  2 2022 13:02:24) (NTS gcc x86_64)
Copyright (c) The PHP Group
Zend Engine v4.1.9, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.9, Copyright (c), by Zend Technologies
                Stack trace of thread 9677:
                #0  0x00007f94c951d13c zend_accel_inheritance_cache_get (opcache.so + 0x2113c)
                #1  0x00005603fda841f2 zend_do_link_class (php-fpm + 0x4841f2)
                #2  0x00005603fd9c91cf zend_bind_class_in_slot (php-fpm + 0x3c91cf)
                #3  0x00005603fd9c925d do_bind_class (php-fpm + 0x3c925d)
                #4  0x00005603fda22ac9 ZEND_DECLARE_CLASS_SPEC_CONST_HANDLER (php-fpm + 0x422ac9)
                #5  0x00005603fda5547d execute_ex (php-fpm + 0x45547d)
                #6  0x00005603fd9e07dc zend_call_function (php-fpm + 0x3e07dc)
                #7  0x00005603fd9e0bad zend_call_known_function (php-fpm + 0x3e0bad)
                #8  0x00005603fd8eb26a spl_perform_autoload (php-fpm + 0x2eb26a)
                #9  0x00005603fd9dfae1 zend_lookup_class_ex (php-fpm + 0x3dfae1)
                #10 0x00005603fda05791 is_a_impl (php-fpm + 0x405791)
                #11 0x00005603fda5cbd9 execute_ex (php-fpm + 0x45cbd9)
                #12 0x00005603fd9e07dc zend_call_function (php-fpm + 0x3e07dc)
                #13 0x00005603fd91d835 zif_call_user_func (php-fpm + 0x31d835)
                #14 0x00005603fda5cbd9 execute_ex (php-fpm + 0x45cbd9)
                #15 0x00005603fd9e07dc zend_call_function (php-fpm + 0x3e07dc)
                #16 0x00005603fd91d835 zif_call_user_func (php-fpm + 0x31d835)
                #17 0x00005603fda5cbd9 execute_ex (php-fpm + 0x45cbd9)
                #18 0x00005603fda5f0b9 zend_execute (php-fpm + 0x45f0b9)
                #19 0x00005603fd9eeeb0 zend_execute_scripts (php-fpm + 0x3eeeb0)
                #20 0x00005603fd989e5a php_execute_script (php-fpm + 0x389e5a)
                #21 0x00005603fd83e27d main (php-fpm + 0x23e27d)
                #22 0x00007f94c9a29550 __libc_start_call_main (libc.so.6 + 0x29550)
                #23 0x00007f94c9a29609 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x29609)
                #24 0x00005603fd83efd5 _start (php-fpm + 0x23efd5)
                ELF object binary architecture: AMD x86-64
nikserg commented 2 years ago

Probably a similar problem: repeated 502s on the same page, which works fine on test and local servers.

From dmesg:

Aug 29 12:39:41 admin kernel: [363104.676254] traps: php-fpm8.1[41337] general protection fault ip:7f90c7c34cfc sp:7fffb6e946d0 error:0 in opcache.so[7f90c7c2f000+b5000]

PHP version:

php -v
PHP 8.1.9 (cli) (built: Aug 15 2022 09:39:52) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.9, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.9, Copyright (c), by Zend Technologies

Adding opcache.jit=0 in /etc/php/8.1/fpm/php.ini solves the problem.