Open TomZhuPlanetart opened 1 year ago
I'm wondering whether it's possible that one of the PHP processes acquired the lock and was then killed by the OOM killer, so the lock it held was never released and all other processes eventually blocked on it.
Yes, unfortunately this is possible. APCu uses POSIX rwlocks, which do not support robustness. POSIX only supports robust mutexes.
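To illustrate the difference, here is a minimal sketch of what a robust, process-shared mutex gives you when the lock holder dies. This is not APCu's code; the function name `recover_lock_after_holder_death` is made up for the example, and it assumes Linux with glibc (anonymous shared `mmap`, `fork`). The child takes the lock and is killed with SIGKILL, roughly what an OOM kill looks like, and the parent's next `pthread_mutex_lock` returns `EOWNERDEAD` instead of blocking forever. `pthread_rwlock_t` has no equivalent attribute, so a waiter on an rwlock would simply hang in this situation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Returns 1 if the parent recovered the lock after the holder died,
 * 0 if the lock was acquired normally, -1 on any setup failure. */
int recover_lock_after_holder_death(void)
{
    /* Place the mutex in memory shared between parent and child. */
    pthread_mutex_t *lock = mmap(NULL, sizeof *lock, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (lock == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0 ||
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) != 0 ||
        pthread_mutex_init(lock, &attr) != 0) {
        munmap(lock, sizeof *lock);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(lock, sizeof *lock);
        return -1;
    }
    if (pid == 0) {
        /* Child: take the lock, then die without unlocking. */
        pthread_mutex_lock(lock);
        kill(getpid(), SIGKILL);
        _exit(1); /* not reached */
    }

    /* Parent: wait until the child is gone, then try to lock. */
    waitpid(pid, NULL, 0);

    int recovered = 0;
    int rc = pthread_mutex_lock(lock);
    if (rc == EOWNERDEAD) {
        /* The previous owner died while holding the lock. We own it now
         * and must mark it consistent before it can be used again. */
        pthread_mutex_consistent(lock);
        recovered = 1;
    } else if (rc != 0) {
        munmap(lock, sizeof *lock);
        return -1;
    }

    pthread_mutex_unlock(lock);
    pthread_mutex_destroy(lock);
    munmap(lock, sizeof *lock);
    return recovered;
}
```

Calling `recover_lock_after_holder_death()` from a small `main` should return 1 on a system with robust mutex support. In real code, the data protected by the lock may be half-updated when `EOWNERDEAD` is reported, so it has to be validated or rebuilt before the lock is marked consistent.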
APCu version: 5.1.22
PHP version: 8.1.9
Today two of our production servers stopped responding. Nginx kept returning 502, and the error log had this error:
The number of php-fpm processes had reached max_children. I spot-checked the call stacks of several of the fpm processes, and they all looked something like this:
I then checked the system log with the following command. It seems several fpm processes had been killed by the OOM killer.
My PHP code has a snippet that uses APCu like this:
I'm wondering whether it's possible that one of the PHP processes acquired the lock and was then killed by the OOM killer, so the lock it held was never released and all other processes eventually blocked on it.