rcanavan opened this issue 7 years ago
Please use the latest version?
The production servers will get APCu 5.1.8 next week, our test servers have it already. Have any deadlock bugs been fixed between 5.1.7 and 5.1.8?
A general memory corruption bug was fixed; it's difficult to say whether it can or can't lead to deadlocks ... so testing would be best ;)
OK, so there's at least some chance that the root cause may be fixed.
I also neglected to grab a core dump of the hung processes, but I'm not sure if shared memory is included in those at all. Assuming I took a core dump next time, is there anything useful I can do analysing the APC shared memory?
Also, do you have any idea how I can identify which process currently owns the lock, and whether it legitimately owns the lock or has failed to release it?
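On the core dump question: a minimal sketch for capturing a core that includes the shared memory segment (assuming Linux and the gcore script from the gdb package; the filter value and paths are only examples):

# Rough sketch: shared mappings are excluded from cores by default,
# so widen coredump_filter before dumping the stuck worker.
echo 0x3f > /proc/<worker-pid>/coredump_filter
gcore -o /tmp/php-fpm-core <worker-pid>
gdb /usr/sbin/php-fpm /tmp/php-fpm-core.<worker-pid>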
We've encountered another of those deadlocks, this time with apcu 5.1.8. Same backtrace and zbacktrace.
Any idea about this: how can I identify which process currently owns the lock, and whether it legitimately owns the lock or has failed to release it?
APCu 5.1.9 with PHP 7.1.13: deadlock on 3 servers at approximately the same time. All processes (on one server) have essentially the same backtrace, with identical key=key@entry=... but slightly different values passed to apc_cache_insert. According to zbacktrace, it's always the same apcu_store() call saving an array for various string keys. No iterator involved; however, there are apparently -3 readers, similar to #266:
85 subq $WRITERS_WAKEUP, %rdi
(gdb) x/20dw $rdi - 0x0c
0x7fbbb0e77088: 0 -3 412 374
0x7fbbb0e77098: 0 80 0 128
0x7fbbb0e770a8: 0 0 0 0
0x7fbbb0e770b8: 0 0 0 0
0x7fbbb0e770c8: 81 0 1 0
(gdb) frame 3
#3 0x00007fbbc764a7e5 in apc_cache_store (cache=0x29963b0, strkey=<optimized out>, val=0x7fbbce013890, ttl=0, exclusive=exclusive@entry=0 '\000')
at /root/build/build/src/external/apcu-5.1.9/apc_cache.c:552
552 /root/build/build/src/external/apcu-5.1.9/apc_cache.c: No such file or directory.
(gdb) print *(cache->header)
$2 = {lock = {__data = {__lock = 0, __nr_readers = 4294967293, __readers_wakeup = 412, __writer_wakeup = 374, __nr_readers_queued = 0, __nr_writers_queued = 80,
__writer = 0, __shared = 128, __pad1 = 0, __pad2 = 0, __flags = 0},
__size = "\000\000\000\000\375\377\377\377\234\001\000\000v\001\000\000\000\000\000\000P\000\000\000\000\000\000\000\200", '\000' <repeats 26 times>,
__align = -12884901888}, nhits = 0, nmisses = 81, ninserts = 1, nexpunges = 1, nentries = 1, mem_size = 1200, stime = 1519200448, state = 0, lastkey = {str = 0x0,
mtime = 0, owner = 0}, gc = 0x0}
pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:85
85 subq $WRITERS_WAKEUP, %rdi
Thread 1 (Thread 0x7fbbd1efb840 (LWP 7151)):
#0 pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:85
#1 0x00007fbbc7646959 in apc_lock_wlock (lock=<optimized out>) at /root/build/build/src/external/apcu-5.1.9/apc_lock.c:245
#2 0x00007fbbc764a2ab in apc_cache_insert (cache=cache@entry=0x29963b0, key=key@entry=0x7fffd50d9de0, value=0x7fbbb1437c68, ctxt=ctxt@entry=0x7fffd50d9e00, t=t@entry=1519200448, exclusive=exclusive@entry=0 '\000') at /root/build/build/src/external/apcu-5.1.9/apc_cache.c:948
#3 0x00007fbbc764a7e5 in apc_cache_store (cache=0x29963b0, strkey=<optimized out>, val=0x7fbbce013890, ttl=0, exclusive=exclusive@entry=0 '\000') at /root/build/build/src/external/apcu-5.1.9/apc_cache.c:552
#4 0x00007fbbc7647cf4 in apc_store_helper (execute_data=<optimized out>, exclusive=0 '\000', return_value=0x7fbbce013820) at /root/build/build/src/external/apcu-5.1.9/php_apc.c:495
#5 zif_apcu_store (execute_data=<optimized out>, return_value=0x7fbbce013820) at /root/build/build/src/external/apcu-5.1.9/php_apc.c:518
Just to check, did you see any relevant log output around that time? Crashed process, opcache reset, or something like that?
No messages, aside from a "seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning ..." warning each second, presumably after the deadlock occurred, despite error_reporting = E_ALL.
Are you using apcu_entry anywhere? Wondering if this might be a side-effect from #246 that only manifests on a later apcu_store call.
There's not even a single call to apcu_entry() in our entire codebase. Unlike #246, our problem is very rare (i.e. once every other month or so, with ~10 servers processing millions of requests per day), but it usually results in multiple servers getting stuck within minutes at the most.
I think it's quite likely that this has the same root cause as #259, though your symptoms are somewhat different.
I agree with @nikic. I got all my servers stuck the same way (after a flush of APCu and a Varnish ban), resulting in many concurrent requests. 99% of the time it's okay, but sometimes one process blocks the others.
I've seen it with strace, when you strace some of your PHP processes. For me it was PHP-FPM with responses taking more than 30 seconds instead of milliseconds, so I had time to note the PID (say 1234), run strace -p 1234, and find some information about it. For example:
futex(0xafcc1050, FUTEX_WAIT, 2, NULL <unfinished ...>
Then it means the 2nd file descriptor is blocking, which you can view by issuing a:
ls -al /proc/1234/fd/2
which points to the blocking file.
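Putting those steps together, a rough sketch (1234 is a placeholder PID):

# Hedged sketch of the inspection steps described above.
strace -p 1234                # stuck workers sit in futex(..., FUTEX_WAIT, ...)
ls -al /proc/1234/fd          # list everything the worker has open
cat /proc/1234/status         # State, PPid, etc. of the stuck worker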
Is it possible to identify the process that is holding the lock (as opposed to waiting for it) with futexes, like it is via /proc/locks for flock() locks?
Good question :-) I don't know, but if you have a file descriptor, you can see what it's doing. At least that was how I detected the issue was in APCu and not in MySQL or something like that.
Hello there,
We seem to be affected by this issue in production. Using APCu version 5.1.8 on Debian 8.10 (Jessie), packaged by Dotdeb (5.1.8-1~dotdeb+8.1).
The traces from the slowlog show all FPM workers hung when calling apcu_store().
Running strace -p <worker-pid> always shows the last syscall as futex(<uaddr>, FUTEX_WAIT, ...), where uaddr is always the same for all processes.
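For reference, sampling all workers at once could look something like this (a rough sketch; the pgrep pattern and the 2-second timeout are assumptions to adjust for your pool):

# Hypothetical sketch: briefly strace every php-fpm worker to compare futex addresses.
for pid in $(pgrep -f 'php-fpm: pool'); do
    echo "== $pid =="
    timeout 2 strace -p "$pid" 2>&1 | grep futex | head -n 1
done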
The work load of our web frontends in production is about 10 requests per second. Unfortunately it is hard for me to find a way to pull Nikita's patch into production, and I failed to reproduce the deadlock in our staging environment. Anyway, I will follow up on this issue and eventually report how it turned out.
PS: This was originally posted as a comment of #259 but I re-posted it here because it's not crashing any process, it's just hanging them all in a deadlock.
@mhouriet the fact is that along with the crash of #259, which was happening during deletions (iterating over many keys and deleting a lot of them), my second issue was the lock and the servers going down too. See https://github.com/krakjoe/apcu/issues/259#issuecomment-363730090 for instance.
But all those bugs may be related, one way or another. If you can compile without RW locks, you may see if it's better. I'll try @nikic's patch soon, but one way or another, you'll have to compile it again (or perhaps Dotdeb will do it), although all new releases (5.1.9, 5.1.10) are labelled «PHP 7 bugfixes».
I just compiled the module from master (i.e. with @nikic patch from #259) and experimentally deployed it to one of our frontend servers in production. The deadlock occurred at least once every few days, so I'll be monitoring it carefully and report if it solved it for us or not.
If that fails, I will also try --disable-apcu-rwlocks as @dugwood suggested.
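For anyone else wanting to try that, a rough sketch of the rebuild (paths and steps are assumptions; a distro/Dotdeb package would need its own rebuild):

# Hedged sketch: build APCu from source without rwlocks.
git clone https://github.com/krakjoe/apcu.git && cd apcu
phpize                               # from the php dev package matching your PHP build
./configure --disable-apcu-rwlocks
make && make test
sudo make install                    # then restart php-fpm so the new apcu.so is loaded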
@rcanavan Apcu 5.1.11 is now released with a probable fix for this issue. Could you please check whether it works for you as well?
I'll try 5.1.11 as soon as possible. However, can you fix the headline of the release on https://github.com/krakjoe/apcu/releases ? It still says 5.1.10.
However, can you fix the headline of the release on https://github.com/krakjoe/apcu/releases ? It still says 5.1.10.
I see 5.1.11 there
@remicollet yeah, I changed it already
Unfortunately, I have to report a new deadlock occurrence with the patched module. All processes end up in the slowlog, doing nothing:
[0x00007f7d7be16550] apcu_store() /var/www/.../...php:494
[0x00007f7d7be16440] fetchFromApi() /var/www/.../...php:209
[0x00007f7d7be163c0] getSite() /var/www/.../...php:397
Using 5.1.11-dev with 7.0.27-1~dotdeb+8.1
root@...# strace -p 7186
Process 7186 attached
futex(0x7f7d4ee8b094, FUTEX_WAIT, 53650, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=22665, si_uid=0} ---
--- stopped by SIGSTOP ---
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=22665, si_uid=0} ---
futex(0x7f7d4ee8b094, FUTEX_WAIT, 53650, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=22665, si_uid=0} ---
+++ killed by SIGTERM +++
Output from make test has been pasted here. Compiled from master with head at e9876acd49a537dc5a7e4b5ceef96f7dea3d3eae.
I am upgrading to the latest PHP 7.2 on this server now and will post updates.
What was the process 53650 doing? As you can see, it's the blocking one:
futex(0x7f7d4ee8b094, FUTEX_WAIT, 53650, NULL)
But the other big issue to me is that you only have 4% of free memory for APCu; that's not a good test, I think. Can you increase the memory? APCu nukes all entries when there's not enough memory left, so you should have at least 10 or 20% of free memory in the worst case.
Did you try without RW locks too? Note that I didn't have the time to test the new release in production, so I can't say if it's really better than 5.1.9 or not.
I will check the blocking process next time it occurs. I can increase the shm size, but we use APCu purely for volatile, local caching. The normal uptime cycle for this 512 MB segment is about 30 minutes, i.e. APCu nuking all entries when the cache is full is actually expected behavior, and that is what naturally occurred on the other servers over the weekend. Only this one has entered this state, and we see an abnormal uptime; perhaps this is not a coincidence and the deadlock was triggered by the write(s) that required flushing the cache.
Regarding disabling RW locks, I will look into this as well next time I recompile the module.
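As a rough way to keep an eye on the segment, something like this might do (note that with apc.enable_cli the CLI attaches to its own fresh segment, so on a live pool it's better to look at apc.php or call apcu_sma_info() from a web request):

# Rough sketch: print shared-memory segment size and available memory.
php -d apc.enable_cli=1 -r 'var_export(apcu_sma_info(true));'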
Maybe the fragmentation did this. As I've detailed in https://github.com/krakjoe/apcu/issues/127#issuecomment-244362339, fragmentation is not an issue in general use cases, but if you're near the memory limit, I guess that APCu will need to find the little pieces left that are at least as big as the data you want to write. With a lot of fragmentation and very little space, this is a difficult job, and it may end up in race conditions. This will happen before the «big nuke», which then makes things really easy. But to trigger the nuke, APCu must first return an error on a set that doesn't find enough space (again, it's a guess, maybe not the real behavior).
Even if a nuke is expected behavior, you should avoid it, as it's equivalent to a cold cache.
The issue from #259 was also related to cache expunge on full cache, so looks like we're still dealing with the same problem area... As @dugwood mentioned, it would be good to have a backtrace from the process holding the lock, maybe that will give some clues.
Looking through the code again, I see that apcu_entry still has a case that could deadlock if the cache runs full. @mhouriet Do you happen to use the apcu_entry API anywhere? Don't see any other problematic cases right now.
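A quick way to check for that (the docroot path is only an example):

# Search the deployed code for any apcu_entry() call sites.
grep -rn --include='*.php' 'apcu_entry' /var/www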
What was the process 53650 doing? As you can see, it's the blocking one:
futex(0x7f7d4ee8b094, FUTEX_WAIT, 53650, NULL)
Is that actually the PID? I'm getting 184 with all processes according to strace, but that PID has never been used by php-fpm since the master process was started. On another system, val is 180, and no php-fpm process has ever had the PID 180. The systems in question are still running 5.1.9:
strace: Process 3347 attached
futex(0x7f2a27877094, FUTEX_WAIT, 184, NULL
The information given by apc.php seems to be rather confusing. The php-fpm master process was started on February 28th, and all php-fpm workers actually have it as their parent; however, according to apc.php, everything started pretty much exactly when the processes started to hang, and the cache is pristine:
| APCu Version | 5.1.9 |
|---|---|
| PHP Version | 7.1.13 |
| APCu Host | localhost (fit-asp14-sto04) (::1) |
| Server Software | Apache |
| Shared Memory | 1 Segment(s) with 256.0 MBytes (mmap memory) |
| Start Time | 2018/03/13 16:10:37 |
| Uptime | 56 minutes |

| Cached Variables | 1 ( 1.2 KBytes) |
|---|---|
| Hits | 0 |
| Misses | 294 |
| Request Rate (hits, misses) | 0.00 cache requests/second |
| Hit Rate | 0.00 cache requests/second |
| Miss Rate | 0.09 cache requests/second |
| Insert Rate | 0.00 cache requests/second |
| Cache full count | 1 |

| Memory Usage (multiple slices indicate fragments) | Hits & Misses |
|---|---|
| Free: 255.8 MBytes (99.9%) | Hits: 0 (0.0%) |
| Used: 229.6 KBytes (0.1%) | Misses: 294 (100.0%) |

Fragmentation: 0.00% (0.0 Bytes out of 255.8 MBytes in 2 fragments)
@rcanavan the PID may change if it's another lock issue. For example you may have 1234 locking every other thread, then when that lock is released you'll have 2345 locking all the threads.
As PHP-FPM children die (after a certain number of hits most of the time, so it's a natural death), they are created again. Or if you have too many idle threads (according to your PHP-FPM configuration, you can't have more than XX idle threads, or fewer than YY), the master process will create or destroy threads. So new PIDs, but the PHP-FPM master process will keep the same PID.
Perhaps APCu is reporting the creation date of your PHP-FPM thread, not its parent's. If you refresh the page and see various dates, that must be it. So it's not important, I think.
We're logging the PID for all requests in our php-fpm-access.log, and the numbers given by strace in the futex() call aren't in any of the logs going back 14 days, but most of the PIDs of the running php-fpm processes are in the current log (except for the unlucky processes that were started and immediately ran into the deadlock). With pm.max_requests = 500 we're currently getting process lifetimes of about 5 minutes.
The 56 minutes Uptime is suspicious; that should be 14 days. It can't be the start time of the current process, since all processes were deadlocked and I had to kill a few to get my HTTP request processed - otherwise I'd run into an Apache timeout.
@rcanavan The start time is reset whenever the cache is (explicitly or automatically) cleared.
@nikic thanks, I was mistaken on this one. I have 1 week, 3 days, 11 hours and 8 minutes for a PHP-FPM started 10d 11h 9m 34s ago, so that's okay (as I never hit the clear state).
@rcanavan this is the PID at the exact time you have a lock. If the thread disappears and releases the lock, you won't see it. But if it lasts a few seconds, you should be able to see every file used by the thread by issuing:
lrwx------ 1 www-data www-data 64 mars 13 19:21 /proc/12362/fd/0 -> socket:[108105298]
lrwx------ 1 www-data www-data 64 mars 13 19:21 /proc/12362/fd/1 -> /dev/null
lrwx------ 1 www-data www-data 64 mars 13 19:21 /proc/12362/fd/2 -> /dev/null
lrwx------ 1 www-data www-data 64 mars 13 19:21 /proc/12362/fd/3 -> /tmp/.ZendSem.nJyjfo (deleted)
or view the status:
Name: php-fpm7.0
Umask: 0022
State: S (sleeping)
Tgid: 12362
Ngid: 0
Pid: 12362
PPid: 31110
TracerPid: 0
or run a strace on the PID to find out what it's doing.
But I agree that without a crash, it's harder to find out what's going on.
@rcanavan The start time is reset whenever the cache is (explicitly or automatically) cleared.
OK, so the "seems busy (you may need to increase pm.start_servers ...)" messages start 4 seconds after the reset and repeat every second until the server reached pm.max_children.
@rcanavan this is the PID at the exact time you have a lock. If the thread disappears and releases the lock, you won't see it. But if it lasts a few seconds, you should be able to see every file used by the thread by issuing:
That would be somewhat surprising. If a PHP-FPM process had ever successfully processed a request, it should have left a line in our phpfpm-access.log with a corresponding field pid:"6399". If the process was still running, well, I should be able to find it with ps and inspect it with strace and gdb. If it crashed, I'd expect something like child 16937 exited on signal ... in the php_error.log. I see no such trace of the alleged PID, or for that matter, anything even close in any of the logs. Given how quickly we churn through PHP-FPM children (unlike any other processes on those servers), some nearby PIDs should be present somewhere close in time in any of the logs, but they are not.
@dugwood @rcanavan I've just checked the glibc rwlock implementation to find out what exactly it is FUTEX_WAITing for. As @rcanavan already suspected, this is not the pid. In fact, the value is pretty much meaningless: it is a counter that is incremented every time a FUTEX_WAKEUP is performed. So the only thing this number tells you is how many times there was an attempt to acquire a write lock that was not immediately satisfied. The tid of the lock holder (in case of a write lock, that is) is stored in __data.__writer.
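So, next time a worker is stuck, the holder's TID could be read with something like this (a sketch, assuming the same frame and field names as in the gdb output above; <pid> is a placeholder):

# Hedged sketch: read the rwlock's writer TID and reader count from a stuck worker.
gdb -p <pid> -batch \
    -ex 'frame 2' \
    -ex 'print cache->header->lock.__data.__writer' \
    -ex 'print cache->header->lock.__data.__nr_readers'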
And __data.__writer = 0 in my gdb backtrace above, so that won't help us find the cause. However, __nr_readers = 4294967293 still smells fishy.
@rcanavan Assuming that this is really the same issue as in #259, those values look sensible. The problem sequence is a wlock wlock wunlock wunlock on one thread. The second wlock will EDEADLK, thus the first wunlock will unlock the write lock (setting writer=0). The second wunlock will then decrease nr_readers, making it go negative. So at least writer=0, nr_readers=-1 would make sense in the situation. Not totally sure about the -3 though.
The reason for the -3 readers would be that the rwlock implementation has overflow protection for the number of readers. So when writer=0, nr_readers=-1 trying to acquire an rlock will EAGAIN, so that the conjugated wunlock ends up decrementing __nr_readers further.
@nikic you're right, that's not a PID! It's the file descriptor of the PID you're strace'ing. So if you're running a strace on PID=1234, and the futex says 5678, you should look at /proc/1234/fd/5678. Issuing a cat on this file will reveal the blocking thread/process.
@rcanavan note that ps aux doesn't always reveal all threads. For example I use Varnish, and ps auxf will show at most 3 or 4 master processes. But if I run pstree, then I'll see 800 cache-worker threads (each with a dedicated PID), and many others. But you're right, for the PHP-FPM processes, I can see all of them in ps aux.
I've added checks for failed wlock operations in https://github.com/krakjoe/apcu/commit/885ea551d7ee548189c868e58873b2610d483043. If a wlock fails we print a warning and abort the operation. That will at least prevent us from corrupting the lock state if a deadlock is detected.
@nikic So should we use the master branch to test for this error?
We had this issue; the behavior is: the latest N lines of php-fpm.slow.log are all about apcu_store, and the workers cannot handle any requests (nginx 499).
And php-fpm.log shows that the max execution time was reached and the workers were terminated, but no slow log entries are printed anymore.
@lichunqiang Which version are you currently using? I'd suggest trying the 5.1.11 release first (which definitely fixes an apcu_store deadlock). The additional changes on master may help with debugging and limiting damage, but they don't fix any specific issue.
@nikic APCu 5.1.9 and PHP 7.1.10
Thanks for your opinion, we will try the latest 5.1.11 release.
Does anybody here still experience deadlock issues with APCu 5.1.11? If so, I'd suggest trying the new 5.1.12 release as well.
@nikic I've just installed 5.1.12 (without the disable-rw-locks argument for compilation), and still have big issues with APCUIterator. If I run cache invalidation requests concurrently, everything goes down in a few seconds. For example:
$result = 0;
$apc = new APCUIterator('/cache-server-[0-9]+-timed-key-[0-9]+/', APC_ITER_KEY, 1000, APC_LIST_ACTIVE);
while (($entry = $apc->current()))
{
    $apc->next();
    if (apcu_delete($entry['key']))
    {
        $result++;
    }
}
will crash my server (with 5.1.9 or 5.1.12) when 3 threads are running on this code (perhaps only 2 will suffice, I don't know exactly).
It's not a crash per se, but requests are piling up. So far my options are:
- use apcu_delete instead of the iterator (as I can find out what the timestamp is, since it's stored in APCu)

Any clue? Or do I have to reproduce it in development, hoping to get a crash and a backtrace?
An old virtual machine was still using 5.1.9 with the disable-rw-locks option, and crashed the same way. So RW locks shouldn't be the culprit.
Does anybody here still experience deadlock issues with APCu 5.1.11? If so, I'd suggest trying the new 5.1.12 release as well.
Yes, just today we had an apc_store deadlock in production with 5.1.11 and PHP 7.2.7-1+ubuntu16.04. Will try 5.1.12 today.
Hi @mightydok, did upgrading to 5.1.12 solve your issue? We upgraded to 5.1.12 but are still having the deadlock issue.
For anyone still seeing a deadlock, would it be possible to provide a stack trace from a deadlocked process?
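A rough way to grab one non-interactively (<pid> is a placeholder; debug symbols for php-fpm and apcu.so are needed for readable frames):

# Hedged sketch: capture a full backtrace from a deadlocked worker.
gdb -p <pid> -batch \
    -ex 'set pagination off' \
    -ex 'thread apply all bt full' > /tmp/apcu-deadlock-<pid>.txt 2>&1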
Hi @nikic, attaching gdb output from two stuck processes:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f3cfdd4d450 in futex_wait (private=
[New LWP 22170]
[New LWP 22171]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f3cfdd4d450 in futex_wait (private=
This ultimately leads to Apache hitting its MaxRequestWorkers limit, and it resolves only after restarting Apache.
PHP Version: 7.0.30-1~dotdeb+8.1 APCu Version: 5.1.12 Kernel: 3.16.36-1+deb8u1
Hi all:
For anyone running into deadlocks:
Please update to at least glibc 2.26 or use --disable-apcu-rwlocks.
Reason:
I debugged with glibc 2.23 and was able to consistently reproduce the deadlocks.
However, in my case the writer sits in front of the non-taken lock and simply does not grab it, or does not release it.
This might be because it never receives the signal needed to be scheduled again: attaching gdb to the process and then calling continue usually fixes the problem, even though the processes had made zero progress before.
However, in glibc 2.26 the rwlock implementation has been completely rewritten using CAS semantics and looks more robust overall.
Therefore testing with newer glibc is strongly recommended.
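To check which glibc a box is running (the exact output format varies by distro):

# Print the glibc version in use.
getconf GNU_LIBC_VERSION
ldd --version | head -n 1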
Yesterday, 4 (out of 12) of our production servers stopped responding within about 20 minutes of each other. On all four, all php-fpm processes that I have inspected were stuck in apcu_store(). All systems use APCu 5.1.7 with PHP 7.0.13 on CentOS 7.3. I'm looking for hints on what additional information I should collect to help isolate the root cause in case this problem ever occurs again. For example, how can I determine which process is currently holding the write lock?