zendtech / ZendOptimizerPlus

Other
914 stars 142 forks source link

current_wasted_percentage > max, restart_pending=1 but no restart #220

Closed SjonHortensius closed 8 years ago

SjonHortensius commented 8 years ago

We have this issue frequently on our server, when the opcache is full, nothing happens. Our opcache settings:

opcache.memory_consumption=512
opcache.max_accelerated_files=32531
opcache.interned_strings_buffer=32
disable_functions=opcache_reset
opcache.log_verbosity_level=4
opcache.revalidate_freq=0

This is what opcache_get_status looks like just before it's full:

[opcache_enabled] => 1
[cache_full] => 
[restart_pending] => 
[restart_in_progress] => 
[memory_usage] => Array
    (
        [used_memory] => 419327312
        [free_memory] => 20042160
        [wasted_memory] => 97501440
        [current_wasted_percentage] => 18.161058425903
    )
[interned_strings_usage] => ...
[opcache_statistics] => Array
    (
        [num_cached_scripts] => 14739
        [num_cached_keys] => 26554
        [max_cached_keys] => 32531
        [hits] => 601914
        [start_time] => 1448463902
        [last_restart_time] => 0
        [oom_restarts] => 0
        [hash_restarts] => 0
        [manual_restarts] => 0
        [misses] => 27077
        [blacklist_misses] => 0
        [blacklist_miss_ratio] => 0
        [opcache_hit_rate] => 95.695168929285
    )

When it actually is full, the status changes to:

[opcache_enabled] => 
[cache_full] => 
[restart_pending] => 1
[restart_in_progress] => 
[memory_usage] => Array
    (
        [used_memory] => 433689008
        [free_memory] => 95512
        [wasted_memory] => 103086392
        [current_wasted_percentage] => 19.201336801052
    )
[interned_strings_usage] => ...
[opcache_statistics] => Array
    (
        [num_cached_scripts] => 15316
        [num_cached_keys] => 27261
        [max_cached_keys] => 32531
        [hits] => 645231
        [start_time] => 1448463902
        [last_restart_time] => 0
        [oom_restarts] => 0
        [hash_restarts] => 0
        [manual_restarts] => 0
        [misses] => 28566
        [blacklist_misses] => 0
        [blacklist_miss_ratio] => 0
        [opcache_hit_rate] => 95.760444169386
    )

Nothing happens after this; even after weeks of running it will hang like this. I've added opcache.log_verbosity_level=4 but nothing was logged. It's also weird that the status no longer includes a list of files that are in the cache; maybe that was truncated already?

dstogov commented 8 years ago

log messages goes to stderr, by default. You may redirect to file using "opcache.error_log" directive. Then search for "Restarting!" string in the log.

On Thu, Nov 26, 2015 at 1:49 PM, SjonHortensius notifications@github.com wrote:

We have this issue frequently on our server, when the opcache is full, nothing happens. This is what _opcache_getstatus looks like just before it's full:

[opcache_enabled] => 1 [cache_full] => [restart_pending] => [restart_in_progress] => [memory_usage] => Array ( [used_memory] => 419327312 [free_memory] => 20042160 [wasted_memory] => 97501440 [current_wasted_percentage] => 18.161058425903 ) [interned_strings_usage] => ... [opcache_statistics] => Array ( [num_cached_scripts] => 14739 [num_cached_keys] => 26554 [max_cached_keys] => 32531 [hits] => 601914 [start_time] => 1448463902 [last_restart_time] => 0 [oom_restarts] => 0 [hash_restarts] => 0 [manual_restarts] => 0 [misses] => 27077 [blacklist_misses] => 0 [blacklist_miss_ratio] => 0 [opcache_hit_rate] => 95.695168929285 )

When it actually is full, the status changes to:

[opcache_enabled] => [cache_full] => [restart_pending] => 1 [restart_in_progress] => [memory_usage] => Array ( [used_memory] => 433689008 [free_memory] => 95512 [wasted_memory] => 103086392 [current_wasted_percentage] => 19.201336801052 ) [interned_strings_usage] => ... [opcache_statistics] => Array ( [num_cached_scripts] => 15316 [num_cached_keys] => 27261 [max_cached_keys] => 32531 [hits] => 645231 [start_time] => 1448463902 [last_restart_time] => 0 [oom_restarts] => 0 [hash_restarts] => 0 [manual_restarts] => 0 [misses] => 28566 [blacklist_misses] => 0 [blacklist_miss_ratio] => 0 [opcache_hit_rate] => 95.760444169386 )

Nothing happens after this; even after weeks of running it will hang like this. I've added opcache.log_verbosity_level=4 but nothing was logged

— Reply to this email directly or view it on GitHub https://github.com/zendtech/ZendOptimizerPlus/issues/220.

SjonHortensius commented 8 years ago

I realize that; fpm runs under systemd so all output (to stdout/err) is sent to the journal. However, I see no messages there. I'll try opcache.error_log; see if that works.

Update; that produces output; I'll update this issue tomorrow with output from the log

SjonHortensius commented 8 years ago

This machine is actively used; but only after reloading my opcache_get_status script did I get this error in the log (implying an opcache_get_status is needed to trigger this):

Warning Forced restart at 1448619309 (after 180 seconds), locked by 15853
Error Killed locker 15853

Instead of normal output I got a gateway error from nginx: upstream sent unexpected FastCGI record: 3 while reading response header from upstream. ps shows the pid in question has a defunct child, which is probably why it cannot be killed:

http     15853 12038  0 Nov26 ?        00:00:03 php-fpm: pool www
http     15879 15853  0 Nov26 ?        00:00:00 [svn] <defunct>

Manually executing kill -SIGKILL 15853 works however, and this seems to complete the restart

SjonHortensius commented 8 years ago

So lets conclude that a php-fpm process with a defunct child will not be restarted although the child is not in a zombie state. When the child has successfully been stopped (manually) the opcache will not attempt to restart and fpm needs to be restarted manually.

It's okay with me to keep this as a known issue.