zendtech / ZendOptimizerPlus

Opcache not reclaiming wasted memory #187

Open robfico opened 10 years ago

robfico commented 10 years ago

PHP 5.4.29, Zend Opcache 7.0.3, php-fpm using mod_fastcgi, Apache 2.2.27

php -v

PHP 5.4.29 (cli) (built: Jun 26 2014 11:42:36)
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2014 Zend Technologies
    with the ionCube PHP Loader v4.6.1, Copyright (c) 2002-2014, by ionCube Ltd., and
    with Zend OPcache v7.0.3, Copyright (c) 1999-2014, by Zend Technologies

phpinfo:
opcache.max_wasted_percentage 5
opcache.memory_consumption 96
opcache.force_restart_timeout 180
Used memory 31240312
Free memory 20386576
Wasted memory 49036408
Cached scripts 598
Cached keys 1112
Max keys 7963
OOM restarts 0
Hash keys restarts 0
Manual restarts 0

The percentage of wasted memory is way above 5% (it's at 48.71%), but a restart is not being done to reclaim the wasted memory. Shouldn't it auto restart to get wasted memory under 5%?
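For readers checking the 48.71% figure: it follows from the phpinfo numbers above, taking opcache.memory_consumption as megabytes (used + free + wasted adds up to exactly 96 MB here). A minimal sketch of that arithmetic; the function name `wasted_percentage` is made up for illustration and is not part of OPcache:

```c
#include <assert.h>
#include <stddef.h>

/* Percentage of the shared segment that is marked as wasted.
 * memory_consumption_mb mirrors opcache.memory_consumption (in megabytes).
 * Hypothetical helper for illustration only, not an OPcache function. */
double wasted_percentage(size_t wasted_bytes, size_t memory_consumption_mb)
{
    assert(memory_consumption_mb > 0);
    double total_bytes = (double)memory_consumption_mb * 1024.0 * 1024.0;
    return 100.0 * (double)wasted_bytes / total_bytes;
}
```

Calling `wasted_percentage(49036408, 96)` gives roughly 48.71, which matches the value reported above.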

TerryE commented 10 years ago

@Robfico, You can use a tiny php script triggered by a cron job and wget to do this using the OPcache API. Why bother to try to second-guess the policy of the local sysadmin by trying to embed this functionality in OPcache?

robfico commented 10 years ago

Sure, I can do this (I also manage the server). But I thought the docs state that if the memory wasted percentage exceeds "opcache.max_wasted_percentage 5" then it should auto restart:

The maximum percentage of wasted memory that is allowed before a restart is scheduled.

I'm just asking is this a bug, or do I not understand this setting?

iHeadRu commented 9 years ago

You have free memory, thus no need to restart the cache.

arjenschol commented 9 years ago

Key Value
Opcode Caching Up and Running
Optimization Enabled
Startup OK
Shared memory model mmap
Cache hits 63055239
Cache misses 238909
Used memory 133687408
Free memory 24
Wasted memory 530296
Interned Strings Used memory 6278120
Interned Strings Free memory 2110488
Cached scripts 5398
Cached keys 7000
Max keys 7963
OOM restarts 0
Hash keys restarts 0
Manual restarts 0

Free memory is 24 bytes, which is essentially out of memory, because no script could ever fit in 24 bytes. However, no OOM restart is performed. Is this only performed when we have exactly 0 bytes left?

arjenschol commented 9 years ago

Ah, even when out of memory, a restart is only scheduled when wasted_memory/memory_consumption > max_wasted_percentage.

Maybe change zend_accel_schedule_restart_if_necessary to always schedule a restart when OOM and wasted_memory > 0?
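To make the difference concrete, here is a rough sketch of the two rules as they are described in this thread. It assumes both checks only run after an allocation has already failed; the function names are invented for the comparison and this is not the actual body of zend_accel_schedule_restart_if_necessary:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Behaviour as described in this thread: after a failed allocation,
 * restart only if the wasted share exceeds max_wasted_percentage.
 * Hypothetical name; a model of the rule, not the real OPcache code. */
bool restart_current(size_t wasted, size_t total, double max_wasted_pct)
{
    assert(total > 0);
    return 100.0 * (double)wasted / (double)total > max_wasted_pct;
}

/* Proposed variant: after a failed allocation, restart whenever
 * any memory at all is wasted. */
bool restart_proposed(size_t wasted)
{
    return wasted > 0;
}
```

With the numbers from the table above (wasted 530296 out of a 134217728 byte segment, about 0.4%), `restart_current(530296, 134217728, 5.0)` is false while `restart_proposed(530296)` is true.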

nathanhruby commented 9 years ago

Hi,

I just discovered this same issue. A full cache as indicated by opcache doesn't issue a restart. I definitely support the idea of doing a restart / clear of the cache when it is full and either wasted percentage is non-zero or hit rate is less than some configurable threshold.

I see in the statistics section of opcache_get_status() there is an "oom_restarts" counter, which indicates to me that this should be possible.

Any thoughts?

TerryE commented 9 years ago

A restart causes a temporary slowdown as the cache is reprimed. There are no reuse or LRU-type algos in current opcache, so if the cache is just too small for the active "working set" of scripts, then a default restart policy will just cause repeated restarts. IMO, it is better to use a PHP admin script and the Opcache Management API to monitor this; any restarts can then be scheduled within the local sysadmin policy, alongside a review of Opcache SMA allocation policies.

zsprackett commented 9 years ago

What about adding another call in ZO like this one:

/* {{{ proto void opcache_invalidate(string $script [, bool $force = false])
   Invalidates cached script (if necessary or forced) */
static ZEND_FUNCTION(opcache_invalidate)

Instead of taking a filename, it could take a timestamp allowing you to invalidate the cache by least recently used. This could be more efficiently handled in the module than retrieving all the cache entries in PHP and iterating through them one by one to invalidate the cache.
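A rough sketch of what such a time-based pass might look like, working on a plain array rather than the real OPcache hash table. The `cache_entry` struct, its fields, and `invalidate_older_than` are all made up for illustration; nothing here is an existing OPcache API:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a cached script entry. */
struct cache_entry {
    const char *path;
    time_t last_used;   /* time of the most recent hit */
    int valid;          /* 1 while the entry may still be served */
};

/* Marks every valid entry last used before cutoff as invalid.
 * Returns the number of entries invalidated. */
size_t invalidate_older_than(struct cache_entry *entries, size_t n, time_t cutoff)
{
    assert(entries != NULL || n == 0);
    size_t invalidated = 0;
    for (size_t i = 0; i < n; i++) {
        if (entries[i].valid && entries[i].last_used < cutoff) {
            entries[i].valid = 0;
            invalidated++;
        }
    }
    return invalidated;
}
```

This only shows the selection step in a single pass. Whether the memory of the invalidated entries could then be reused is a separate question.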

SjonHortensius commented 9 years ago

@TerryE you cannot seriously suggest a script monitoring the statistics should be responsible for a cache that is unable to implement a proper LRU flushing mechanism. Currently, old files will clog the cache even though those files are never looked at again, while new files that actually need caching will be rejected.

nathanhruby commented 9 years ago

Hi Terry,

In this particular case I am the local sysadmin and the policy I was hoping to set was "dump the cache when full and hitrate is below a low watermark" because, while a coarse tool for cache management, it's easier to implement as a stopgap versus a proper LRU cleaner (which would be the ideal solution).

I'm aware of the performance impact of having to rebuild the cache as well as the fact that this strategy will cause multiple restarts over time when the working set is dramatically larger than available cache size. We see this with APC already and have factored that into our designs as best we can.

IMHO, having individual sites write cache management code to watch behind opcache's back doesn't seem sensible. Opcache needs some better tools to manage this fairly common situation. Zac's comment about a time based variant of opcache_invalidate() seems like a good step in the correct direction, as it seems like the data is already in the hash bucket for each entry.
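The stopgap policy quoted above ("dump the cache when full and hitrate is below a low watermark") could be expressed roughly as follows. This is a sketch under stated assumptions: "full" is approximated as free memory below a configurable floor, and every name and parameter here is hypothetical rather than an existing OPcache setting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hit rate in [0, 1]; returns 1.0 when nothing has been requested yet. */
double hit_rate(unsigned long long hits, unsigned long long misses)
{
    unsigned long long total = hits + misses;
    return total == 0 ? 1.0 : (double)hits / (double)total;
}

/* "Full" is approximated as: less free memory than min_free_bytes.
 * low_watermark is a hit-rate fraction, e.g. 0.90 for 90%. */
bool should_dump_cache(size_t free_bytes, size_t min_free_bytes,
                       unsigned long long hits, unsigned long long misses,
                       double low_watermark)
{
    assert(low_watermark >= 0.0 && low_watermark <= 1.0);
    return free_bytes < min_free_bytes
        && hit_rate(hits, misses) < low_watermark;
}
```

With the figures posted earlier in the thread (24 bytes free, 63055239 hits, 238909 misses) the hit rate is above 99%, so a 90% watermark would not trigger a dump even though the cache is full.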

TerryE commented 9 years ago

@SjonHortensius, Not quite. I am saying that Opcache currently has a feature to restrict admin scripts to a known location path, and that such scripts need not be limited to monitoring only. AFAIK, this is the only mechanism within Opcache to achieve the desired functionality.

IMO the two big functional gaps in Opcache are: (i) the Zend engine 2 and 3 op_array formats are not PIC, so op_arrays can only be moved with some relocation algo; (ii) the Zend engine has no lightweight in-use marking scheme, so it can't detect op_arrays that are no longer in use and mark them as disused. Without (ii) we can't easily tag dead or inactive script op_arrays for reuse; and (i) means that we need to be careful about SMA reallocation.

In the late 80s I used to develop a network OS file-system which used a similar allocation scheme to the Opcache SMA allocator. IMO, there is no reasonable argument for doing this today -- other than it was simple to implement and fast.

I sorted the relocation issues for my persistent MLC opcache fork (this is now moribund as I just couldn't get traction with the dev team on this -- phpng was their priority). Also note that (i) means that SMAs can't be used safely on Windows architectures, IMO.

I am taking a break from Opcache contributions at the moment, as I felt that I couldn't achieve an effective level in terms of contributions that would make it to the core, to make this effort worthwhile for me. (I am also too busy building myself a new house.) However, this is something that I might come back to later if the team don't pick it up in the meantime.

craigjbass commented 8 years ago

Without going full Linus, this seems like a weird response.

1) Wasted memory should be reclaimed by PHP.
2) A sysadmin should not expect to have to reclaim memory manually that is managed by PHP.
3) I'll repeat that again. This is not the job of the sysadmin.

haydenjames commented 8 years ago

I think only if free memory hits 0 then it will restart? In which case the wasted memory setting is pointless because opcache always restarts when it runs out of memory. Am I misunderstanding?

craigjbass commented 8 years ago

@haydenjames we have experienced segfaults in production due to this. The only cure is to restart php-fpm and hope the failure does not occur on too many nodes at once!

SjonHortensius commented 8 years ago

@haydenjames there are 2 issues here:

It seems terrible to need to fix this in userspace but apparently this won't be fixed otherwise.

craigjbass commented 8 years ago

@SjonHortensius @haydenjames at least for our usecase, Opcache should flush if it fails to add a file and try again. Generally a flush removes a lot of dead entries, especially when you use the Red-Green deployment pattern.

JohanTan commented 8 years ago

+1 on this. It appears to persist on PHP 7 as well. We even ran into the issue in production, where there was very little free memory but opcache still did not restart; instead it got turned off, which caused the CPU to skyrocket and everything to slow down by orders of magnitude.

image

In the above result we have opcache.max_wasted_percentage=5. But as you can see it's hitting over 57% and still has not restarted. Reloading/restarting php-fpm fixed the problem. But I agree with Craig, this should not have to be handled manually by a sysadmin.

jetscale commented 8 years ago

+1. Currently experiencing the same behaviour.

sparc commented 8 years ago

+1 It's really surprising that there is no way to invalidate old cache entries and make the memory they used available again.

pixelchutes commented 8 years ago

I believe I'm currently facing this scenario as well. PHP 5.6.19 / OPcache 7.0.6-dev.

In my case, something appears to "go wrong" (still trying to pinpoint) with one of the php-fpm pool processes (pid=1557). When OPcache eventually runs out of free_memory / surpasses the current_wasted_percentage threshold, it schedules an oom_restart. (But it's permanently stuck in the restart_pending=true state!)

It seems OPcache is aware of the problem, and even attempts to kill the impacted process preventing a successful opcache restart (e.g. killed locker in error logs)

[08-Jun-2016 13:50:23] WARNING: [pool www] child 1557 said into stderr: "*** Error in `php-fpm: pool www': realloc(): invalid next size: 0x00000000034ab230 ***"
...
[09-Jun-2016 13:07:12] WARNING: [pool www] child 29169 said into stderr: "Thu Jun 9 13:07:12 2016 (29169): Error Killed locker 1557"

However, the impacted process is not actually killed. When this happens, the only way to resolve so far has been to force-quit php-fpm, or send "SIGKILL" signal to the impacted php process (opcache locker?) e.g. kill -9 1557

Immediately upon killing the impacted process(es), OPcache proceeds with its automatic scheduled restart, and the memory is reclaimed. Unfortunately, this currently requires manual intervention.

artursitarski commented 8 years ago

+1

bradjorgensen commented 7 years ago

+1

poolerMF commented 7 years ago

+1

jliebert commented 7 years ago

+1 I believe I'm currently facing this scenario as well. PHP 5.5.37-1~dotdeb+7.1 with OpCache 7.0.6-dev

chowhwei commented 7 years ago

+1

gessulatgessulat commented 6 years ago

+1

komapa commented 6 years ago

+1

if-kenn commented 6 years ago

Wow, just discovered this fun. With modern frameworks like Symfony that literally use thousands of PHP files, this is a show stopper.

TerryE commented 6 years ago

Certainly as of the pre PHP 7.0 versions (and I anticipate the same of the current versions), "Wasted memory" is a misnomer. When modules are replaced with a new version, any further includes of that resolved path will bind to the new version and deep copy its R/W data. However, the old version(s) may still be bound into and used by other executing PHP scripts. Even if we had a new "available" category, the PHP engine doesn't implement reference counting, mark and sweep, or any other mechanism to track which processes or threads are using which opcache resources.

I had a beta version working with 5.6 where the separate PHP threads "opened" and "closed" their connection to the SMA, and given that script life is typically in seconds and rarely longer than 1 minute, you could then make the conservative assumption that any wasted script older than the oldest running PHP thread could be safely moved from wasted to available, and this worked fine. The issue that this left was one of the overhead of fragmented storage management, which is standard and quite solvable.
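The conservative rule described above can be sketched in a few lines. This is an illustration of the idea only, not code from that beta: the `wasted_block` struct, its fields, and `reclaimable_bytes` are hypothetical, and it assumes each wasted region records when its script was superseded:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical record for one wasted region of the shared segment. */
struct wasted_block {
    time_t wasted_at;   /* when the script was superseded */
    size_t size;        /* bytes occupied */
};

/* Sums the wasted blocks that were superseded before the oldest
 * still-running worker started. A worker that started after that point
 * can only have bound to the newer version, so these blocks could be
 * moved from "wasted" to "available". */
size_t reclaimable_bytes(const struct wasted_block *blocks, size_t n,
                         time_t oldest_worker_start)
{
    assert(blocks != NULL || n == 0);
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].wasted_at < oldest_worker_start) {
            total += blocks[i].size;
        }
    }
    return total;
}
```

Blocks wasted at or after the oldest worker's start time are left alone, since that worker may still be executing the old version.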

There is nothing in principle stopping the core Zend team doing this, nor for that matter implementing a file-based opcache to address the shared market / cgi sector, other than funding and priorities, of course. But I ultimately decided that it really is a waste of my time as a community contributor trying to work with the Zend team, as information flow is only one-way and you can find months of work on a contribution suddenly rendered out of date by one of their reveals.

So as far as I know they have made zero progress on this in the 4 years this issue has been known and festering. But we won't know any difference until some future PHP 7.x release includes an announcement that Opcache now includes space recovery.

Dmitry @dstogov is the one who does know what is happening

PS. Factual correction as pointed out by The man :clap:

rlerdorf commented 6 years ago

The file cache option has been available for years. See opcache.file_cache and related settings.

TerryE commented 6 years ago

@rlerdorf Rasmus, my apologies, and thanks for picking this up. This shows you how out of contact I am. :scream: I have corrected the point. My thanks to the Zend team for doing this. It's good to know that my prototype (8d3ad42) and performance data gave Dmitry the impetus to re-implement this and carry on this work (3abde43). Incidentally, whilst I mostly use dedicated and VPS platforms, I do use two shared service providers that offer a php7.x-cgi execution model and neither enables file-based Opcache, which is why I missed this (though I've just checked a 3rd and it does). Perhaps a bit more evangelism of this feature is needed.

However, this aside isn't directly relevant to the topic of this thread. IMO, the lack of resource reclamation in Opcache is a major scaling issue, and one that it would be sensible to address. It is an issue that is quite tractable and without any material performance impact so long as the configuration limits script execution times. SMA fragmentation can also be addressed by a sweep and bounce algo to remove fragmentation blockers, and here there would be a role for a second level file-cache.

This isn't a big job -- maybe a couple of months' work for a bright programmer under supervision, and most of this would be learning curve on the Zend core and Opcache.