nextcloud / all-in-one

📦 The official Nextcloud installation method. Provides easy deployment and maintenance with most features included in this one Nextcloud instance.
https://hub.docker.com/r/nextcloud/all-in-one
GNU Affero General Public License v3.0

PHP OPcache module not configured correctly - error message after update #2755

Closed GUNT0815 closed 1 year ago

GUNT0815 commented 1 year ago

Steps to reproduce

update to NC-AIO 6/26.0.2

Expected behavior

The PHP OPcache module is configured correctly

Actual behavior

"The PHP OPcache module is not configured correctly. See the documentation ↗ for more information. The OPcache buffer is nearly full. To ensure that all scripts can be kept in the cache, it is recommended to set opcache.memory_consumption in your PHP configuration to a value higher than 128."

szaimen commented 1 year ago

cc @MichaIng should we increase this to 256?

MichaIng commented 1 year ago

@GUNT0815 Are you punkyard in the Nextcloud forum or are two people having the same issue?

Thing is: It should be impossible for Nextcloud to fill 128 MiB of OPcache, even with all apps installed and a 32 MiB interned strings buffer. Is there a way you can hack the container, put opcache-gui inside, access it via browser and check whether:

  1. The OPcache size really is 128 MiB
  2. It confirms >90% usage
  3. Only Nextcloud scripts are cached, or is there something else shipped with the AIO container?

szaimen commented 1 year ago

  • The OPcache size really is 128 MiB

It should be, since it is the default, no? (We don't override the default that is set by PHP.)

  • Only Nextcloud scripts are cached, or is there something else shipped with the AIO container?

Only Nextcloud runs within the Nextcloud container which uses PHP.

MichaIng commented 1 year ago

It should be indeed, but it is still good to check. It was some Nextcloud versions ago, but I did test how much OPcache is used when enabling literally all available apps from the app store (those that do not conflict with each other), accessing all their GUIs and leaving the instance running for a while. I was only able to raise usage to around 64 MiB, including the interned strings buffer. So it is hard to believe that suddenly >100 MiB is used by a real, non-contrived Nextcloud instance without some unintended cause.

szaimen commented 1 year ago

I get the following stats on my aio-testing instance: [two screenshots]

szaimen commented 1 year ago

Based on my stats above it does not look impossible to reach the limit. WDYT @MichaIng?

szaimen commented 1 year ago

Maybe increasing to 192 would make most sense? Then interned_strings_buffer should be 48 to silence any warning.
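
The pairing suggested here (192 with 48, like the 128 with 32 mentioned earlier in the thread) keeps the interned strings buffer at a quarter of opcache.memory_consumption. A minimal Python sketch of that arithmetic follows. The quarter ratio is an assumption read off those two pairs, not a rule stated in the thread, and the function name is hypothetical.

```python
def interned_strings_buffer_for(memory_consumption_mib: int) -> int:
    """Return an interned strings buffer size (MiB) at one quarter of the OPcache size.

    Assumption: the 1:4 ratio is inferred from the 128/32 and 192/48 pairs
    mentioned in this thread; it is not taken from PHP or Nextcloud docs.
    """
    if memory_consumption_mib <= 0:
        raise ValueError("memory_consumption_mib must be positive")
    return memory_consumption_mib // 4


# Candidate sizes discussed in the thread: the default, 192, and a doubling to 256.
for size in (128, 192, 256):
    print(
        f"opcache.memory_consumption={size} -> "
        f"opcache.interned_strings_buffer={interned_strings_buffer_for(size)}"
    )
```

This only reproduces the numbers being discussed; whether a given pair actually silences the warning depends on the Nextcloud version's checks.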

MichaIng commented 1 year ago

Whoa, that is a lot. I have one NC 25.0.7 instance with 38.97 MiB OPcache usage, the largest cached script being 3rdparty/scssphp/scssphp/src/Compiler.php with 774 KiB:

  - activity: 2.17.0
  - calendar: 4.4.2
  - circles: 25.0.0
  - cloud_federation_api: 1.8.0
  - comments: 1.15.0
  - contactsinteraction: 1.6.0
  - dashboard: 7.5.0
  - dav: 1.24.0
  - federatedfilesharing: 1.15.0
  - federation: 1.15.0
  - files: 1.20.1
  - files_pdfviewer: 2.6.0
  - files_rightclick: 1.4.0
  - files_sharing: 1.17.0
  - files_trashbin: 1.15.0
  - files_versions: 1.18.0
  - groupfolders: 13.1.3
  - logreader: 2.10.0
  - lookup_server_connector: 1.13.0
  - notifications: 2.13.1
  - oauth2: 1.13.0
  - photos: 2.0.1
  - privacy: 1.9.0
  - provisioning_api: 1.15.0
  - ransomware_protection: 1.14.0
  - related_resources: 1.0.4
  - richdocuments: 7.1.4
  - richdocumentscode: 23.5.5
  - serverinfo: 1.15.0
  - settings: 1.7.0
  - sharebymail: 1.15.0
  - spreed: 15.0.6
  - survey_client: 1.13.0
  - systemtags: 1.15.0
  - text: 3.6.0
  - theming: 2.0.1
  - twofactor_backupcodes: 1.14.0
  - updatenotification: 1.15.0
  - user_status: 1.5.0
  - viewer: 1.9.0
  - weather_status: 1.5.0
  - workflowengine: 2.7.0

The second is an NC 27.0.0 instance with 31.63 MiB OPcache usage, the largest cached script being apps/dav/lib/CalDAV/CalDavBackend.php with 312 KiB (which is the second largest on the first instance):

  - activity: 2.19.0
  - calendar: 4.4.2
  - cloud_federation_api: 1.10.0
  - contacts: 5.3.1
  - dashboard: 7.7.0
  - dav: 1.27.0
  - federatedfilesharing: 1.17.0
  - files: 1.22.0
  - files_rightclick: 1.6.0
  - files_trashbin: 1.17.0
  - files_versions: 1.20.0
  - impersonate: 1.14.0
  - logreader: 2.12.0
  - lookup_server_connector: 1.15.0
  - notes: 4.8.0
  - notifications: 2.15.0
  - notify_push: 0.6.3
  - oauth2: 1.15.0
  - photos: 2.3.0
  - provisioning_api: 1.17.0
  - ransomware_protection: 1.14.0
  - settings: 1.9.0
  - survey_client: 1.15.0
  - tasks: 0.15.0
  - theming: 2.2.0
  - twofactor_backupcodes: 1.16.0
  - updatenotification: 1.17.0
  - viewer: 2.1.0
  - workflowengine: 2.9.0

Which apps does the NC AIO have installed, so I can try to replicate this with a bare-metal clone? I can also set up an actual NC AIO and see whether the usage is several times higher right from the start, and whether I can find out which app causes this.

Ah, one thing to test, as I am not sure how the JIT buffer is counted into OPcache usage: reduce it from 128 MiB to e.g. 4 MiB. The dietpi.com server runs WordPress and Matomo, and previously also phpBB, all on the same PHP-FPM instance and pool, and it never used more than 1 MiB for JIT, quite similar to your stats. So actual JIT usage is very small, and if e.g. its max size is partly allocated from the overall OPcache memory, like the interned strings buffer, this could explain the high usage while large parts are simply empty.

Otherwise, since the interned strings buffer only really works with powers of 2 (it shows strange meta/data space values otherwise), I would double both values to mitigate the issue until we find out more.
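
The power-of-two concern can be checked mechanically. The sketch below is illustrative only (the function names are hypothetical): it tests whether a candidate interned strings buffer size in MiB is a power of two and shows what doubling the 128/32 pair gives.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def doubled(memory_consumption_mib: int, interned_strings_buffer_mib: int) -> tuple:
    """Double both OPcache sizes, as proposed in the comment above."""
    return memory_consumption_mib * 2, interned_strings_buffer_mib * 2


# 48 MiB was the value floated for a 192 MiB cache; 64 MiB results from doubling 32.
print(is_power_of_two(48))
print(doubled(128, 32))
print(is_power_of_two(64))
```

Under this check, 48 is rejected while the doubled pair (256, 64) keeps the interned strings buffer on a power of two.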

szaimen commented 1 year ago

Which apps does the NC AIO have installed, so I can try to replicate this with a bare-metal clone?

  - activity: 2.18.0
  - admin_audit: 1.16.0
  - calendar: 4.4.1
  - circles: 26.0.0
  - cloud_federation_api: 1.9.0
  - comments: 1.16.0
  - contacts: 5.3.0
  - contactsinteraction: 1.7.0
  - dashboard: 7.6.0
  - dav: 1.25.0
  - deck: 1.9.2
  - federatedfilesharing: 1.16.0
  - federation: 1.16.0
  - files: 1.21.1
  - files_pdfviewer: 2.7.0
  - files_rightclick: 1.5.0
  - files_sharing: 1.18.0
  - files_trashbin: 1.16.0
  - files_versions: 1.19.1
  - firstrunwizard: 2.15.0
  - logreader: 2.11.0
  - lookup_server_connector: 1.14.0
  - nextcloud-aio: 0.3.0
  - nextcloud_announcements: 1.15.0
  - notifications: 2.14.0
  - notify_push: 0.6.3
  - oauth2: 1.14.0
  - password_policy: 1.16.0
  - photos: 2.2.0
  - privacy: 1.10.0
  - provisioning_api: 1.16.0
  - recommendations: 1.5.0
  - related_resources: 1.1.0-alpha1
  - richdocuments: 8.0.2
  - serverinfo: 1.16.0
  - settings: 1.8.0
  - sharebymail: 1.16.0
  - spreed: 16.0.4
  - support: 1.9.0
  - survey_client: 1.14.0
  - systemtags: 1.16.0
  - tasks: 0.15.0
  - text: 3.7.2
  - theming: 2.1.1
  - twofactor_backupcodes: 1.15.0
  - twofactor_nextcloud_notification: 3.7.0
  - twofactor_totp: 8.0.0
  - user_status: 1.6.0
  - viewer: 1.10.0
  - weather_status: 1.6.0
  - workflowengine: 2.8.0

I can also set up an actual NC AIO and see whether the usage is several times higher right from the start, and whether I can find out which app causes this.

Yes, feel free to :)

Ah, one thing to test, as I am not sure how the JIT buffer is counted into OPcache usage: reduce it from 128 MiB to e.g. 4 MiB. The dietpi.com server runs WordPress and Matomo, and previously also phpBB, all on the same PHP-FPM instance and pool, and it never used more than 1 MiB for JIT, quite similar to your stats. So actual JIT usage is very small, and if e.g. its max size is partly allocated from the overall OPcache memory, like the interned strings buffer, this could explain the high usage while large parts are simply empty.

I see. I can do that.

Otherwise, since the interned strings buffer only really works with powers of 2 (it shows strange meta/data space values otherwise), I would double both values to mitigate the issue until we find out more.

I see. Let's try to decrease JIT first then.

GUNT0815 commented 1 year ago

@GUNT0815 Are you punkyard in the Nextcloud forum or are two people having the same issue?

Thing is: It should be impossible for Nextcloud to fill 128 MiB of OPcache, even with all apps installed and a 32 MiB interned strings buffer. Is there a way you can hack the container, put opcache-gui inside, access it via browser and check whether:

  1. The OPcache size really is 128 MiB
  2. It confirms >90% usage
  3. Only Nextcloud scripts are cached, or is there something else shipped with the AIO container?

No, I do not access any container directly. I have a large number of JPGs on Nextcloud and use the Recognize and Memories apps.

This is what I get from the NextcloudAdminGuiStatistics:

<ocs>
<meta>
<status>ok</status>
<statuscode>200</statuscode>
<message>OK</message>
...
</meta>
<data>
<nextcloud>
<system>
<version>26.0.2.1</version>
<theme>none</theme>
<enable_avatars>yes</enable_avatars>
<enable_previews>yes</enable_previews>
<memcache.local>\OC\Memcache\APCu</memcache.local>
<memcache.distributed>\OC\Memcache\Redis</memcache.distributed>
<filelocking.enabled>yes</filelocking.enabled>
<memcache.locking>\OC\Memcache\Redis</memcache.locking>
<debug>no</debug>
<freespace>322491920384</freespace>
<cpuload>
<element>1.93701171875</element>
<element>2.4091796875</element>
<element>2.2216796875</element>
...
</cpuload>
<mem_total>32638976</mem_total>
<mem_free>21672960</mem_free>
<swap_total>0</swap_total>
<swap_free>0</swap_free>
<apps>
<num_installed>66</num_installed>
<num_updates_available>2</num_updates_available>
<app_updates>
<contacts>5.3.1</contacts>
<mail>3.2.1</mail>
...
</app_updates>
...
</apps>
...
</system>
<storage>
<num_users>6</num_users>
<num_files>1405560</num_files>
<num_storages>22</num_storages>
<num_storages_local>1</num_storages_local>
<num_storages_home>9</num_storages_home>
<num_storages_other>12</num_storages_other>
...
</storage>
<shares>
<num_shares>3</num_shares>
<num_shares_user>1</num_shares_user>
<num_shares_groups>0</num_shares_groups>
<num_shares_link>2</num_shares_link>
<num_shares_mail>0</num_shares_mail>
<num_shares_room>0</num_shares_room>
<num_shares_link_no_password>2</num_shares_link_no_password>
<num_fed_shares_sent>0</num_fed_shares_sent>
<num_fed_shares_received>0</num_fed_shares_received>
<permissions_0_31>1</permissions_0_31>
<permissions_3_17>2</permissions_3_17>
...
</shares>
...
</nextcloud>
<server>
<webserver>Apache/2.4.57 (Unix)</webserver>
<php>
<version>8.1.19</version>
<memory_limit>1073741824</memory_limit>
<max_execution_time>7200</max_execution_time>
<upload_max_filesize>34359738368</upload_max_filesize>
<opcache>
<opcache_enabled>1</opcache_enabled>
<cache_full/>
<restart_pending/>
<restart_in_progress/>
<memory_usage>
<used_memory>103305712</used_memory>
<free_memory>30912016</free_memory>
<wasted_memory>0</wasted_memory>
<current_wasted_percentage>0</current_wasted_percentage>
...
</memory_usage>
<interned_strings_usage>
<buffer_size>25165360</buffer_size>
<used_memory>6981424</used_memory>
<free_memory>18183936</free_memory>
<number_of_strings>107233</number_of_strings>
...
</interned_strings_usage>
<opcache_statistics>
<num_cached_scripts>2774</num_cached_scripts>
<num_cached_keys>5305</num_cached_keys>
<max_cached_keys>16229</max_cached_keys>
<hits>58227976</hits>
<start_time>1686730223</start_time>
<last_restart_time>0</last_restart_time>
<oom_restarts>0</oom_restarts>
<hash_restarts>0</hash_restarts>
<manual_restarts>0</manual_restarts>
<misses>2776</misses>
<blacklist_misses>0</blacklist_misses>
<blacklist_miss_ratio>0</blacklist_miss_ratio>
<opcache_hit_rate>99.995232759488</opcache_hit_rate>
...
</opcache_statistics>
<jit>
<enabled>1</enabled>
<on>1</on>
<kind>5</kind>
<opt_level>5</opt_level>
<opt_flags>6</opt_flags>
<buffer_size>134217712</buffer_size>
<buffer_free>133132944</buffer_free>
...
</jit>
...
</opcache>
<apcu>
<cache>
<num_slots>4099</num_slots>
<ttl>0</ttl>
<num_hits>9620689</num_hits>
<num_misses>7344</num_misses>
<num_inserts>6690</num_inserts>
<num_entries>1020</num_entries>
<expunges>0</expunges>
<start_time>1686730223</start_time>
<mem_size>688968</mem_size>
<memory_type>mmap</memory_type>
...
</cache>
<sma>
<num_seg>1</num_seg>
<seg_size>33554312</seg_size>
<avail_mem>32797352</avail_mem>
...
</sma>
...
</apcu>
<extensions>
<element>Core</element>
<element>date</element>
<element>libxml</element>
<element>openssl</element>
<element>pcre</element>
<element>sqlite3</element>
<element>zlib</element>
<element>ctype</element>
<element>curl</element>
<element>dom</element>
<element>fileinfo</element>
<element>filter</element>
<element>ftp</element>
<element>hash</element>
<element>iconv</element>
<element>json</element>
<element>mbstring</element>
<element>SPL</element>
<element>session</element>
<element>PDO</element>
<element>pdo_sqlite</element>
<element>bz2</element>
<element>posix</element>
<element>readline</element>
<element>Reflection</element>
<element>standard</element>
<element>SimpleXML</element>
<element>tokenizer</element>
<element>xml</element>
<element>xmlreader</element>
<element>xmlwriter</element>
<element>mysqlnd</element>
<element>cgi-fcgi</element>
<element>apcu</element>
<element>bcmath</element>
<element>Phar</element>
<element>exif</element>
<element>gd</element>
<element>gmp</element>
<element>imagick</element>
<element>imap</element>
<element>intl</element>
<element>ldap</element>
<element>memcached</element>
<element>pcntl</element>
<element>pdo_pgsql</element>
<element>pgsql</element>
<element>redis</element>
<element>smbclient</element>
<element>sodium</element>
<element>sysvsem</element>
<element>zip</element>
<element>libsmbclient</element>
<element>Zend OPcache</element>
...
</extensions>
...
</php>
<database>
<type>pgsql</type>
<version>PostgreSQL 15.3 on x86_64-pc-linux-musl, compiled by gcc (Alpine 12.2.1_git20220924-r10) 12.2.1 20220924, 64-bit</version>
<size>2008068911</size>
...
</database>
...
</server>
<activeUsers>
<last5minutes>2</last5minutes>
<last1hour>2</last1hour>
<last24hours>2</last24hours>
...
</activeUsers>
...
</data>
...
</ocs>
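
The OPcache figures in a dump like the one above are raw byte counts, which makes them hard to compare with the MiB values discussed in this thread. The following Python sketch converts them. It works on a trimmed sample whose numbers are copied from the report above; the function name and the returned keys are hypothetical, and treating used + free + wasted as the total cache size is an assumption about how these fields relate.

```python
import xml.etree.ElementTree as ET

MIB = 1024 * 1024

# Trimmed sample: element names and numbers are copied from the report above.
SAMPLE = """<opcache>
  <memory_usage>
    <used_memory>103305712</used_memory>
    <free_memory>30912016</free_memory>
    <wasted_memory>0</wasted_memory>
  </memory_usage>
  <interned_strings_usage>
    <buffer_size>25165360</buffer_size>
    <used_memory>6981424</used_memory>
  </interned_strings_usage>
  <jit>
    <buffer_size>134217712</buffer_size>
    <buffer_free>133132944</buffer_free>
  </jit>
</opcache>"""


def summarize(opcache_xml: str) -> dict:
    """Convert the byte counts of an <opcache> element into MiB and percentages."""
    root = ET.fromstring(opcache_xml)

    def num(path: str) -> int:
        return int(root.findtext(path))

    used = num("memory_usage/used_memory")
    free = num("memory_usage/free_memory")
    wasted = num("memory_usage/wasted_memory")
    total = used + free + wasted  # assumption: these three add up to the cache size

    strings_size = num("interned_strings_usage/buffer_size")
    strings_used = num("interned_strings_usage/used_memory")
    jit_used = num("jit/buffer_size") - num("jit/buffer_free")

    return {
        "total_mib": total / MIB,
        "used_mib": used / MIB,
        "used_pct": 100 * used / total,
        "interned_used_pct": 100 * strings_used / strings_size,
        "jit_used_mib": jit_used / MIB,
    }


for key, value in summarize(SAMPLE).items():
    print(f"{key}: {value:.2f}")
```

For the values above this gives a 128 MiB cache at roughly 77% usage, with about 1 MiB of the JIT buffer in use.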

MichaIng commented 1 year ago

I set up a test instance with the exact same apps and JIT enabled. Even before all apps were enabled, once JIT was on, OPcache usage went up step by step while browsing through the Nextcloud UI, up to the maximum of 64 MiB (our installer limits it). Reducing the JIT buffer size to 4 MiB did not help, but disabling JIT did.

So with:

With:

So I was able to bring the exact same 2350 files into the cache, but with JIT enabled, even with only 2 MiB size (only 1 MiB is shown as used anyway), the OPcache usage went up significantly.

I suspect the following:

@szaimen Reducing the JIT buffer size to 4 MiB does not hurt, but as shown above it does not help either. As JIT is a great performance boost, it is probably better to instead increase the OPcache memory size (and the interned strings buffer, to mute its warning in all cases).

But for testing, you could generate a test container image from a feature branch with JIT disabled, or disable it right within the container image via opcache.jit=off, and see whether you can replicate my test results. If so, I would report/ask about this at the PHP bug tracker. While at it, I can also ask about the constantly growing interned strings buffer in some cases, how to get more info/debug it, etc.

MichaIng commented 1 year ago

I also tested the 1255 JIT mode used by AIO compared to 1254 (=tracing/on/default), which differs in the "optimization level": https://www.php.net/manual/en/opcache.configuration.php#ini.opcache.jit

O (optimization level):

  - 0: No JIT
  - 1: Minimal JIT (call standard VM handlers)
  - 2: Inline VM handlers
  - 3: Use type inference
  - 4: Use call graph
  - 5: Optimize whole script

So the difference is "Use call graph" (4) vs "Optimize whole script" (5). While the latter sounds like a candidate for higher memory usage, it had no effect here: the result was exactly the same as with 1254.
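
To make the comparison of 1255 and 1254 concrete, the sketch below splits a four-digit opcache.jit value into its digits. The optimization level names are taken from the list quoted above; the meaning of the other three digits (CPU-specific flags, register allocation, trigger) follows the CRTO layout described on the linked PHP manual page. The function and key names are hypothetical.

```python
# Optimization levels as quoted from the PHP manual in the comment above.
OPT_LEVELS = {
    0: "No JIT",
    1: "Minimal JIT (call standard VM handlers)",
    2: "Inline VM handlers",
    3: "Use type inference",
    4: "Use call graph",
    5: "Optimize whole script",
}


def decode_jit(value: int) -> dict:
    """Split a numeric opcache.jit value (CRTO) into its four digits."""
    if not 0 <= value <= 9999:
        raise ValueError("expected a value with at most four digits")
    c, r, t, o = (int(d) for d in f"{value:04d}")
    return {
        "cpu_flags": c,            # C: CPU-specific optimization flags
        "register_allocation": r,  # R: register allocation
        "trigger": t,              # T: trigger
        "optimization_level": o,   # O: optimization level
    }


for mode in (1255, 1254):
    decoded = decode_jit(mode)
    level = decoded["optimization_level"]
    print(mode, decoded, "->", OPT_LEVELS.get(level, "unknown"))
```

The two modes differ only in the last digit, which is the optimization level compared in this test.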

punkyard commented 1 year ago

hi, yes @MichaIng, punkyard from NC forum

szaimen commented 1 year ago

@MichaIng thanks a lot for the in-detail analysis! I would then vote for increasing the opcache overall to double its current value and also increase interned_strings_buffer :)

szaimen commented 1 year ago

This is now released with v6.2.0 Beta. Testing and feedback are welcome! See https://github.com/nextcloud/all-in-one#how-to-switch-the-channel

elephantastyczny commented 1 year ago

Guys, I have a pretty big site to verify, so if you tell me how to do those magic tricks, I can give you all the data needed...