nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0
27.49k stars 4.08k forks source link

Change default memcached timeout values #14995

Open cromefire opened 5 years ago

cromefire commented 5 years ago

Steps to reproduce

  1. Have memcached configured
  2. (Have sentry configured, to catch the error, but the nextcloud.log also works)
  3. From time to time the error comes up

Expected behaviour

The error should be handled in anyway (retry, skip cache)

Actual behaviour

It seems to cause Internal server errors

Server configuration

Operating system: Linux

Web server: PHP-FPM/Apache

Database: MariaDB

PHP version: 7.2

Nextcloud version: 15.0.6 (but also earlier versions)

Updated from an older Nextcloud/ownCloud or fresh install: Fresh

Where did you install Nextcloud from: Tarball

Signing status:

Signing status ``` No errors have been found. ```

List of activated apps:

App list ``` - accessibility: 1.1.0 - activity: 2.8.2 - admin_audit: 1.5.0 - apporder: 0.6.0 - bruteforcesettings: 1.3.0 - checksum: 0.4.2 - cloud_federation_api: 0.1.0 - comments: 1.5.0 - contacts: 3.0.5 - dashboard: 6.0.0 - dav: 1.8.1 - deck: 0.5.2 - encryption: 2.3.0 - federatedfilesharing: 1.5.0 - federation: 1.5.0 - files: 1.10.0 - files_external: 1.6.0 - files_fulltextsearch: 1.2.6 - files_markdown: 2.0.6 - files_mindmap: 0.0.12 - files_pdfviewer: 1.4.0 - files_sharing: 1.7.0 - files_texteditor: 2.7.0 - files_trashbin: 1.5.0 - files_versions: 1.8.0 - files_videoplayer: 1.4.0 - firstrunwizard: 2.4.0 - fulltextsearch: 1.2.5 - fulltextsearch_elasticsearch: 1.2.4 - gallery: 18.2.0 - gpxedit: 0.0.10 - jsloader: 1.0.2 - logreader: 2.0.0 - lookup_server_connector: 1.3.0 - mail: 0.11.0 - music: 0.9.4 - news: 13.1.5 - nextcloud_announcements: 1.4.0 - notes: 2.5.1 - notifications: 2.3.0 - oauth2: 1.3.0 - password_policy: 1.5.0 - phonetrack: 0.5.0 - polls: 0.10.2 - previewgenerator: 2.1.0 - provisioning_api: 1.5.0 - quota_warning: 1.4.0 - ransomware_protection: 1.3.0 - richdocuments: 3.2.4 - sentry: 4.0.0 - serverinfo: 1.5.0 - sharebymail: 1.5.0 - spreed: 5.0.2 - support: 1.0.0 - survey_client: 1.3.0 - systemtags: 1.5.0 - tasks: 0.9.8 - theming: 1.6.0 - theming_customcss: 1.2.0 - twofactor_backupcodes: 1.4.1 - unsplash: 1.1.3 - updatenotification: 1.5.0 - workflowengine: 1.5.0 ```

Nextcloud configuration:

Config report ``` { "system": { "instanceid": "***REMOVED SENSITIVE VALUE***", "passwordsalt": "***REMOVED SENSITIVE VALUE***", "secret": "***REMOVED SENSITIVE VALUE***", "trusted_domains": [ "***REMOVED SENSITIVE VALUE***", "***REMOVED SENSITIVE VALUE***", "***REMOVED SENSITIVE VALUE***" ], "datadirectory": "***REMOVED SENSITIVE VALUE***", "dbtype": "mysql", "version": "15.0.6.1", "overwrite.cli.url": "***REMOVED SENSITIVE VALUE***", "htaccess.RewriteBase": "\/", "dbname": "***REMOVED SENSITIVE VALUE***", "dbhost": "***REMOVED SENSITIVE VALUE***", "dbport": "3306", "dbtableprefix": "oc_", "mysql.utf8mb4": true, "dbuser": "***REMOVED SENSITIVE VALUE***", "dbpassword": "***REMOVED SENSITIVE VALUE***", "installed": true, "twofactor_enforced": "false", "twofactor_enforced_groups": [], "twofactor_enforced_excluded_groups": [], "memcache.local": "\\OC\\Memcache\\Memcached", "memcache.distributed": "\\OC\\Memcache\\Memcached", "memcached_servers": [ [ "localhost", 11211 ] ], "filelocking.enabled": true, "memcache.locking": "\\OC\\Memcache\\Redis", "redis": { "host": "***REMOVED SENSITIVE VALUE***", "port": 0, "dbindex": 1, "timeout": 30, "password": "***REMOVED SENSITIVE VALUE***" }, "mail_from_address": "***REMOVED SENSITIVE VALUE***", "mail_smtpmode": "smtp", "mail_sendmailmode": "smtp", "mail_smtpsecure": "ssl", "mail_domain": "***REMOVED SENSITIVE VALUE***", "mail_smtpauthtype": "LOGIN", "mail_smtpauth": 1, "mail_smtphost": "***REMOVED SENSITIVE VALUE***", "mail_smtpport": "465", "mail_smtpname": "***REMOVED SENSITIVE VALUE***", "mail_smtppassword": "***REMOVED SENSITIVE VALUE***", "sentry.dsn": "***REMOVED SENSITIVE VALUE***", "sentry.public-dsn": "***REMOVED SENSITIVE VALUE***", "updater.release.channel": "production", "maintenance": false, "theme": "", "loglevel": 2 } } ```

Are you using external storage, if yes which one: No

Are you using encryption: yes

Are you using an external user-backend, if yes which one: No

Client configuration

Browser: N/A

Operating system: N/A

Logs

Web server error log

Web server error log FPM: ``` [06-Apr-2019 16:59:11] NOTICE: exiting, bye-bye! [06-Apr-2019 16:59:17] NOTICE: fpm is running, pid 18350 [06-Apr-2019 16:59:17] NOTICE: ready to handle connections [06-Apr-2019 16:59:17] NOTICE: systemd monitor interval set to 10000ms [06-Apr-2019 17:10:57] NOTICE: Terminating ... [06-Apr-2019 17:10:57] NOTICE: exiting, bye-bye! [06-Apr-2019 17:10:58] NOTICE: fpm is running, pid 22862 [06-Apr-2019 17:10:58] NOTICE: ready to handle connections [06-Apr-2019 17:10:58] NOTICE: systemd monitor interval set to 10000ms [06-Apr-2019 17:27:48] NOTICE: Terminating ... [06-Apr-2019 17:27:48] NOTICE: exiting, bye-bye! [06-Apr-2019 17:27:48] NOTICE: fpm is running, pid 29565 [06-Apr-2019 17:27:48] NOTICE: ready to handle connections [06-Apr-2019 17:27:48] NOTICE: systemd monitor interval set to 10000ms [06-Apr-2019 17:28:45] NOTICE: Terminating ... [06-Apr-2019 17:28:45] NOTICE: exiting, bye-bye! [06-Apr-2019 17:28:45] NOTICE: fpm is running, pid 29885 [06-Apr-2019 17:28:46] NOTICE: ready to handle connections [06-Apr-2019 17:28:46] NOTICE: systemd monitor interval set to 10000ms [06-Apr-2019 17:29:18] NOTICE: Terminating ... [06-Apr-2019 17:29:18] NOTICE: exiting, bye-bye! [06-Apr-2019 17:29:18] NOTICE: fpm is running, pid 30103 [06-Apr-2019 17:29:19] NOTICE: ready to handle connections [06-Apr-2019 17:29:19] NOTICE: systemd monitor interval set to 10000ms [07-Apr-2019 13:15:47] NOTICE: [pool www] child 30128 exited with code 0 after 35194.500522 seconds from start [07-Apr-2019 13:15:47] NOTICE: [pool www] child 19166 started [08-Apr-2019 10:34:12] NOTICE: [pool www] child 30140 exited with code 0 after 81310.933891 seconds from start [08-Apr-2019 10:34:12] NOTICE: [pool www] child 8065 started ``` Apache: _Nothing in that timeframe_

Nextcloud log (data/nextcloud.log)

Nextcloud log [next.log](https://github.com/nextcloud/server/files/3054478/next.log) (GitHub issues with pasting)

Browser log

Browser log N/A

Additional data from Sentry

Stacktrace ``` Exception: Error 31 interacting with memcached : A TIMEOUT OCCURRED #17 /var/www/nextcloud/lib/private/Memcache/Memcached.php(225): verifyReturnCode #16 /var/www/nextcloud/lib/private/Memcache/Memcached.php(117): set #15 /var/www/nextcloud/lib/private/Template/SCSSCacher.php(312): cache #14 /var/www/nextcloud/lib/private/Template/SCSSCacher.php(153): process #13 /var/www/nextcloud/lib/private/Template/CSSResourceLocator.php(109): cacheAndAppendScssIfExist #12 /var/www/nextcloud/lib/private/Template/CSSResourceLocator.php(84): doFind #11 /var/www/nextcloud/lib/private/Template/ResourceLocator.php(78): find #10 /var/www/nextcloud/lib/private/TemplateLayout.php(292): findStylesheetFiles #9 /var/www/nextcloud/lib/private/TemplateLayout.php(201): __construct #8 /var/www/nextcloud/lib/private/legacy/template.php(211): fetchPage #7 /var/www/nextcloud/lib/public/AppFramework/Http/TemplateResponse.php(157): render #6 /var/www/nextcloud/lib/private/AppFramework/Http/Dispatcher.php(119): dispatch #5 /var/www/nextcloud/lib/private/AppFramework/App.php(118): main #4 /var/www/nextcloud/lib/private/AppFramework/Routing/RouteActionHandler.php(47): __invoke #3 /var/www/nextcloud/lib/private/AppFramework/Routing/RouteActionHandler.php(0): call_user_func #2 /var/www/nextcloud/lib/private/Route/Router.php(297): match #1 /var/www/nextcloud/lib/base.php(987): handleRequest #0 /var/www/nextcloud/index.php(42): null ```
Breadcrumbs ![Breadcrumbs](https://user-images.githubusercontent.com/26320625/55727408-3ebd1100-5a12-11e9-99ee-5ead45863320.png)

On request I can privately share the sentry issue

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity and it seems to be missing some essential informations. It will be closed if no further activity occurs. Thank you for your contributions.

cromefire commented 5 years ago

This still occurs

kesselb commented 5 years ago

Can confirm from looking at the code. According to the interface https://github.com/nextcloud/server/blob/ba87db3fccb40aa58d84d12932424c83a4db411f/lib/public/ICache.php#L50-L58

set should return true or false but the memcache adapter may return an exception if something went wrong. Redis adapter may throw a RedisException if the host is not reachable.

For your case https://github.com/nextcloud/server/blob/e8938df1986c4b52fc1f5c9afe4cc4078ea86e4c/lib/private/Template/SCSSCacher.php#L307-L321

we could use \Exception instead of NotPermittedException and catch any error. This would work for RedisException as well but the same problem applies for other places where ICache is used.

It might be handy to remove the nextcloud cache stuff and use something like https://symfony.com/doc/current/cache.html.

cc @MorrisJobke

MorrisJobke commented 5 years ago

It might be handy to remove the nextcloud cache stuff and use something like https://symfony.com/doc/current/cache.html.

Hard to say - we still need to maintain a stable API and then it gets tricky (see the query builder we forward from doctrine 🙈)

cc @rullzer @ChristophWurst

karagian commented 4 years ago

I have seen the same error on similar installation with memcached. The error is pretty serious and happens quite often especially when the server is under load. Searched through the sources and what I found was that the memcached php module is initialized with these values:

'memcached_options' => array(
    // Set timeouts to 50ms
    \Memcached::OPT_CONNECT_TIMEOUT => 50,
    \Memcached::OPT_RETRY_TIMEOUT =>   50,
    \Memcached::OPT_SEND_TIMEOUT =>    50,
    \Memcached::OPT_RECV_TIMEOUT =>    50,
    \Memcached::OPT_POLL_TIMEOUT =>    50,

The odd thing is that although all values seem the same, they are actually in different units: https://www.php.net/manual/en/memcached.constants.php OPT_CONNECT_TIMEOUT and OPT_RETRY_TIMEOUT are in milliseconds, while OPT_SEND_TIMEOUT and OPT_RECV_TIMEOUT are in microseconds! The value of 50 microseconds that is set by default seems extremely small for an average system, and probably this is why this is happening...

It seems that is it possible to modify these values from config.php. I will try changing them from there, see how it goes and report back, but the default timeouts are quite extreme, if not wrong, and probably the cause of this.

kesselb commented 4 years ago

OPT_CONNECT_TIMEOUT and OPT_RETRY_TIMEOUT are in milliseconds, while OPT_SEND_TIMEOUT and OPT_RECV_TIMEOUT are in microseconds!

:worried: It's even more worse. OPT_RETRY_TIMEOUT is in seconds ;)

phillwuk commented 4 years ago

Hi, I'm still seeing this error. Did you get anywhere with altering the default settings?

solracsf commented 2 years ago

Still an issue, here upgrade from 22.2.9 to 22.2.10 on the CLI:

Update of code successful.

Should the "occ upgrade" command be executed? [Y/n] y
An unhandled exception has been thrown:
Exception: Error 31 interacting with memcached : A TIMEOUT OCCURRED in /lib/private/Memcache/Memcached.php:207
Stack trace:
#0 /lib/private/Memcache/Memcached.php(118): OC\Memcache\Memcached->verifyReturnCode()
#1 /lib/private/App/InfoParser.php(218): OC\Memcache\Memcached->set()
#2 /lib/private/App/AppManager.php(506): OC\App\InfoParser->parse()
#3 /lib/private/legacy/OC_App.php(593): OC\App\AppManager->getAppInfo()
#4 /lib/private/AppFramework/App.php(69): OC_App::getAppInfo()
#5 /lib/private/legacy/OC_App.php(278): OC\AppFramework\App::buildAppNamespace()
#6 /lib/private/AppFramework/Bootstrap/Coordinator.php(108): OC_App::registerAutoloading()
#7 /lib/private/AppFramework/Bootstrap/Coordinator.php(82): OC\AppFramework\Bootstrap\Coordinator->registerApps()
#8 /lib/base.php(640): OC\AppFramework\Bootstrap\Coordinator->runInitialRegistration()
#9 /lib/base.php(1091): OC::init()
#10 /console.php(48): require_once('...')
#11 /occ(11): require_once('...')
#12 {main}
szaimen commented 1 year ago

Hi, please update to 24.0.8 or better 25.0.2 and report back if it fixes the issue. Thank you!

kesselb commented 1 year ago

SCSSCacher is gone (since Nextcloud 25) :partying_face:

But the timeouts for Memcache are probably still outdated.