nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0

[Bug]: Opening a document on Android crashes the server (CODE-related?) #44821

Open HammyHavoc opened 5 months ago

HammyHavoc commented 5 months ago

Bug description

Opening a document on the Android Nextcloud app crashes the server, and so I have to restart the Docker container. I'm wondering if it has something to do with CODE, but Collabora itself doesn't need restarting.

Steps to reproduce

  1. Open a document on the Android Nextcloud app

Expected behavior

That it doesn't crash the Nextcloud server.

Installation method

Community Docker image

Nextcloud Server version

27

Operating system

Other

PHP engine version

Other

Web server

Nginx

Database engine version

MariaDB

Is this bug present after an update or on a fresh install?

Updated from a MINOR version (ex. 22.1 to 22.2)

Are you using the Nextcloud Server Encryption module?

Encryption is Disabled

What user-backends are you using?

Configuration report

No response

List of activated Apps

No response

Nextcloud Signing status

No response

Nextcloud Logs

No response

Additional info

No response

joshtrichards commented 5 months ago

Your report lacks enough information to be actionable.

What precisely do you mean by "crashes the server"?

Also:

Log entries?

HammyHavoc commented 5 months ago

Yes, I realize this; it comes down to time and practicality. I don't particularly want to bring down the production server on a weekday by crashing it intentionally. Logs will follow when I do it again; I can't keep running parity checks mid-week because they kill performance and cause downtime. That said, there was nothing illuminating in the log, as the container stops responding entirely without any kind of error.

It causes the Nextcloud container to stop responding entirely. Even attempting to kill the container process doesn't work, and the only way to get it to stop and start again is to reboot the Unraid server itself, which results in a "dirty" shutdown and a parity check because the process was not stopped prior to the restart.

Running NC 28.0.4.

Enabled:
 - activity: 2.20.0
 - admin_audit: 1.18.0
 - analytics: 4.12.0
 - app_api: 2.4.0
 - assistant: 1.0.8
 - bookmarks: 13.1.3
 - bruteforcesettings: 2.8.0
 - circles: 28.0.0
 - cloud_federation_api: 1.11.0
 - collectives: 2.10.1
 - comments: 1.18.0
 - contacts: 5.5.3
 - contactsinteraction: 1.9.0
 - context_chat: 2.1.0
 - cookbook: 0.11.0
 - dashboard: 7.8.0
 - dav: 1.29.1
 - deck: 1.12.2
 - event_update_notification: 2.3.0
 - external: 5.3.1
 - federatedfilesharing: 1.18.0
 - federation: 1.18.0
 - files: 2.0.0
 - files_automatedtagging: 1.18.0
 - files_external: 1.20.0
 - files_pdfviewer: 2.9.0
 - files_reminders: 1.1.0
 - files_sharing: 1.20.0
 - files_trashbin: 1.18.0
 - files_versions: 1.21.0
 - firstrunwizard: 2.17.0
 - group_everyone: 0.1.15
 - groupfolders: 16.0.6
 - guests: 3.0.1
 - impersonate: 1.15.0
 - integration_dropbox: 2.2.0
 - integration_github: 2.0.7
 - integration_openproject: 2.6.2
 - logreader: 2.13.0
 - lookup_server_connector: 1.16.0
 - nextcloud_announcements: 1.17.0
 - notes: 4.9.4
 - notifications: 2.16.0
 - oauth2: 1.16.3
 - password_policy: 1.18.0
 - photos: 2.4.0
 - previewgenerator: 5.5.0
 - privacy: 1.12.0
 - provisioning_api: 1.18.0
 - quota_warning: 1.19.0
 - recommendations: 2.0.0
 - related_resources: 1.3.0
 - richdocuments: 8.3.4
 - serverinfo: 1.18.0
 - settings: 1.10.1
 - sharebymail: 1.18.0
 - socialsharing_email: 3.1.0
 - socialsharing_facebook: 3.1.0
 - socialsharing_twitter: 3.1.0
 - support: 1.11.1
 - survey_client: 1.16.0
 - suspicious_login: 6.0.0
 - systemtags: 1.18.0
 - tasks: 0.15.0
 - text: 3.9.1
 - theming: 2.3.0
 - twofactor_backupcodes: 1.17.0
 - twofactor_nextcloud_notification: 3.9.0
 - twofactor_totp: 10.0.0-beta.2
 - twofactor_webauthn: 1.4.0
 - user_ldap: 1.19.0
 - user_status: 1.8.1
 - viewer: 2.2.0
 - workflowengine: 2.10.0
Disabled:
 - apporder: 0.15.0
 - encryption: 2.9.0
 - extract: 1.3.6
 - files_3d: 0.5.0
 - files_downloadactivity: 1.16.0
 - files_retention: 1.17.1
 - files_rightclick: 1.6.0
 - files_trackdownloads: 1.11.0
 - flow_notifications: 1.8.0
 - forms: 4.1.1
 - integration_onedrive: 3.2.0
 - integration_reddit: 2.0.3
 - integration_twitter: 1.0.7
 - mail: 3.5.7
 - maps: 1.3.1
 - mediadc: 0.3.8
 - memories: 7.2.0
 - metadata: 0.19.0
 - news: 24.0.0
 - phonetrack: 0.7.7
 - sharingpath: 0.4.4
 - talk_matterbridge: 1.26.0
 - updatenotification: 1.17.0
 - weather_status: 1.1.0
 - workflow_ocr: 1.28.0
 - workflow_pdf_converter: 1.13.0
 - workflow_script: 1.13.0

joshtrichards commented 5 months ago

> Logs will follow when I do it again, as I can't keep running parity checks mid-week as it kills performance and causes downtime, though there was nothing illuminating in the log as it stops responding entirely without any kind of error.

> It causes the Nextcloud container to stop responding entirely, even attempting to kill the container process doesn't work, and the only way to get it to stop and start again is by rebooting the Unraid server itself, which gives a "dirty" off and parity check due to the process not being stopped prior to restart.

I guess it could be something in the micro-services community Docker image (if that's the one you're using; that's unclear). Or maybe the Nextcloud Office integration app (richdocuments). I use both and haven't noticed anything like this, nor have I seen similar reports come in.

But this really sounds like something is not right with your Docker environment. Maybe check your Docker Engine version (docker version and https://docs.docker.com/engine/release-notes/26.0/ for anything relevant).
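For anyone comparing Engine versions across affected hosts, the check above can be scripted. This is only a sketch: it assumes the `docker` CLI is on the PATH and the daemon is reachable, and the helper names `parse_version` and `docker_server_version` are made up for illustration.

```python
import re
import subprocess


def parse_version(text):
    """Turn a version string such as '25.0.2' into a tuple of ints."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def docker_server_version():
    """Ask the local Docker daemon for its Engine version."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Server.Version}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_version(result.stdout)


if __name__ == "__main__":
    try:
        print(docker_server_version())
    except (OSError, subprocess.CalledProcessError) as exc:
        # docker is not installed, or the daemon did not answer.
        print(f"could not query Docker: {exc}")
```

Because the result is a tuple, it can be compared directly, for example `docker_server_version() < (26, 0, 0)`.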

> I'm wondering if it has something to do with CODE, but Collabora itself doesn't need restarting.

What's the load like when this happens? Disk I/O? RAM usage? Does the container/host seem to be doing anything at all? Your host system logs (journalctl and/or /var/log/*) may have some clues if you're unable to even kill the processes.
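One more thing that may be worth checking on the host while the container is hung is a rough sketch like the one below. It assumes a Linux host and that the unkillable processes are stuck in uninterruptible sleep (state `D`), which is a common reason a kill signal has no effect. Nothing in this thread confirms that is what is happening here. The function names are illustrative only.

```python
import os


def parse_stat(stat_text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' rather than on spaces.
    """
    left, _, right = stat_text.rpartition(")")
    pid_str, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_str), comm, state


def find_uninterruptible(proc_root="/proc"):
    """Return (pid, comm, wchan) for every process currently in state D."""
    stuck = []
    if not os.path.isdir(proc_root):
        return stuck  # not a Linux-style procfs; nothing to inspect
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            # Process exited while we were reading it, or access was denied.
            continue
        if state != "D":
            continue
        # wchan names the kernel function the task is sleeping in, if exposed.
        try:
            with open(os.path.join(proc_root, entry, "wchan")) as f:
                wchan = f.read().strip() or "?"
        except OSError:
            wchan = "?"
        stuck.append((pid, comm, wchan))
    return stuck


if __name__ == "__main__":
    for pid, comm, wchan in find_uninterruptible():
        print(f"{pid}\t{comm}\t{wchan}")
```

If php-fpm or another container process shows up in this list, the `wchan` column hints at which kernel path it is waiting in (for example a filesystem or network mount), which may be more useful than the Nextcloud log in this situation.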

> Opening a document on the Android Nextcloud app crashes the server

So it's specific to Android? It doesn't happen when you open a document within Nextcloud Server via the Web UI?

I suggest posting over at the Community Help Forum - https://help.nextcloud.com. If this is happening people would/will be pretty vocal about it (understandably). I haven't seen anything like that off-hand, but maybe it'll turn something up.

HammyHavoc commented 5 months ago

It also appears to happen to people on iOS: https://forums.unraid.net/topic/90003-cant-stop-docker-container-nextcloud-or-reboot-the-whole-system/page/2/?_fromLogin=1

Unraid uses Docker engine 25.0.2 at present. Nothing else seems to give any issues, but it's consistently reproducible. I just intentionally crashed it after midnight, and yes, same problem, but nothing in the logs even on the host to indicate there's a problem. Nothing abnormal going on in terms of resource consumption, and other containers are uninterrupted. Even if I shut down every other container, it still won't let me stop the Nextcloud container, and I can't kill the process manually. Again, nothing in the logs for Nextcloud, nor Unraid itself, things carry on ticking as you'd expect elsewhere, except Nextcloud can't be stopped or killed. System isn't under much stress (averaging ~11% CPU use, and using 4GB out of 32GB in terms of RAM, 2TB of storage free, cache pool has 500GB free).

Once crashed, the Nextcloud container itself sits at 0% CPU and 387.4MB of RAM used. Those figures don't change, and it quite literally ceases to function in any discernible capacity.

I've posted about it on the forum over here: https://help.nextcloud.com/t/is-accessing-files-via-the-mobile-app-causing-your-docker-container-to-crash/188221

It hasn't always been a problem, only in the past few months. I was considering spinning up the config on a box running Portainer to see if I can replicate the issue, but it won't be tonight, as it's almost 1am and I still need to stop all the other containers, reboot the server, restart all the backup routines I've interrupted, etc.

Tresillo2017 commented 5 months ago

Happens to me on Nextcloud AIO and Nextcloud standalone. I can't restart the container and need to restart the server, which takes a long time because it waits for the php-fpm process to be killed.

nextcloud-command commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions.

joshtrichards commented 3 months ago

@Tresillo2017 Are you also running those containers on Unraid like the original reporter?

Tresillo2017 commented 3 months ago

> @Tresillo2017 Are you also running those containers on Unraid like the original reporter?

No, it's on Ubuntu Server.