nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0

[Bug]: Expected filesize of X bytes but read (from Nextcloud client) and wrote (to Nextcloud storage) 0 bytes #37695

Closed mgutt closed 1 year ago

mgutt commented 1 year ago

Bug description

Some of my users randomly see their uploads interrupted. I eventually found out that the user's browser sometimes does not respect the chunk size (10 MiB by default), so some or all uploads (if there are several) hit Apache's LimitRequestBody limit, which seems to default to 1024 MB in the Nextcloud Docker container.

I found some other users complaining about this issue and solution: https://help.nextcloud.com/t/experiencing-unwanted-1gb-upload-hard-limit-unable-to-find-the-culprit/157243/10

Steps to reproduce

Can't be reliably reproduced, as it happens randomly.

Expected behavior

Upload should work in any case.

Installation method

Community Docker image

Nextcloud Server version

25

Operating system

Other

PHP engine version

None

Web server

Apache (supported)

Database engine version

MariaDB

Is this bug present after an update or on a fresh install?

None

Are you using the Nextcloud Server Encryption module?

None

What user-backends are you using?

Configuration report

No response

List of activated Apps

irrelevant

Nextcloud Signing status

No response

Nextcloud Logs

{"reqId":"5vjNDqDq1wv3mMHm22pv","level":3,"time":"2023-04-11T19:11:38+00:00","remoteAddr":"xx.xx.xx.xx","user":"xxx","app":"no app in context","method":"PUT","url":"/remote.php/webdav/xxx/xxx","message":"Erwartete Dateigr\u00f6\u00dfe von 2906848234 bytes, aber 0 bytes gelesen (vom Nextcloud-Client) und geschrieben (in den Nextcloud-Speicher). Dies kann entweder ein Netzwerkproblem auf der sendenden Seite oder ein Problem beim Schreiben in den Speicher auf der Serverseite sein.","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.4.1 Safari/605.1.15","version":"25.0.5.1","exception":{"Exception":"Sabre\\DAV\\Exception\\BadRequest","Message":"Erwartete Dateigr\u00f6\u00dfe von 2906848234 bytes, aber 0 bytes gelesen (vom Nextcloud-Client) und geschrieben (in den Nextcloud-Speicher). Dies kann entweder ein Netzwerkproblem auf der sendenden Seite oder ein Problem beim Schreiben in den Speicher auf der Serverseite sein.","Code":0,"Trace":[{"file":"/var/www/html/apps/dav/lib/Connector/Sabre/Directory.php","line":151,"function":"put","class":"OCA\\DAV\\Connector\\Sabre\\File","type":"->","args":[null]},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/Server.php","line":1098,"function":"createFile","class":"OCA\\DAV\\Connector\\Sabre\\Directory","type":"->","args":["s1e06.mkv",null]},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/CorePlugin.php","line":504,"function":"createFile","class":"Sabre\\DAV\\Server","type":"->","args":["TV/Batman 
(1966)/s1e06.mkv",null,null]},{"file":"/var/www/html/3rdparty/sabre/event/lib/WildcardEmitterTrait.php","line":89,"function":"httpPut","class":"Sabre\\DAV\\CorePlugin","type":"->","args":[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/Server.php","line":472,"function":"emit","class":"Sabre\\DAV\\Server","type":"->","args":["method:PUT",[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]]},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/Server.php","line":253,"function":"invokeMethod","class":"Sabre\\DAV\\Server","type":"->","args":[["Sabre\\HTTP\\Request"],["Sabre\\HTTP\\Response"]]},{"file":"/var/www/html/3rdparty/sabre/dav/lib/DAV/Server.php","line":321,"function":"start","class":"Sabre\\DAV\\Server","type":"->","args":[]},{"file":"/var/www/html/apps/dav/appinfo/v1/webdav.php","line":85,"function":"exec","class":"Sabre\\DAV\\Server","type":"->","args":[]},{"file":"/var/www/html/remote.php","line":171,"args":["/var/www/html/apps/dav/appinfo/v1/webdav.php"],"function":"require_once"}],"File":"/var/www/html/apps/dav/lib/Connector/Sabre/File.php","Line":297,"message":"Erwartete Dateigr\u00f6\u00dfe von 2906848234 bytes, aber 0 bytes gelesen (vom Nextcloud-Client) und geschrieben (in den Nextcloud-Speicher). Dies kann entweder ein Netzwerkproblem auf der sendenden Seite oder ein Problem beim Schreiben in den Speicher auf der Serverseite sein.","exception":{},"CustomMessage":"Erwartete Dateigr\u00f6\u00dfe von 2906848234 bytes, aber 0 bytes gelesen (vom Nextcloud-Client) und geschrieben (in den Nextcloud-Speicher). Dies kann entweder ein Netzwerkproblem auf der sendenden Seite oder ein Problem beim Schreiben in den Speicher auf der Serverseite sein."}}

Additional info

No response

szaimen commented 1 year ago

Hi, see https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/big_file_upload_configuration.html

mgutt commented 1 year ago

I know that article, but it does not explain LimitRequestBody at all: neither why it could be relevant nor how to change it. To be exact, it cannot be changed permanently, because every container upgrade overwrites the .htaccess file (that's why I use a cron job that checks for the setting and adds it if it's missing).

In addition, the article describes solutions that shouldn't be needed at all, since Nextcloud's chunking bypasses all those limits. With the default settings (10 MiB chunks, 512 MB PHP_UPLOAD_LIMIT/PHP_MEMORY_LIMIT), 99% of my users had no problem uploading huge files to my servers. Only a tiny number of them had the problem that their browser randomly failed to chunk the files, so those uploads hit the PHP, Apache and proxy server limits. So we are trying to create a workaround for random situations, which I think is OK, since in the end the user experience is the most important thing.

I think we have two ways to solve this:
A) Add LimitRequestBody 0 as the new default
B) Use the container's PHP_UPLOAD_LIMIT value and set LimitRequestBody accordingly (in bytes, of course)
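For option B, the size conversion could look roughly like the sketch below. It assumes PHP_UPLOAD_LIMIT uses PHP's shorthand notation (a number with an optional K, M or G suffix). `to_bytes` is a hypothetical helper name and not part of the Docker image. Apache documents an accepted range for LimitRequestBody, so check it before using very large values.

```shell
# Hypothetical helper: convert a PHP-style size such as "512M" or "16G" to bytes.
to_bytes() {
  value=$1
  number=${value%[KkMmGg]}   # strip one trailing unit letter, if present
  unit=${value#"$number"}    # the stripped unit letter (may be empty)
  case $number in
    ''|*[!0-9]*) echo "unsupported size: $value" >&2; return 1 ;;
  esac
  case $unit in
    [Kk]) echo $(( $number * 1024 )) ;;
    [Mm]) echo $(( $number * 1024 * 1024 )) ;;
    [Gg]) echo $(( $number * 1024 * 1024 * 1024 )) ;;
    *)    echo "$number" ;;  # no suffix: already in bytes
  esac
}

# Print the directive for the current limit; 512M is the default mentioned above.
echo "LimitRequestBody $(to_bytes "${PHP_UPLOAD_LIMIT:-512M}")"
```

With PHP_UPLOAD_LIMIT unset, this prints the directive for 512M. Where and when the line gets written into the Apache configuration is left open here.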

Reinitialized commented 1 year ago

Hey mgutt/time travelers in search of answers,

It looks like this is an issue of undocumented functionality when it comes to the Docker image, probably because it is community maintained. If you are using Docker Compose (I sure hope so if you're using Docker/Docker Swarm Mode!), you can tweak these settings by setting them as environment variables like so:

environment:
  - PHP_UPLOAD_LIMIT=16G
  - PHP_MEMORY_LIMIT=16G
  - POST_MAX_SIZE=16G
  - MAX_INPUT_TIME=3600
  - MAX_EXECUTION_TIME=3600

Using the available documentation, this is what I set mine to, and the issue has so far gone away, with fast and solid uploads. It should be noted that the first two variables are not documented at all and were found through Reddit. I am going to check whether there have been any attempts to get the documentation to provide this information, and if not, make an edit to note it.

mgutt commented 1 year ago

Regarding "you can tweak these settings by setting them as environment variables like so":

I already use PHP_UPLOAD_LIMIT, as mentioned in my last comment, but it does not set LimitRequestBody. That's why I'm using a script on my host that appends the setting to Nextcloud's .htaccess file:

# Append custom Apache settings to Nextcloud's .htaccess if they are missing
htaccess_file="/mnt/cache/appdata/nextcloud/html/.htaccess"
if [[ -f "$htaccess_file" ]]; then
  # grep -q prints nothing; only its exit status is used
  if ! grep -q "LimitRequestBody" "$htaccess_file"; then
    printf '\n# Custom Apache Settings\nLimitRequestBody 0\n\n' >> "$htaccess_file"
    echo ".htaccess updated."
  else
    echo ".htaccess already up-to-date."
  fi
else
  # Report the problem on stderr so cron can mail it
  echo "Error: .htaccess not found." >&2
fi

PS: You don't need to set PHP_MEMORY_LIMIT to 16G. It has nothing to do with uploads, although a bigger value like 2G helps when creating thumbnails of huge photo files.

RphCos commented 1 year ago

Thank you for the solution about editing the .htaccess file.

I can confirm it solved my problem on NC 26.0.2 with PHP 8.0. As in your case, all the steps from https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/big_file_upload_configuration.html had already been done, since my installation is not fresh at all, and the problem appeared suddenly.

If someone stumbles upon this issue, you can check your "big uploads settings" in the System view of the Administration panel, in the PHP section at the bottom of the page. If everything is set there, then it's the .htaccess.

I have to mention that I am NOT using the Nextcloud Docker image, but a classic installation.

Again thank you.

I must add that after this issue I ran into another one, where Nextcloud deems the size inconsistent, and to this day I'm still looking for a fix.

bugsyb commented 2 months ago

Just adding this to help others who run Nextcloud behind Traefik v3, as I've spent a good couple of nights on it over a number of months. The weirdest part was that it somehow kept working well with Apache but not with nginx and php-fpm. At that time I had the storage separated from the app, and the Apache version was running on Arm64 (a lower-powered unit) while the failing nginx/php-fpm one ran on a high-performance unit. It didn't make sense, but hey, ho, not everything makes sense in this world. On top of that, I'd bet that a year or more earlier it all just worked fine. In the meantime I had gone through a number of component upgrades, so it was hard to pin it down to a specific one. And I was running out of time due to day-job requirements.

This pointed to an issue with this combo, so I kept searching around. I ended up decreasing the chunk size to an unreasonably small value like 20MB or so. In the meantime I moved everything to an even stronger unit and was happy with nginx/php-fpm, only to be hit by it again one day with a 12MB file that failed anywhere between 4 and 7MB. This was way beyond anything acceptable. It happened while I was traveling and really needed sync to work, just in case.

After the long intro... here comes the root cause.

With a fresh head I started looking at it again and found that the maximum upload size varied significantly. Looking at the traffic, the upload was slow, which pointed to the time an upload takes. All settings on the nginx/php-fpm/Nextcloud side were maxed out (16GB upload size, 4GB memory and timeouts counted in days).

That meant the problem was somewhere else. Everything was apparently set so that this should not happen, including buffering being disabled on Traefik, so I went back to reviewing the basic settings and logs, this time on Traefik again. With lower traffic more was visible, and bang... entries like this:

 "PUT /subfolder/remote.php/dav/uploads/luser/1036068400/00001 HTTP/1.1" 499 21 "-" "-" 215 "nextcloud-rtr@docker" "http://1.2.3.4:80" 10000ms

We're home baby...
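To check a proxy log for the same pattern, a filter along these lines may help. It is a minimal sketch that assumes the access log uses the format of the entry quoted above and that aborted uploads show up with status 499 as they do there. `find_aborted_uploads` is a hypothetical helper name.

```shell
# Hypothetical helper: print chunked-upload PUT requests that ended with status 499.
# Reads access-log lines from stdin (format assumed to match the entry above).
find_aborted_uploads() {
  grep -E '"PUT [^"]*/remote\.php/dav/uploads/[^"]* HTTP/[0-9.]+" 499 '
}

# Example using the entry quoted above; in practice, pipe in your own log file.
printf '%s\n' '"PUT /subfolder/remote.php/dav/uploads/luser/1036068400/00001 HTTP/1.1" 499 21 "-" "-" 215 "nextcloud-rtr@docker" "http://1.2.3.4:80" 10000ms' | find_aborted_uploads
```

If many such lines share a similar duration, a timeout somewhere in the proxy chain is a plausible suspect.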

From there it took just another 60s to find out that Traefik v3 introduced a new default: readTimeout=60s (previously 0, i.e. disabled).

One needs to set it either at instance level or globally, e.g.:

# disable readTimeout - bring back old defaults
--entryPoints.web.transport.respondingTimeouts.readTimeout=0s

There are also idleTimeout (180s by default) and writeTimeout (0 by default).

The key setting for uploads in this case, though, is readTimeout.

Good luck, and I hope I've saved you some valuable time.

marco-calautti commented 2 months ago

--entryPoints.web.transport.respondingTimeouts.readTimeout=0s

This is indeed the solution. I was completely unable to upload large files using Firefox on my Nextcloud instance behind Traefik. Disabling the readTimeout made everything work as expected!

Thanks a lot!

ayoahha commented 5 days ago

Hi @bugsyb, thanks for this (possible) solution!

I am just wondering whether, instead of adding a label, we can put the same directive in a middleware or another configuration file. If I apply this label to my Traefik, it will apply to ALL requests for ALL apps (not only Nextcloud), which I do not really want. Do you have any examples?