plone / Products.CMFPlone

The core of the Plone content management system
https://plone.org
GNU General Public License v2.0
254 stars 190 forks source link

Unable to upload files larger than 1mb after Plone 6.0.7 upgrade #3848

Open cekk opened 1 year ago

cekk commented 1 year ago

BUG/PROBLEM REPORT (OR OTHER COMMON ISSUE)

This is probably related to Zope 5.8.4 issue #1141 and pull request #1142.

What I did:

Update buildout to use new versions of Plone/Zope (Plone 6.0.7) Tested also with latest buildout.coredev

What I expect to happen:

Upload a file larger than 1mb in a File content-type

What actually happened:

Upload is blocked because Zope has now a (1mb limit)

From the commits and documentation i see that it should be configurable, but i can't do it. Anyone has a working example?

1mb is probably very low for regular pdf files that users usually upload on a website.

What version of Plone/ Addons I am using:

Zope 5.8.5 Plone 6.0.7

mauritsvanrees commented 1 year ago

I see this too, with Plone 6.0.7 on a ClassicUI site with plone.restapi installed. I upload an image via a form, and get this:

2023-09-26 14:12:35,673 ERROR   [Zope.SiteErrorLog:35][waitress-0] BadRequest: http://localhost:8080/Plone3/++add++Image
Traceback (innermost last):
  Module ZPublisher.WSGIPublisher, line 181, in transaction_pubevents
  Module ZPublisher.WSGIPublisher, line 391, in publish_module
  Module ZPublisher.WSGIPublisher, line 269, in publish
  Module ZPublisher.BaseRequest, line 638, in traverse
  Module Products.PluggableAuthService.PluggableAuthService, line 244, in validate
  Module Products.PluggableAuthService.PluggableAuthService, line 560, in _extractUserIds
  Module plone.restapi.pas.plugin, line 100, in extractCredentials
  Module plone.restapi.deserializer, line 8, in json_body
  Module ZPublisher.HTTPRequest, line 1058, in get
  Module ZPublisher.HTTPRequest, line 1370, in __get__
zExceptions.BadRequest: data exceeds memory limit

Actually, even without plone.restapi installed (only available in the eggs), I see the same traceback. This is strange, because the jwt_auth PAS plugin is not even in acl_users. It should not be called then. But that is a separate issue.

Adding <dos_protection> in parts/instance/etc/zope.conf gives a startup error: "Error: unknown type name: 'dos_protection'". Ah, I will try the PR from @mamico: https://github.com/plone/plone.recipe.zope2instance/pull/191

mauritsvanrees commented 1 year ago

Yes, that works.

I have released the fix from @mamico in plone.recipe.zope2instance 6.12.2. (I should have called that 6.13.0 really. Oh well.)

Add something like this in your instance/zeoclient recipe config:

zope-conf-additional =
  <dos_protection>
    form-memory-limit 4MB
  </dos_protection>

(I think I will go for 10MB for most of my customers.)

Thanks!

yurj commented 1 year ago

Yes, that works.

I have released the fix from @mamico in plone.recipe.zope2instance 6.12.2. (I should have called that 6.13.0 really. Oh well.)

Add something like this in your instance/zeoclient recipe config:

zope-conf-additional =
  <dos_protection>
    form-memory-limit 4MB
  </dos_protection>

(I think I will go for 10MB for most of my customers.)

Thanks!

but this works only with buildout. What about cookiecutter?

mamico commented 1 year ago

(I think I will go for 10MB for most of my customers.)

@mauritsvanrees don't you think we should put a default for Plone equal or larger than 10MB?

PDFs and images are normally larger, I think many people might run into this problem after updating, don't you?

gotcha commented 1 year ago

Well, do use buildout ! :stuck_out_tongue_winking_eye:

I love it :wink:

MrTango commented 1 year ago

but this works only with buildout. What about cookiecutter?

@yurj the config is written into the zope.conf file, which you also have in a buildoutless setup like teh cookiecutter templates.

yurj commented 1 year ago

I mean the automated installation via cookiecutter should add this conf additional snippet too . If you upgrade the docker image, you cannot add this config (easly) and maybe downgrade your database is not an option.

https://6.docs.plone.org/install/manage-add-ons-packages.html

"For the backend, it uses cookiecutter-zope-instance to generate configuration files for a Zope WSGI instance."

mauritsvanrees commented 1 year ago

@mauritsvanrees don't you think we should put a default for Plone equal or larger than 10MB?

PDFs and images are normally larger, I think many people might run into this problem after updating, don't you?

Yes. Where would we put a default. Would that be in the recipe? Or do an override in CMFPlone?

I really don't have time for community work this week. The last few weeks have been too crazy for that. I will add the current solution to a few customers, as I saw Sentry errors coming in.

wesleybl commented 1 year ago

Will I always have to update this setting if I want higher limits?

mauritsvanrees commented 1 year ago

Will I always have to update this setting if I want higher limits?

Currently, yes.

It seems to work fine without this though as long as you do not have plone.restapi in the eggs (not an option for Volto of course). See also https://github.com/plone/plone.restapi/issues/1711

We could turn this into an option in plone.recipe.zope2instance, so that if you for example set zope-form-memory-limit = 4MB that the recipe automatically adds the necessary lines. And 4 MB could be the default (or whatever number we pick). And if you specify the option but leave it empty, the recipe would not add the dos_protection lines, so you would use the Zope default.

And the cookiecutter starter template could include the lines.

davisagli commented 1 year ago

To be honest, I'm surprised that a memory limit is needed for file uploads. It used to be the case that file uploads were streamed directly into a temporary file on disk, so even a large file would not use a lot of memory. But, perhaps that was true with ZServer and not waitress (I haven't checked). And, there could still be a DoS issue from filling up the disk with file uploads, so a limit probably still makes sense, whether or not the name mentions memory.

mauritsvanrees commented 1 year ago

There is a memory limit and a separate, much larger, disk limit. See these lines in the Zope PR that introduced this.

mauritsvanrees commented 1 year ago

Also note that in ClassicUI without plone.restapi in the eggs, I do not see the problem: I can upload a large image without problems. So restapi reads the request in a different way than what happens in ClassicUI, and it hits this part of the Zope code.

davisagli commented 1 year ago

Oh, that makes sense that there is a difference -- in the classic UI file uploads are encoded as multipart/form-data, but in the REST API they are base64-encoded inside the JSON request body.

sneridagh commented 1 year ago

Anybody taking care of the cookiecutter fix? I guess that it's more than a fix, since this is both a new and breaking option that we have to take care there.

In the meanwhile, we will revert to use 6.0.6.

yurj commented 1 year ago

There is a memory limit and a separate, much larger, disk limit. See these lines in the Zope PR that introduced this.

Looking at the code, there's no direct way to disable this feature. Reading the code, setting them to -1 should be the same as disable the feature. So Plone could ship with -1, and let people (docker, buildout, cookiecutter) decide if enable/set it.

tisto commented 1 year ago

It makes lots of sense to me to allow an upload limit but not set it by default. You might want to set this on project level when an instance is used in production. However, I wouldn't want such a limit on a local dev instance or internal instances.

jensens commented 1 year ago

I can do this for cookiecutter-zope-instance.

I created an issue https://github.com/plone/cookiecutter-zope-instance/issues/11

davisagli commented 1 year ago

It makes lots of sense to me to allow an upload limit but not set it by default. You might want to set this on project level when an instance is used in production.

@tisto I disagree, it's a safety protection, and the settings should be safe by default. But I would set the default much higher.

However, I wouldn't want such a limit on a local dev instance or internal instances.

Then you won't discover that large files are blocked until the system is in production.

wesleybl commented 1 year ago

To be honest, I'm surprised that a memory limit is needed for file uploads. It used to be the case that file uploads were streamed directly into a temporary file on disk, so even a large file would not use a lot of memory. But, perhaps that was true with ZServer and not waitress (I haven't checked). And, there could still be a DoS issue from filling up the disk with file uploads, so a limit probably still makes sense, whether or not the name mentions memory.

@davisagli that makes sense. Does Zope now use memory to transfer the file to disk? Or is it just validation? Why does this validation only occur when plone.restapi is in the eggs?

There is a memory limit and a separate, much larger, disk limit.

Would this disk limit be just for the way Classic Plone without restopi does it?

mauritsvanrees commented 1 year ago

It would still probably be nice to set the limit higher in Plone by default. Probably somewhere in Products/CMFPlone/patches. Does anyone want to pick this up?

wesleybl commented 1 year ago

@mauritsvanrees wouldn't it be worth checking with the Zope people to increase this limit there? Is Zope file upload not affected by this issue? 1mb is not enough for any type of application.

wesleybl commented 1 year ago

I also don't know what this limit is protecting. As noted by @mauritsvanrees, if plone.restapi is not in the eggs, the upload works. I'm not sure this limit would block uploading via plone.restapi. The error occurs when trying to read the authentication credentials from the request body. We haven't gotten to the part about saving the file.

davisagli commented 1 year ago

Keep in mind that the file is serialized as part of the request body which is being read by the plugin. But, it doesn't make any sense to me that the plugin is trying to read the body as JSON even if it hasn't been sent with a Content-Type: application/json header.

If we add a check in plone.restapi for the request mimetype, that should avoid breaking uploads in the classic UI. The limit would still be a problem for non-TUS uploads through the REST API, which are serialized as base64 in the JSON body.

wesleybl commented 1 year ago

@davisagli I did a test by commenting on the line that gives the problem in plugin. We will actually get an error when trying to add a file through Volto:

2023-10-18 17:44:55 ERROR [Zope.SiteErrorLog:35][waitress-3] BadRequest: http://localhost:3000/POST_application_json_
Traceback (innermost last):
  Module ZPublisher.WSGIPublisher, line 181, in transaction_pubevents
  Module ZPublisher.WSGIPublisher, line 391, in publish_module
  Module ZPublisher.WSGIPublisher, line 285, in publish
  Module ZPublisher.mapply, line 98, in mapply
  Module ZPublisher.WSGIPublisher, line 68, in call_object
  Module plone.rest.service, line 22, in __call__
  Module plone.restapi.services, line 19, in render
  Module plone.restapi.services.content.add, line 34, in reply
  Module plone.restapi.deserializer, line 8, in json_body
  Module ZPublisher.HTTPRequest, line 1058, in get
  Module ZPublisher.HTTPRequest, line 1370, in __get__
zExceptions.BadRequest: data exceeds memory limit
mauritsvanrees commented 1 year ago

When I upload a larger Image in the ZMI, no exception is raised. So either the multipart parser is not used here, or in this case the higher FORM_DISK_LIMIT is used.

The limits are passed onto this class. It has the same mem and disk limits as Zope now has. Its "memfile" limit is larger, but that does not seem to come into play here (262kB vs 4 kB).

@davisagli wrote:

Keep in mind that the file is serialized as part of the request body which is being read by the plugin. But, it doesn't make any sense to me that the plugin is trying to read the body as JSON even if it hasn't been sent with a Content-Type: application/json header.

If we add a check in plone.restapi for the request mimetype, that should avoid breaking uploads in the classic UI. The limit would still be a problem for non-TUS uploads through the REST API, which are serialized as base64 in the JSON body.

That may be a good way.

yurj commented 1 year ago

https://github.com/zopefoundation/Zope/blob/29e088e04db9feb9f5d3bc6bea8950d98f8283a1/src/Zope2/Startup/wsgischema.xml#L83

the form limit parameters are configurable in wsgischema.xml

wesleybl commented 1 year ago

The limit description says:

       The maximum size for each part in a multipart post request,
       for the complete body in an urlencoded post request
       and for the complete request body when accessed as bytes
       (rather than a file).

@davisagli do you see another way for plone.restapi to send the file? Aren't we doing something wrong?

davisagli commented 1 year ago

@wesleybl Ideally I think file uploads should happen via the TUS API (https://6.docs.plone.org/plone.restapi/docs/source/endpoints/tusupload.html#creating-an-upload-url) so they're sent in a separate request that doesn't have to be base64-encoded as part of a JSON body.

But it'll take some work to get to the point where we can use this as the default in volto. For one thing, restapi's TUS implementation currently puts temporary files on disk, and it needs to store them as blobs in the ZODB so that it works across load balanced instances that don't share the same disk.

yurj commented 1 year ago

https://github.com/plone/cookiecutter-zope-instance/pull/12/commits/72c54dac37d78dfb4d7a52f390fadf4aeea73108

"Add dos_protection_* settings for feature introduced in https://github.com/zopefoundation/Zope/pull/1142"

d-maurer commented 1 year ago

Please consider "https://github.com/zopefoundation/Zope/pull/1180#issuecomment-1787198333": Plone (plone.restapi) should not load complete file contents into memory but keep them as file like objects.

Rudd-O commented 1 year ago

in the classic UI file uploads are encoded as multipart/form-data, but in the REST API they are base64-encoded inside the JSON request body.

This is a very bad regression since these JSON things needs to be loaded in RAM rather than going straight to a tmp file which then gets moved into a blob.

We're supposed to be supporting video uploads these days.

d-maurer commented 1 year ago

Rudd-O wrote at 2023-11-1 16:38 -0700:

in the classic UI file uploads are encoded as multipart/form-data, but in the REST API they are base64-encoded inside the JSON request body.

This is a very bad regression since these JSON things needs to be loaded in RAM rather than going straight to a tmp file which then gets moved into a blob.

We're supposed to be supporting video uploads these days.

I do not say that Plone should not support video uploads. I say, Plone should do it differently.

Why must these "JSON things" be loaded into RAM?

If files are passed via "multipart/form-data" requests, they arrive as file like objects at Plone, avoiding huge RAM requirements. plone.restapi could incorporate the file attachments into the the main object (but should of course not read the content), thus providing a good degree of backward compatibility. The application would only need to be aware that file content can arrive as a file like object.

The current solution effectlively prevents sites with little RAM to reliably support large uploads -- even if you configure form-memory-limit to effectively have no effect. Its current RAM requirement is at least twice the upload size (once for the file content as part of the JSON body, once as part of the deserialized object).

sneridagh commented 1 year ago

I'd like to rise this as an issue today in the next Plone's Steering Circle. Specially the part where this breaking change was not properly communicated to the Plone Release Team so we could issue the proper procedures to avoid the chaos that followed. I would also raise the issue that it was introduced in https://github.com/zopefoundation/Zope/pull/1094 with no notice in the Zope's changelog whatsoever, and released in a patch version.

mauritsvanrees commented 1 year ago

A bit of history:

So what has been done so far to combat this, and what still needs to be done?

What is not good, is that after Plone 6.0.7 until this week no one stepped up to tackle this, except for the workarounds that need everyone to edit zope.conf. There is no easy solution to that of course: everyone is busy fixing or developing other stuff that is also important.