eclipse-ee4j / jersey

Eclipse Jersey Project - Read our Wiki:
https://github.com/eclipse-ee4j/jersey/wiki
Other
682 stars 340 forks source link

[Feature request] Configuration options for multipart requests (Jersey 3.1.x) #5570

Open paulrutter opened 4 months ago

paulrutter commented 4 months ago

Following https://github.com/jakartaee/rest/issues/418#issuecomment-2009059009, this is a feature request to get some configuration options for multipart requests into Jersey (or maybe into the JAX-RS spec even where it would make sense).

Ideally there would be a way to set both generic options as well as endpoint specific options.

The endpoint specific settings could be configured via an annotation (Jersey specific or JAX-RS), while the generic options could follow the approach mentioned in https://github.com/jakartaee/rest/issues/418#issuecomment-2009123458.

The additional options would be:

If either of these limits would be exceeded, an exception should be thrown, which by default should return a HTTP 413 status code.

The annotation approach would be useful to override generic options, per endpoint. Imagine that some endpoints will have different requirements about part size or part count.

commons-fileupload offers similar options, see https://commons.apache.org/proper/commons-fileupload/using.html.

Let me know what you think @jansupol. Thanks!

paulrutter commented 4 months ago

I think all of this logic can be implemented in the application itself, by using stuff like https://commons.apache.org/proper/commons-io/javadocs/api-2.5/org/apache/commons/io/input/CountingInputStream.html, but mimepull already goes through the body once, so having this logic baked into the framework would be neater and could throw an exception early when limits are exceeded. It's also a nice security feature to have, as opposed to implementing it yourself.

I know that commons-fileupload had a CVE recently, that led to the max part count option being introduced. See https://www.cvedetails.com/cve/CVE-2023-24998/

jansupol commented 4 months ago

It makes sense. We'll try our best for 3.1.7.

jansupol commented 2 months ago

Each EntityPart can be represented by an input stream. The InputStream can be read the same way the ordinary entity. The more common approach would be the ability to set "max request body size" for any entity, not just for multipart.

At that point, the bean validation could do similar stuff, we need to check the BV possibilities with an InputStream. I am guessing it won't be possible to limit the data read from the stream, though.

Interesting requirements, I am not aware of any similar coming for the ordinary requests.

jansupol commented 2 months ago

ReaderInterceptor can likely be used to watch over the maximum data read from a single part, as well as for an ordinary entity stream, too.

paulrutter commented 2 months ago

Thanks @jansupol. Are you saying that this should be the preferred way of implementing these limits in custom code? I think having this available out of the box in Jersey would still be beneficial to developers, but maybe i'm a bit biased ;)

I agree that setting the max request body size solves quite a lot of issues already, because then there is at least a general limit in place.

jansupol commented 2 months ago

This is me thinking loud mostly, hoping someone chimes in.

But yes, with the current Jersey, it can be solvable by the ReaderInterceptor.

For the future, we can implement some interceptor that would wrap the stream into some byte counter which can throw 413. Or there could be some more generic listener with something like beforeRead(ResourceInfo resourceInfo), afterRead(ResourceInfo resourceInfo, long dataRead). That could also be used for watching progress, similar to what Apache has. This may need a bit more thought.

paulrutter commented 2 months ago

I can see how that would work, following the example on https://www.baeldung.com/jersey-filters-interceptors#1-implementing-a-readerinterceptor. In that interceptor, the inputstream can be wrapped (and bytes counted), according to given limits. Will give it some more thought as well.

jansupol commented 2 months ago

The low-hanging fruit, to limit the number of parts in the multipart entity, is implemented by #5643.