quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.59k stars 2.63k forks source link

RESTEasy Reactive - Server side multipart implementation is not being respected #27083

Closed technical-debt-collector closed 2 years ago

technical-debt-collector commented 2 years ago

Describe the bug

We have a multipart endpoint which accepts two parts handled as Strings such as this:

public class MultipartUpload {

    @RestForm
    @PartType(MediaType.TEXT_PLAIN)
    public String textPart;

    @RestForm
    @PartType(MediaType.APPLICATION_OCTET_STREAM)
    public String anotherTextPart;

}

This is after we migrated to RESTEasy Reactive. With the exception of anotherTextPart being of type InputStream instead nothing has really changed since then, so essentially a multipart where we're not interested in working with the parts as files.

What we weren't aware of was that some of our users are sending us parts with a filename provided, and whenever they do these two parts end up as null. I've tried looking into the implementation and it kind of looks like the MultipartParser will store everything into a file whenever a filename is provided. Haven't really understood the code completely but my guess is that the implementation will then fail to retrieve the data that was read unless the multipart is defined with File types. But I might be wrong here though.

Looking at the RFC-2183 for the Content-Disposition header, this behavior does not seem in line with the specification:

The presence of the filename parameter does not force an implementation to write the entity to a separate file. It is perfectly acceptable for implementations to leave the entity as part of the normal mail stream unless the user requests otherwise. As a consequence, the parameter may be used on any MIME entity, even `inline' ones. These will not normally be written to files, but the parameter could be used to provide a filename if the receiving user should choose to write the part to a file.

The users haven't changed anything in their implementation so it seems pretty evident that this has changed between Classic and Reactive.

Expected behavior

The server side implementation should be respected regardless of what the client sends. Unless a RESTEasy server multipart has implemented it, don't write parts of a multipart request to file, even if the Content-Disposition includes a filename. The request should still work as if the client had not provided a file name for the part in question, and such is the case in Classic.

Actual behavior

If a client sends a part with filename in the Content-Disposition header, the receiving end implementation writes the data to a file regardless of what the server side multipart implementation looks like. However, if the server side does not implement the part as a file it will not be able to retrieve the data for further use, i.e the part becomes null.

How to Reproduce?

No response

Output of uname -a or ver

No response

Output of java -version

No response

GraalVM version (if different from Java)

No response

Quarkus version or git rev

No response

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

No response

quarkus-bot[bot] commented 2 years ago

/cc @FroMage, @geoand, @stuartwdouglas

geoand commented 2 years ago

The users haven't changed anything in their implementation so it seems pretty evident that this has changed between Classic and Reactive.

This is correct - we explicitly mention that the multipart support has been designed from scratch.

technical-debt-collector commented 2 years ago

But is this something you'd consider fixing? I think it's really unfortunate that an optional header value can ruin a completely valid request. For the time being we've had to revert back to RE classic.

geoand commented 2 years ago

Your issue is that you would like things that have a filename to not be considered a file?

If so, I'm pretty sure we've discussed it before (I need to find the issue) and rejected it, but I'd be open to hearing why it's so important

technical-debt-collector commented 2 years ago

My issue is that I'm creating a resource method in which I don't want to handle the data as files. If I'm not mistaken, this is exactly what happens now, and it's all up to the client to decide.

The server shouldn't have to store it as a file unless the request method is implemented to handle them as such. Shouldn't the users developing the actual endpoint decide what happens with the data?

But even if the above behavior is wanted, the resource method seemingly isn't able to use the data for some reason. As mentioned the parts are null when accessed. Why don't you think this is worth fixing?

By the way, are you thinking about this one? https://github.com/quarkusio/quarkus/issues/20938 They're pretty closely related although they are kind of opposites in this case if you're asking me.

geoand commented 2 years ago

If I'm not mistaken, this is exactly what happens now, and it's all up to the client to decide

The reason it's done the this way is to avoid having large payloads in memory.

But even if the above behavior is wanted, the resource method seemingly isn't able to use the data for some reason. As mentioned the parts are null when accessed. Why don't you think this is worth fixing

Not sure exactly what you are referring to here.

technical-debt-collector commented 2 years ago

While creating a reproducer I discovered this log message which clearly explains the situation: 2022-08-09 10:55:47,165 WARN [org.jbo.res.rea.ser.han.RequestDeserializeHandler] (executor-thread-0) Attribute 'anotherTextPart' of the multipart request is a file and therefore its value is not set. To obtain the contents of the file, use type 'interface org.jboss.resteasy.reactive.multipart.FileUpload' as the field type. This was of course hidden in our application due to the log configuration we're running with :)

Anyhow, the reproducer can be found here: https://github.com/technical-debt-collector/resteasy-multipart-reproducer There's a test class there that verifies the behaviour (resource methods are annotated as @Blocking so that we may intercept and print out the content headers during the requests).

After having looked at this again it seems like certain content types must be in place too (along with a filename) in order for RE Reactive to consider the part a file. If the test provides the values as text/plain it works, but if it provides the data as a byte[], then we get the log message and are left with an inaccessible part in the resource method.

In either case I still think this is unfortunate as we can't make the same multipart resource method handle both FileUploads and Strings as RE Classic would. Changing the multipart object to FileUpload as suggested by the log statement would fix the problem we're facing but would just introduce another in the form of https://github.com/quarkusio/quarkus/issues/20938.

geoand commented 2 years ago

Maybe @Sgitario can take a look at this when he has time with a fresh pair of eyes 😉

Sgitario commented 2 years ago

To sum up, what you're requesting (and what spec conforms) is that when getting the following request:

Content-Disposition: form-data; name="textPart"; filename="my_file.txt"
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary

Then, if we implement our resource as:

@Path("/example")
public class ExampleResource {

    @POST
    @Consumes(MediaType.MULTIPART_FORM_DATA)
    @Produces(MediaType.TEXT_PLAIN)
    public Uni<Response> post(@MultipartForm MultipartUpload request) {
    }
}

public class MultipartUpload {
    @RestForm
    @PartType(MediaType.TEXT_PLAIN)
    public String textPart;
}

Then, regardless of the value of the Content-Disposition and filename, the server should copy the content of the file my_file.txt into the textPart field.

Did I understand well the issue? If so, and taking into account what the spec says, I think we should indeed fix this issue.

technical-debt-collector commented 2 years ago

Yes, exactly.

teedjay commented 2 years ago

And likewise for binary content (which works great with regular RESTEasy), something like this :

@PartType(MediaType.APPLICATION_OCTET_STREAM)
public byte[] binaryPart;

@PartType(MediaType.APPLICATION_OCTET_STREAM)
public InputStream streamPart;
Sgitario commented 2 years ago

fyi https://github.com/quarkusio/quarkus/pull/27498

FroMage commented 2 years ago

In #27526 I changed this a bit.

The reasons are about UX:

So, I've kept the new features of this PR, but extended it to make it more general.