Kong / unirest-java

Unirest in Java: Simplified, lightweight HTTP client library.
http://kong.github.io/unirest-java/
MIT License

multipart/form-data POST is missing content types #516

Closed ojhwel closed 7 months ago

ojhwel commented 7 months ago

Describe the bug

I'm trying to POST a multipart message consisting of a file and some JSON metadata, but the outgoing request is missing the appropriate content types.

To Reproduce

DataSource content = ...
HttpResponse<ReturnType> response = unirest.post("/endpoint")
    .basicAuth(user, password)
    .field("content",
            content.getInputStream(),
            ContentType.create(content.getContentType()),
            content.getName())
    .field("metadata",
            metadata,
            ContentType.APPLICATION_JSON.getMimeType())
    .asObject(ReturnType.class);

Expected behavior

I'd expect the request to look something like this:

POST http://localhost:8080/endpoint
Authorization=Basic dW5pcmVzdA==
Content-Type=multipart/form-data
===================================
--cc3acdc2-d912-4f4f-83ac-9c261016d278
Content-Disposition: form-data; name="content"; filename="a_simple.pdf"
Content-Type: application/pdf
<BINARY DATA>

--ed5ea565-50ec-475c-a53a-14c2df83b8c4
Content-Disposition: form-data; name:"metadata"
Content-Type: application/json
{"some":"json","data":"values"}

Screenshots

What I'm getting, as printed by .toSummary().asString(), is this:

POST http://localhost:8080/endpoint
Authorization=Basic dW5pcmVzdA==
===================================
--cc3acdc2-d912-4f4f-83ac-9c261016d278
Content-Disposition: form-data; name="content"; filename="a_simple.pdf"
Content-Type: application/octet-stream
<BINARY DATA>

--ed5ea565-50ec-475c-a53a-14c2df83b8c4
Content-Disposition: form-data; name:"metadata"
{"some":"json","data":"values"}
  1. The content type for the PDF is changed to "application/octet-stream". (I have checked many times that ContentType.APPLICATION_PDF is what's being set.)
  2. The JSON string has no content type at all
  3. Less important: "Content-Type multipart/form-data" is also missing. I can fix this via a .header() call but I don't think that should be necessary when I have two field()s.


ryber commented 7 months ago

So the .toSummary().asString() output is an approximation and is NOT the exact thing that was sent to the server; the two may differ. For example, the main content header is actually sent, it's just not in the summary. This is because some aspects of multipart processing are handled by the Java HttpClient and are not visible to Unirest, though other parts of the summary could certainly be better. I'm going to create a test to expose which parts are a defect in sending the data to a server and which are an issue with the summary.

ojhwel commented 7 months ago

Thanks for the quick response. The trigger for me looking into this was a 415 Unsupported Media Type response citing "application/octet-stream", although the server-side folks couldn't immediately say which part was the culprit. That's when I added the Interceptor.

ryber commented 7 months ago

So, a little more information. The summary is indeed way off, for a bunch of reasons, mostly because you can ask for the summary at any point in the lifecycle, including before the request has been made, and multipart forms in particular don't fully come together until the end. I consider that a defect of sorts. It's never going to be 100% accurate, but it needs to get closer.

As for the actual behavior of Unirest: we have a nice suite of behavioral tests that stand up a little Jetty server, which takes requests and echoes back what was sent. This is the most accurate representation of the request.

You can see an example test here: https://github.com/Kong/unirest-java/blob/main/unirest-bdd-tests/src/test/java/BehaviorTests/MultiPartFormPostingTest.java#L426-L451
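
For readers who want to try the same echo-back idea without the project's test suite, here is a minimal sketch. It is not the Jetty-based harness linked above: it swaps in the JDK's built-in com.sun.net.httpserver.HttpServer and java.net.http.HttpClient (Java 11+) and sends a hand-built multipart body rather than going through Unirest. The class name EchoServerSketch, the /echo path, and the boundary value are made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class EchoServerSketch {

    /** Starts a server on a free local port that replies with the request's Content-Type and raw body. */
    static HttpServer startEchoServer() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/echo", exchange -> {
            String contentType = exchange.getRequestHeaders().getFirst("Content-Type");
            byte[] body = exchange.getRequestBody().readAllBytes();
            String echoed = "Content-Type: " + contentType + "\n\n"
                    + new String(body, StandardCharsets.UTF_8);
            byte[] out = echoed.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, out.length);
            try (var os = exchange.getResponseBody()) {
                os.write(out);
            }
        });
        server.start();
        return server;
    }

    /** Posts the given body to a throwaway echo server and returns what the server saw. */
    static String postAndEcho(String contentType, String body) throws Exception {
        HttpServer server = startEchoServer();
        try {
            URI uri = URI.create("http://localhost:" + server.getAddress().getPort() + "/echo");
            HttpRequest request = HttpRequest.newBuilder(uri)
                    .header("Content-Type", contentType)
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();
            return HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString())
                    .body();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        String boundary = "example-boundary"; // hypothetical boundary for the sketch
        String body = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"metadata\"\r\n"
                + "Content-Type: application/json\r\n"
                + "\r\n"
                + "{\"some\":\"json\",\"data\":\"values\"}\r\n"
                + "--" + boundary + "--\r\n";
        System.out.println(postAndEcho("multipart/form-data; boundary=" + boundary, body));
    }
}
```

Pointing a real client at a server like this shows the headers and part boundaries as they arrive on the wire, which is the information the summary can only approximate.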

This has uncovered a few things that might result in your 415:

  1. The main content type sent for the overall request has multiple params in it and looks like this: multipart/form-data; boundary=4ebf68bc-70f8-462b-b3a5-48dadb236af3;charset=UTF-8

It might be that the server is looking for a strict "multipart/form-data" value and is not expecting the additional params.

  2. As for each of the parts: the PDF section was actually fine, it sends the type along as "application/pdf". However, the JSON part is missing a Content-Type entirely. I have a fix for this that is going out with 4.2.8. (Maven's website lags behind what is actually in the repo by up to a day, so give it a bit of time.)

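
Both findings can be checked mechanically on a captured request. The sketch below is plain Java string handling, not part of Unirest: mediaType() shows how a server could compare the overall Content-Type while tolerating extra parameters such as boundary and charset, and partContentTypes() lists which parts of a multipart body declare a Content-Type. The class and method names are hypothetical, and the parsing is deliberately naive (it assumes a text body with CRLF line endings and a boundary that does not occur inside the part data).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

public class MultipartInspect {

    /** Returns the bare media type without parameters, lower-cased. */
    static String mediaType(String contentTypeHeader) {
        int semi = contentTypeHeader.indexOf(';');
        String type = semi < 0 ? contentTypeHeader : contentTypeHeader.substring(0, semi);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    /** Maps each part's Content-Disposition value to its Content-Type value, or null if the part declares none. */
    static Map<String, String> partContentTypes(String body, String boundary) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String part : body.split(Pattern.quote("--" + boundary))) {
            // Skip the empty preamble and the trailing "--" of the closing delimiter.
            if (part.isBlank() || part.startsWith("--")) {
                continue;
            }
            int headerEnd = part.indexOf("\r\n\r\n");
            String headerBlock = headerEnd < 0 ? part : part.substring(0, headerEnd);
            String disposition = null;
            String contentType = null;
            for (String line : headerBlock.split("\r\n")) {
                String lower = line.toLowerCase(Locale.ROOT);
                if (lower.startsWith("content-disposition:")) {
                    disposition = line.substring(line.indexOf(':') + 1).trim();
                } else if (lower.startsWith("content-type:")) {
                    contentType = line.substring(line.indexOf(':') + 1).trim();
                }
            }
            if (disposition != null) {
                result.put(disposition, contentType);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String header = "multipart/form-data; boundary=4ebf68bc-70f8-462b-b3a5-48dadb236af3;charset=UTF-8";
        System.out.println(header.equals("multipart/form-data"));            // strict comparison
        System.out.println(mediaType(header).equals("multipart/form-data")); // parameter-tolerant comparison

        String boundary = "example-boundary"; // hypothetical boundary for the sketch
        String body = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"content\"; filename=\"a_simple.pdf\"\r\n"
                + "Content-Type: application/pdf\r\n\r\n"
                + "<BINARY DATA>\r\n"
                + "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"metadata\"\r\n\r\n"
                + "{\"some\":\"json\",\"data\":\"values\"}\r\n"
                + "--" + boundary + "--\r\n";
        partContentTypes(body, boundary)
                .forEach((disposition, type) -> System.out.println(disposition + " -> " + type));
    }
}
```

In the sample body the metadata part has no Content-Type line, so it maps to null, which mirrors the defect described in point 2.
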
ryber commented 7 months ago

4.2.9 is on its way out the door and has a more accurate summary

ojhwel commented 7 months ago

Thanks, that works perfectly for our purposes.

ryber commented 7 months ago

closing this as complete