sharat87 / httpbun

An HTTP server with APIs useful in testing HTTP clients. Inspired by httpbin, but isn't a clone.
https://httpbun.com
Apache License 2.0

Image put/post returns no content #5

Closed kuhnroyal closed 1 year ago

kuhnroyal commented 1 year ago

Thanks for this project and for providing this as a service!

I am trying to move some code that used httpbin and puts/posts an image/json. When trying this with httpbun, I get a 200 response with the content missing. I would still expect to get a JSON response. httpbin returns a data field containing data:application/octet-stream;base64,DATA.

Is that something that could be implemented?

kuhnroyal commented 1 year ago

It actually returns Transfer-Encoding: chunked which might be the root of the problem.

sharat87 commented 1 year ago

@kuhnroyal, thanks for raising this. Can you share a minimal curl command that uploads a JPEG and demonstrates this issue please? It'll help me ensure we're on the same page, and that I'm not out investigating some other non-issue. Thanks!

kuhnroyal commented 1 year ago

Ok, this is kinda hard. My problem seems to be connected to the content length.

This one works:

curl --http1.1 -H 'user-agent: Dart/3.0 (dart:io)' -H 'content-type: image/png' --compressed -X PUT https://httpbun.org/put -d '[raw PNG bytes pasted inline; not reproducible as text, omitted]'

But if I increase the content length to ~15000 (I can't properly add this here), I start getting "curl: (18) transfer closed with outstanding read data remaining", or, in mitmproxy, "HTTP/1 protocol error: peer closed connection without sending complete message body (incomplete chunked read)".

Sometimes I also get an nginx error with 502.

sharat87 commented 1 year ago

Ahh okay, now that's an important piece of the puzzle. I've actually limited the incoming payload body size to 10000. I originally did this to avoid a single large incoming payload choking my resources, but looking at it now, it looks a little low.

Related code at https://github.com/sharat87/httpbun/blob/b7c19538ff81fb7fc055d74cf0ac170f0f195d5d/mux/mux.go#L60.

Can you confirm that if the payload is within 10kb, then things do work fine and as expected? 🤔 If yes, then the whole problem is with this explicit limit.
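
For readers unfamiliar with this kind of cap, here is a minimal Python sketch of the general idea. It is not httpbun's code (httpbun is written in Go, and the real limit lives in the linked mux.go). The names read_limited, BodyTooLarge and MAX_BODY_BYTES are hypothetical, and treating the 10000 figure as a byte count is an assumption.

```python
import io

# Assumption: the 10000 mentioned in the thread is a byte count.
MAX_BODY_BYTES = 10_000


class BodyTooLarge(Exception):
    """Raised when a request body exceeds the configured cap (hypothetical)."""


def read_limited(stream, limit=MAX_BODY_BYTES):
    # Read one byte more than the limit so an oversized body can be
    # detected without pulling the whole payload into memory.
    body = stream.read(limit + 1)
    if len(body) > limit:
        raise BodyTooLarge(f"request body exceeds {limit} bytes")
    return body


# A body at the limit is accepted; a ~15000-byte one is rejected.
small_body = read_limited(io.BytesIO(b"x" * 10_000))
```

A server that rejects the body this way but then drops the connection without a clean error response could plausibly produce the truncated, chunked replies reported above, though the thread does not confirm the exact mechanism.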

kuhnroyal commented 1 year ago

Well, httpbun returns 12288 as the highest content length I can reach by playing around in the CLI. But the difference is probably encoding related.

"headers": {
    "Accept": "*/*",
    "Accept-Encoding": "deflate, gzip",
    "Connection": "close",
    "Content-Length": "12288",
    "Content-Type": "image/png",
    "User-Agent": "Dart/3.0 (dart:io)"
  }

I will try to use a smaller sample image in my tests and see how it goes. Thanks!

kuhnroyal commented 1 year ago

Ok it consistently works with smaller samples. I am now just wondering if it would make more sense to return the binary data as Base64. I currently seem to be unable to compare it to the uploaded data.

sharat87 commented 1 year ago

I am now just wondering if it would make more sense to return the binary data as Base64. I currently seem to be unable to compare it to the uploaded data.

Good point. I'm a little hesitant about changing the behavior of the data field in the resulting JSON. So, perhaps add a base64Data field? But that would mean the uploaded image would show up >2x in the resulting JSON. Is that... okay? 🤔

sharat87 commented 1 year ago

On another note, I just realized that we do respond with Base64, when using a multipart upload content type. For example,

curl -d @image.jpg https://httpbun.com/anything

This results in the data field having the escaped binary of the payload image.

But if we do this:

curl -F image=@image.jpg https://httpbun.com/anything

This results in the data field being empty, but the files field gets an image field, with the Base64 of the uploaded image.

I'm now thinking this is fine and expected. The best I think we can do to improve the situation with raw file uploads (not multipart) is perhaps to add the base64Data field, as described in the previous comment. Let me know if that sounds useful. I'm not fully decided on whether we should add that.
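
One caveat with the raw upload command above (curl -d @image.jpg): per the curl manual, when -d reads its data from a file, carriage returns and newlines are stripped out, while --data-binary sends the file unmodified. For binary files this alone can make the echoed payload differ from the original. The Python sketch below only simulates that stripping on the 8-byte PNG signature; the helper name strip_like_curl_d is hypothetical.

```python
# The fixed 8-byte signature every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def strip_like_curl_d(file_bytes: bytes) -> bytes:
    """Mimic the documented CR/LF stripping of `curl -d @file` (simulation only)."""
    return file_bytes.replace(b"\r", b"").replace(b"\n", b"")


sent = strip_like_curl_d(PNG_SIGNATURE)
print(len(PNG_SIGNATURE), len(sent))  # prints: 8 5
```

So for a byte-for-byte comparison test, uploading with --data-binary @image.jpg is the safer choice.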

kuhnroyal commented 1 year ago

Hmm multipart doesn't help, because I want to test the normal upload :) What about a query flag to control the response format?

sharat87 commented 1 year ago

Hmm multipart doesn't help, because I want to test the normal upload :)

Fair enough. 🙂

What about a query flag to control the response format?

I'm wary of doing this, since query params are just included in the response JSON, with absolutely no special behaviour to them. I'd like to not add exceptions to that rule. (Rule being, no query param has any effect on the API's behaviour).

So, let me circle back to the question here. We want to upload raw payload data, not as a multipart form request, and then get that binary payload back, encoded with Base64. Is this problem statement correct?

We already have a /payload, which responds with the request payload verbatim, with the same Content-Type. So, if we add a similar /payload-base64, which would respond with the payload encoded with Base64, and Content-Type always being text/plain, ... that should solve for what you're looking for, right?

kuhnroyal commented 1 year ago

So, let me circle-back to the question here. We want to upload raw payload data, not as a multipart form request, and then get that binary payload back, encoded with Base64. Is this problem statement correct?

Yes, but.. The only reason I want this is that I have an old test which did it that way against httpbin, and I can't figure out a way to compare the binary response content. I am uploading a PNG file and trying to convert the response data field back to a PNG, but it never matches. I don't need Base64 if I can solve that.

Do you have a similar test somewhere?
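
A possible explanation for the mismatch, sketched in Python under a loud assumption: the thread does not say how the data field is produced, so this assumes the server decodes the raw bytes as UTF-8 text and substitutes a replacement character for invalid sequences before JSON-encoding. Under that assumption the original bytes cannot be recovered, whereas a Base64 field (shown here with the data:application/octet-stream;base64, prefix that httpbin is described as using earlier in this thread) round-trips exactly.

```python
import base64
import json

# The start of a PNG file: signature followed by the IHDR chunk header.
original = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"

# Assumed lossy path: bytes -> text with replacement characters -> JSON.
lossy_json = json.dumps({"data": original.decode("utf-8", errors="replace")})
recovered = json.loads(lossy_json)["data"].encode("utf-8")

# Base64 path, using a data-URL style prefix.
prefix = "data:application/octet-stream;base64,"
b64_json = json.dumps({"data": prefix + base64.b64encode(original).decode("ascii")})
field = json.loads(b64_json)["data"]
restored = base64.b64decode(field.split(",", 1)[1])

print(recovered == original)  # False: byte 0x89 became a replacement character
print(restored == original)   # True: Base64 preserves every byte
```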

sharat87 commented 1 year ago

I am uploading a PNG file and try to convert the response data field back to a png but it never matches.

Got it. I don't have a test for this yet, but let me look into this. This is a problem statement I can test and troubleshoot. I'll get back soon, depending on bandwidth. 🙂

sharat87 commented 1 year ago

So, the behaviour here just feels haphazard and incomplete (the requests below go to postman-echo.com, not httpbin). Check this out.

When I upload an image with the right image content-type, I don't get anything back.

> curl -d @image.jpg -H'content-type: image/jpeg' postman-echo.com/post
{
  "args": {},
  "data": {},
  "files": {},
  "form": {},
  "headers": {
    "x-forwarded-proto": "http",
    "x-forwarded-port": "80",
    "host": "postman-echo.com",
    "x-amzn-trace-id": "Root=1-6483f228-6e46d26d1c48dacd6bf60dc9",
    "content-length": "9022",
    "user-agent": "curl/7.88.1",
    "accept": "*/*",
    "content-type": "image/jpeg"
  },
  "json": null,
  "url": "http://postman-echo.com/post"
}

However, if I do it with application/octet-stream, the data field is an object!!!

> curl -d @image.jpg -H'content-type: application/octet-stream' postman-echo.com/post
{
  "args": {},
  "data": {
    "type": "Buffer",
    "data": [
      255,
      216,
      ...
      184,
      244,
      168,
      80
    ]
  },
  "url": "http://postman-echo.com/post"
}

That data.data list contains all the byte numbers, I think. Very long list.

This is one way to do it, instead of base64. I'm a little split on which would be better here. Thoughts?

But the behavior for image/jpeg definitely seems odd. I'd expect it to do the same thing as application/octet-stream. 🤷

kuhnroyal commented 1 year ago

Yea the 2nd output makes a lot more sense and is probably enough to compare my sent data.

sharat87 commented 1 year ago

Okay, even with application/octet-stream, postman-echo.com responds with the byte integer list, but httpbin.org (and pie.dev) respond with the base64 of the data. 🤷

But this also made the response size from postman-echo.com ~13x that from httpbin.org. For example, 16 KB became 200 KB.

Considering this, I'm leaning towards base64. Also, from what I know, most languages come with standard utility functions to convert a base64 string to a byte array (Python and Golang both do), which is usually more straightforward than converting a list of integers into a byte array.

I'm going to soon start with the base64 implementation, similar to how httpbin does it. Hope that's okay?
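
To make the trade-off concrete, here is a small Python sketch comparing the two response shapes for the same payload and decoding each back to bytes. The JSON layouts are illustrative assumptions modelled on the outputs quoted in this thread, not the exact responses of any of the services, and the size ratio it prints depends on the payload and on pretty-printing, so it is not meant to reproduce the ~13x figure.

```python
import base64
import json

# A deterministic 16 KiB stand-in for an uploaded image.
payload = bytes(range(256)) * 64

# Base64 string in the data field (the approach described for httpbin).
b64_response = json.dumps({"data": base64.b64encode(payload).decode("ascii")})

# Byte-integer list, shaped like the postman-echo.com output quoted above.
list_response = json.dumps(
    {"data": {"type": "Buffer", "data": list(payload)}}, indent=2
)

# Decoding either form back to bytes is a one-liner in Python.
from_b64 = base64.b64decode(json.loads(b64_response)["data"])
from_list = bytes(json.loads(list_response)["data"]["data"])

print(len(payload), len(b64_response), len(list_response))
```

Both forms recover the original bytes; the difference is mainly response size, since Base64 adds roughly one third while the integer list spends several characters per byte.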

kuhnroyal commented 1 year ago

That would be great!

kuhnroyal commented 1 year ago

Thanks, confirmed working!