httprb / http

HTTP (The Gem! a.k.a. http.rb) - a fast Ruby HTTP client with a chainable API, streaming support, and timeouts

Encoded plus (+) %2B in path segment somehow not working #654

Closed jrochkind closed 1 year ago

jrochkind commented 3 years ago

I have an S3 path that has a "+" in it, so its encoded URI ends up with a "%2B" in it. Specifically, the S3 key is "A+B.txt", so the working S3 URL is -- and this is an actual working public URL that demonstrates it -- https://scih-data-dev.s3.amazonaws.com/A%2BB.txt

You can access it with your browser; it's a file that just contains the string "content\n".

It can also be accessed with Net::HTTP:

irb(main):089:0> Net::HTTP.get(URI.parse("https://scih-data-dev.s3.amazonaws.com/A%2BB.txt"))
=> "content\n"

No problem. It can also be accessed via curl, for what it's worth: curl "https://scih-data-dev.s3.amazonaws.com/A%2BB.txt"

When I try to get it via http.rb, however, S3 returns a 403 error.

irb(main):098:0> HTTP.get("https://scih-data-dev.s3.amazonaws.com/A%2BB.txt")
=> #<HTTP::Response/1.1 403 Forbidden {"X-Amz-Request-Id"=>"BF0W8PB090MAQW5B", "X-Amz-Id-2"=>"BTkuhpvYFsvHKdudR4W/tYNKhlq9I0LG188oWJ2MHLih7MErl32414MzRz7tpSGnQec134oVcYo=", "Content-Type"=>"application/xml", "Transfer-Encoding"=>"chunked", "Date"=>"Mon, 22 Mar 2021 21:02:57 GMT", "Server"=>"AmazonS3", "Connection"=>"close"}>

I think http.rb is doing something weird with escaping/unescaping there and requesting a different URL, which would then return a 403 from S3. I haven't debugged this further yet, so I'm not sure exactly what HTTP request is being made to the server -- but apparently not the right one, since all those other tools can retrieve this URL.
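One way to find out exactly what request line a client puts on the wire is to point it at a throwaway local server and record the first line it receives. The sketch below uses only Ruby's standard library; `capture_request_line` is a hypothetical helper name, and the server is deliberately minimal (one connection, empty response), so it is a debugging aid rather than anything resembling S3.

```ruby
require "socket"
require "net/http"
require "uri"

# Hypothetical helper: starts a one-shot local TCP server, yields its base
# URL to the block, and returns the raw request line of the first request.
def capture_request_line
  server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
  port = server.addr[1]

  listener = Thread.new do
    client = server.accept
    request_line = client.gets.to_s.chomp
    # Drain the remaining request headers up to the blank line.
    while (line = client.gets) && line != "\r\n"
    end
    client.write("HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")
    client.close
    request_line
  end

  yield "http://127.0.0.1:#{port}"
  listener.value # waits for the thread and returns the captured line
ensure
  server&.close
end

request_line = capture_request_line do |base|
  Net::HTTP.get(URI.parse("#{base}/A%2BB.txt"))
end
puts request_line
```

With the http gem loaded, replacing the block body with `HTTP.get("#{base}/A%2BB.txt")` should show what http.rb sends for the same URL, which can then be compared against the Net::HTTP line.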

Is there a bug of some kind when the URL contains %2B?
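As an illustration of how a "%2B" could get lost, here is a sketch of one kind of URL normalization that would produce a different path. This is a guess at a mechanism, not a confirmed description of what http.rb does: `lossy_normalize_path` is a hypothetical function that decodes every percent-escape and then re-encodes only the characters that are not legal in a path. Because "+" is legal in a path segment under RFC 3986, it is never re-encoded, so the round trip is not faithful.

```ruby
require "uri"

ORIGINAL_URL = "https://scih-data-dev.s3.amazonaws.com/A%2BB.txt"

# Hypothetical lossy normalizer (ASCII-only sketch, not http.rb's code):
# 1. decode every percent-escape in the path
# 2. re-encode only characters that are not allowed in a path
def lossy_normalize_path(path)
  decoded = path.gsub(/%(\h\h)/) { $1.hex.chr }
  decoded.gsub(/[^A-Za-z0-9\-._~!$&'()*+,;=:@\/]/) do |ch|
    ch.bytes.map { |b| format("%%%02X", b) }.join
  end
end

# Ruby's stdlib parser leaves the percent-encoded path untouched.
stdlib_path = URI.parse(ORIGINAL_URL).path

puts stdlib_path
puts lossy_normalize_path(stdlib_path)
puts lossy_normalize_path("/a%20b.txt") # an escaped space survives the round trip
```

If a client did something like this, the server would receive a literal "+" in the path instead of "%2B", and a server that gives "+" a different meaning from "%2B" would look up a different key.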