taylorfinnell / awscr-s3

A Crystal shard for S3.
https://taylorfinnell.github.io/awscr-s3/
MIT License
81 stars 48 forks source link

Handle objects with slashes #87

Closed matthewmcgarvey closed 3 years ago

matthewmcgarvey commented 3 years ago

Fixes #86

Switches from using URI.encode_www_form to URI.encode. Before, it was encoding the slash in the path when it shouldn't have. Check out the documentation for more info on the differences.

taylorfinnell commented 3 years ago

Thanks! I am still a bit confused because https://github.com/taylorfinnell/awscr-s3/issues/60 actually does work in master. But breaks with this change. And I am not sure which is correct :)

There could also be an issue in awscr-signer, not sure (tho it passes the aws test suite except for the noted limitations)

matthewmcgarvey commented 3 years ago

@taylorfinnell Could you take a second look at this? With the release of Crystal 0.36.1, this change needs to go through so that a new version of the library can be released.

taylorfinnell commented 3 years ago

Thanks guys,

I think it should encode the same way the signer does. Here is how the AWS Ruby gem does it

https://github.com/aws/aws-sdk-ruby/blob/07ed3aa1a682da8244f2f7eeaebf42e5369e6441/gems/aws-sigv4/lib/aws-sigv4/signer.rb#L694

Here is how awscr-signer does it

https://github.com/taylorfinnell/awscr-signer/blob/master/src/awscr-signer/v4/request.cr#L82

Here is how their client escapes the path: https://github.com/aws/aws-sdk-ruby/blob/203aaba7242357884fe182be8f2d36f77428b8a7/gems/aws-sdk-core/lib/seahorse/util.rb#L10

They escape the same way in both the signer and the client.

I wonder if we can use the same code as awscr-signer. It appears to work for me on the test object keys added in this PR.

diff --git a/examples/object_put_get.cr b/examples/object_put_get.cr
index b443b9c..bb6c6ad 100644
--- a/examples/object_put_get.cr
+++ b/examples/object_put_get.cr
@@ -12,9 +12,9 @@ client = Awscr::S3::Client.new(
   aws_secret_key: SECRET
 )

-object = UUID.random.to_s
+["notes/object.txt", "test=", "test object", "test'"].each do |object|
+  client.put_object(bucket: BUCKET, object: object, body: IO::Memory.new("Hey!"))

-client.put_object(bucket: BUCKET, object: object, body: IO::Memory.new("Hey!"))
-
-resp = client.get_object(bucket: BUCKET, object: object)
-puts resp.body
+  resp = client.get_object(bucket: BUCKET, object: object)
+  puts resp.body
+end
diff --git a/src/awscr-s3/client.cr b/src/awscr-s3/client.cr
index 9216599..6cbc2d3 100644
--- a/src/awscr-s3/client.cr
+++ b/src/awscr-s3/client.cr
@@ -94,7 +94,7 @@ module Awscr::S3
     # ```
     def start_multipart_upload(bucket : String, object : String,
                                headers : Hash(String, String) = Hash(String, String).new)
-      resp = http.post("/#{bucket}/#{URI.encode_www_form(object)}?uploads", headers: headers)
+      resp = http.post("/#{bucket}/#{encode(object)}?uploads", headers: headers)

       Response::StartMultipartUpload.from_response(resp)
     end
@@ -108,7 +108,7 @@ module Awscr::S3
     # ```
     def upload_part(bucket : String, object : String,
                     upload_id : String, part_number : Int32, part : IO | String)
-      resp = http.put("/#{bucket}/#{URI.encode_www_form(object)}?partNumber=#{part_number}&uploadId=#{upload_id}", part)
+      resp = http.put("/#{bucket}/#{encode(object)}?partNumber=#{part_number}&uploadId=#{upload_id}", part)

       Response::UploadPartOutput.new(
         resp.headers["ETag"],
@@ -141,7 +141,7 @@ module Awscr::S3
         end
       end

-      resp = http.post("/#{bucket}/#{URI.encode_www_form(object)}?uploadId=#{upload_id}", body: body)
+      resp = http.post("/#{bucket}/#{encode(object)}?uploadId=#{upload_id}", body: body)
       Response::CompleteMultipartUpload.from_response(resp)
     end

@@ -154,7 +154,7 @@ module Awscr::S3
     # p resp # => true
     # ```
     def abort_multipart_upload(bucket : String, object : String, upload_id : String)
-      resp = http.delete("/#{bucket}/#{URI.encode_www_form(object)}?uploadId=#{upload_id}")
+      resp = http.delete("/#{bucket}/#{encode(object)}?uploadId=#{upload_id}")

       resp.status_code == 204
     end
@@ -182,7 +182,7 @@ module Awscr::S3
     # p resp # => true
     # ```
     def delete_object(bucket, object, headers : Hash(String, String) = Hash(String, String).new)
-      resp = http.delete("/#{bucket}/#{URI.encode_www_form(object)}", headers)
+      resp = http.delete("/#{bucket}/#{encode(object)}", headers)

       resp.status_code == 204
     end
@@ -228,7 +228,7 @@ module Awscr::S3
     # ```
     def put_object(bucket, object : String, body : IO | String | Bytes,
                    headers : Hash(String, String) = Hash(String, String).new)
-      resp = http.put("/#{bucket}/#{URI.encode_www_form(object)}", body, headers)
+      resp = http.put("/#{bucket}/#{encode(object)}", body, headers)

       Response::PutObjectOutput.from_response(resp)
     end
@@ -241,7 +241,7 @@ module Awscr::S3
     # p resp.body # => "MY DATA"
     # ```
     def get_object(bucket, object : String, headers : Hash(String, String) = Hash(String, String).new)
-      resp = http.get("/#{bucket}/#{URI.encode_www_form(object)}", headers: headers)
+      resp = http.get("/#{bucket}/#{encode(object)}", headers: headers)

       Response::GetObjectOutput.from_response(resp)
     end
@@ -255,7 +255,7 @@ module Awscr::S3
     # end
     # ```
     def get_object(bucket, object : String, headers : Hash(String, String) = Hash(String, String).new)
-      http.get("/#{bucket}/#{URI.encode_www_form(object)}", headers: headers) do |resp|
+      http.get("/#{bucket}/#{encode(object)}", headers: headers) do |resp|
         yield Response::GetObjectStream.from_response(resp)
       end
     end
@@ -270,7 +270,7 @@ module Awscr::S3
     # p resp.last_modified # => "Wed, 19 Jun 2019 11:55:33 GMT"
     # ```
     def head_object(bucket, object : String, headers : Hash(String, String) = Hash(String, String).new)
-      resp = http.head("/#{bucket}/#{URI.encode_www_form(object)}", headers: headers)
+      resp = http.head("/#{bucket}/#{encode(object)}", headers: headers)
       Response::HeadObjectOutput.from_response(resp)
     end

@@ -296,5 +296,11 @@ module Awscr::S3
     private def http
       @http
     end
+
+    private def encode(object_id)
+      String.build do |io|
+        URI.encode(object_id, io) { |byte| URI.unreserved?(byte) || byte.chr == '/' || byte.chr == '~' }
+      end
+    end
   end
 end
matthewmcgarvey commented 3 years ago

@taylorfinnell I switched to matching the signer and thanks for the links to the ruby source. I could not find it in that monster of a codebase!

taylorfinnell commented 3 years ago

Thank you!