cognitect-labs / aws-api

AWS, data driven

Using the s3 client with DigitalOcean Spaces by changing the endpoint. #150

Open curioustolearn opened 3 years ago

curioustolearn commented 3 years ago

Thank you for creating this library.

Dependencies


[com.cognitect.aws/api "0.8.474"]
[com.cognitect.aws/endpoints "1.1.11.842"]
[com.cognitect.aws/s3 "809.2.734.0"]

Description with failing test case

I am trying to use the client to connect to DigitalOcean Spaces. As noted on the DigitalOcean website, the Spaces API aims to be interoperable with Amazon's AWS S3 API, so by changing the endpoint one should be able to use an s3 client with Spaces. For example, Amazon's official aws command-line client works with DigitalOcean Spaces without any problems when one specifies https://nyc3.digitaloceanspaces.com as the endpoint.

I tried the following code:

(ns exploredb.awsconn
  (:require [cognitect.aws.client.api :as aws]
            [cognitect.aws.credentials :as awscreds]))

;; Create a client
(def s3 (aws/client {:api :s3
                     :credentials-provider (awscreds/profile-credentials-provider "dospaces")
                     :endpoint-override {:protocol :https
                                         :hostname "nyc3.digitaloceanspaces.com"}}))

;; Tell the client to let you know when you get the args wrong
(aws/validate-requests s3 true)

(println (aws/invoke s3 {:op :ListBuckets}))

I was expecting this to show the list of buckets. Instead, I get:

#:cognitect.anomalies{:category :cognitect.anomalies/fault, :message "No known endpoint."}

dchelimsky commented 3 years ago

I don't have a DigitalOcean account so I'm not in a good position to debug this, but I tried a couple of quick experiments.

I ran your example with my default credentials and got a 403 with "InvalidAccessKeyId", which means the request is making it to their server. That is different from the behavior you're seeing: the "No known endpoint." message comes from the client before it even tries to send the request.
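
For reference, the two failure modes look different in the result map. A sketch, with values abridged from this thread (the server-side anomaly category depends on the HTTP status):

;; client-side failure: the request is never sent
(aws/invoke s3 {:op :ListBuckets})
;; => #:cognitect.anomalies{:category :cognitect.anomalies/fault,
;;                          :message  "No known endpoint."}

;; server-side failure: the request reached the server and was rejected
;; => {:Error {:Code "InvalidAccessKeyId" ,,,}
;;     :cognitect.anomalies/category ,,,}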

The example in their docs has the region aligned with the subdomain. When I set the region to "nyc3", I got the "No known endpoint." error, so maybe that's where the problem lies. What is the region you have configured?

curioustolearn commented 3 years ago

Thank you @dchelimsky. My region is indeed set to "nyc3". I was using profile-credentials-provider and my ~/.aws/config is


[profile dospaces]
region = nyc3
output = json

However, since you said that you did not get the endpoint error when the region was not set to "nyc3", I wanted to try that out. So I switched to basic-credentials-provider (assuming that it does not use the profile settings in that case). With this, the relevant lines of the code are as below. However, I still get the #:cognitect.anomalies{:category :cognitect.anomalies/fault, :message "No known endpoint."} error.

I am wondering why your request at least hits their server, but mine fails to.


(def s3 (aws/client {:api :s3
                     :credentials-provider (awscreds/basic-credentials-provider
                                            {:access-key-id "accesskeyhere"
                                             :secret-access-key "secretkeyhere"})
                     :endpoint-override {:protocol :https
                                         :hostname "nyc3.digitaloceanspaces.com"}}))
dchelimsky commented 3 years ago

Credentials and region have separate providers, so the region could still be "nyc3" depending on how your env is configured. What is the output of (-> s3 cognitect.aws.client/-get-info :region-provider cognitect.aws.region/fetch)?
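
The same diagnostic as a REPL snippet, using only the vars named above (the result shown is what's reported in the next comment):

;; with the s3 client defined earlier
(require 'cognitect.aws.client 'cognitect.aws.region)

(-> s3
    cognitect.aws.client/-get-info
    :region-provider
    cognitect.aws.region/fetch)
;; => "nyc3"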

curioustolearn commented 3 years ago

> Credentials and region have separate providers, so the region could still be "nyc3" depending on how your env is configured. What is the output of (-> s3 cognitect.aws.client/-get-info :region-provider cognitect.aws.region/fetch)?

The region output from this code snippet was "nyc3".

Since the region was causing the trouble, I kept everything the same but explicitly set the region to "us-east-2", which is an Amazon AWS region and does not exist in DigitalOcean. That worked!! I can see the buckets listed (from DigitalOcean).

This is what I tried, and it worked:

(def s3 (aws/client {:api :s3
                     :region "us-east-2"
                     :credentials-provider (awscreds/basic-credentials-provider
                                            {:access-key-id "accesskeyhere"
                                             :secret-access-key "secretkeyhere"})
                     :endpoint-override {:protocol :https
                                         :hostname "nyc3.digitaloceanspaces.com"}}))

So it appears that the library checks whether the region name is a valid Amazon AWS region even when a different endpoint is specified.

Do you think it is reliable to use the library in this fashion? Is the fix easy on your end?

Thank you so much for helping me with this.

curioustolearn commented 3 years ago

Hi @dchelimsky

Can you please confirm whether the library checks that the region name is a valid Amazon AWS region even when a different endpoint is specified? If yes, will there be an update so that this does not throw an error for non-AWS endpoints? The reason I am asking is that, as I said above, things are working now, but I am not sure whether it is reliable to use the library in this manner (even assuming that the DigitalOcean Spaces API remains S3-compatible). Thank you.

kommen commented 3 years ago

This issue is not limited to DigitalOcean. The same happens with @exoscale's Object Storage service.

(def s3 (aws/client {:api :s3
                     :region "at-vie-1"
                     :endpoint-override {:hostname "sos-at-vie-1.exo.io"}}))
(aws/invoke s3 {:op :ListBuckets})
;; => #:cognitect.anomalies{:category :cognitect.anomalies/fault, :message "No known endpoint."}

However, moving the :region key into the :endpoint-override map makes it work for me:


(def s3 (aws/client {:api :s3
                     :endpoint-override {:hostname "sos-at-vie-1.exo.io"
                                         :region "at-vie-1"}}))
(aws/invoke s3 {:op :ListBuckets})
;; => {:Buckets [{,,,

Tested with:

com.cognitect.aws/api       {:mvn/version "0.8.484"}
com.cognitect.aws/endpoints {:mvn/version "1.1.11.914"}
com.cognitect.aws/s3        {:mvn/version "810.2.801.0"}
dchelimsky commented 3 years ago

@kommen I don't think there is any code that looks at :region within the :endpoint-override map. My guess is that if you leave it out completely it will work the same way, probably because you've got the right region configured somewhere in your environment. Please confirm.
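
One quick way to check: aws-api's default region provider consults, among other sources, the AWS_REGION environment variable, the aws.region Java system property, and the region entry in ~/.aws/config, so any of these could be supplying it:

;; quick REPL checks for the usual region sources
(System/getenv "AWS_REGION")      ; environment variable
(System/getProperty "aws.region") ; Java system property
;; plus the region entry in ~/.aws/config for the active profile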

kommen commented 3 years ago

No; without changing anything other than removing the :region key from the :endpoint-override map, I get:

{:Error {:HostIdAttrs {},
         :Message "The authorization header is malformed; the region 'eu-central-1' is wrong; expecting 'at-vie-1'",
         :CodeAttrs {},
         :RequestIdAttrs {},
         :HostId "<redacted>",
         :MessageAttrs {},
         :RequestId "<redacted>",
         :RegionAttrs {},
         :Region "at-vie-1",
         :Code "AuthorizationHeaderMalformed"},
 :ErrorAttrs {},
 :cognitect.anomalies/category :cognitect.anomalies/incorrect}
kommen commented 3 years ago

@dchelimsky I can provide Exoscale Object Storage credentials in private if you are interested in reproducing.

kommen commented 3 years ago

I tried putting it there because in default-endpoint-provider I saw that the map merged with the resolved endpoint is documented to have a :region key:

https://github.com/cognitect-labs/aws-api/blob/93c6ac08141a168fd430ce3fb043e2b6f62e8bf4/src/cognitect/aws/endpoint.clj#L88-L93

https://github.com/cognitect-labs/aws-api/blob/93c6ac08141a168fd430ce3fb043e2b6f62e8bf4/src/cognitect/aws/endpoint.clj#L68

It was just a guess that turned out to work, but I didn't verify that this is the code path taken.

dchelimsky commented 3 years ago

@kommen thanks for pointing that out. This code looks at the region from the endpoint: https://github.com/cognitect-labs/aws-api/blob/c1f9393cf6399d35b140307abf96e95ebe6c627f/src/cognitect/aws/signers.clj#L144-L145
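
That signing code explains the error above: SigV4's Authorization header embeds a credential scope of the form <date>/<region>/<service>/aws4_request, so a request signed for eu-central-1 against an endpoint that expects at-vie-1 is rejected as AuthorizationHeaderMalformed. A minimal sketch of the scope format (illustrative values, not aws-api's actual implementation):

;; SigV4 credential scope: <date>/<region>/<service>/aws4_request
(defn credential-scope
  [{:keys [date region service]}]
  (str date "/" region "/" service "/aws4_request"))

(credential-scope {:date "20210101" :region "eu-central-1" :service "s3"})
;; => "20210101/eu-central-1/s3/aws4_request"
;; the server compares the region in this scope against the one it expects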

I think the underlying problem here is that there are at least two places that store the region, and they are not aligned. This may take some time to fix (mostly time to think about the right fix), but you can use the undocumented :region key directly on the :endpoint-override map until we've got a real fix. Please do keep in mind that the fix may render your workaround unusable.

dchelimsky commented 3 years ago

@curioustolearn I think @kommen's workaround will work for you as well (for now). Please confirm.
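
Applied to the DigitalOcean client from earlier in the thread, the workaround would presumably look like this (untested here; placeholder credentials as before, and the :region key inside :endpoint-override is undocumented):

(def s3 (aws/client {:api :s3
                     :credentials-provider (awscreds/basic-credentials-provider
                                            {:access-key-id "accesskeyhere"
                                             :secret-access-key "secretkeyhere"})
                     :endpoint-override {:protocol :https
                                         :hostname "nyc3.digitaloceanspaces.com"
                                         :region "nyc3"}}))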

olymk2 commented 2 years ago

Does this workaround still work? I am seeing similar issues, with an "unable to fetch region" error, when specifying the region in :endpoint-override.

olymk2 commented 2 years ago

So it seems the region needs to be in the URL: you need an AWS region in the main map and the DigitalOcean region repeated in :endpoint-override.

This got me up and running with the latest version of this library.

(def host "ams3.digitaloceanspaces.com")

(def s3 (aws/client {:api :s3
                     :region "us-east-2"
                     :endpoint-override {:protocol :https
                                         :hostname host
                                         :region "ams3"}
                     :credentials-provider (credentials/basic-credentials-provider
                                            {:access-key-id access-key
                                             :secret-access-key secret})}))

(aws/invoke s3 {:op :ListBuckets})
piranha commented 2 years ago

I don't know if this needs to be repeated, but it took me some time to understand the solution, so I thought I'd add a description of the problem.

The gist of the problem is default-endpoint-provider, which needs a valid AWS region (like us-east-1 or us-east-2) to return endpoint data, so supplying any valid AWS region there works.

Also, for the Backblaze B2 service I'm using, there is no need to supply :region in :endpoint-override since the region is contained in the URL. I'd say this warrants adding a section on alternative S3-compatible providers to the README. :)
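
Putting the thread together, a consolidated sketch for an arbitrary S3-compatible provider (hostname, regions, and credentials are placeholders; the :region key inside :endpoint-override is undocumented and may change when a real fix lands):

(ns example.s3-compatible
  (:require [cognitect.aws.client.api :as aws]
            [cognitect.aws.credentials :as credentials]))

(def s3 (aws/client {:api :s3
                     ;; any valid AWS region, so default-endpoint-provider can resolve
                     :region "us-east-1"
                     ;; the provider's own region, so requests are signed correctly
                     :endpoint-override {:protocol :https
                                         :hostname "s3.example-provider.com"
                                         :region "provider-region"}
                     :credentials-provider (credentials/basic-credentials-provider
                                            {:access-key-id "ACCESS-KEY"
                                             :secret-access-key "SECRET-KEY"})}))

(aws/invoke s3 {:op :ListBuckets})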