paws-r / paws

Paws, a package for Amazon Web Services in R
https://www.paws-r-sdk.com
Other
315 stars 37 forks source link

Potential issue with AWS default S3 region of us-east-1 #788

Closed payam-delfi closed 2 months ago

payam-delfi commented 4 months ago

Hello! First off, thank you for this amazing package. It has such a nice interface and very well documented.

I had a question and a potential bug.

Question: I didn't see this in the docs, but is it true that the paws package itself functions from its sub packages? For example, is paws::s3() calling paws.storage::s3()?

Potential bug: When region = us-east-1 is set in the ~/.aws/config file, there seems to be an issue with running this:

s3 <- paws::s3()

s3_object <- s3$get_object(Bucket = "my-bucket", Key = "some/key")

Which gives this cryptic error:

Error in eval(ei, envir) : 
  Expecting a single string value: [type=list; extent=1].

The is the same error we'd get if the region was not set in the config file or through environment variable.

But when I change the region to anything else, like us-east-2, it works just fine. Note that the bucket I'm trying to read is on us-west-1.

Looking at the logs with options(paws.log_level = 3L), I see this difference:

When region = us-east-1 in config file:

-> Host: portal.sso.us-west-2.amazonaws.com
...
-> Host: my-bucket.s3.amazonaws.com             # <------------ Note there is no region here
...
-> 
INFO [2024-06-04 09:21:23.233]: <- HTTP/1.1 400 Bad Request

When region is something else in config file:

-> Host: portal.sso.us-west-2.amazonaws.com
...
-> Host: my-bucket.s3.us-west-1.amazonaws.com   # <------------ But the bucket region is present here
...
-> 
INFO [2024-06-04 10:26:31.292]: <- HTTP/1.1 200 OK
DyfanJones commented 4 months ago

Hi @payam-delfi

Question: I didn't see this in the docs, but is it true that the paws package itself functions from its sub packages? For example, is paws::s3() calling paws.storage::s3()?

Yes, paws is created from it's category packages: paws.analytics, paws.application.integration, paws.compute, paws.cost.management, paws.customer.engagement, paws.database, paws.developer.tools, paws.end.user.computing, paws.machine.learning, paws.management, paws.networking, paws.security.identity, paws.storage. It acts like a meta package and does exactly what you say: paws::s3 -> paws.storage::s3

It looks like we have a potential issue when we do some re-directly.

Note: region = us-east-1 does indeed create the host my-bucket.s3.amazonaws.com. This aligns to aws endpoints:

endpoints = list(
  "us-gov-west-1" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "us-west-1" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "us-west-2" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "eu-west-1" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "ap-southeast-1" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "ap-southeast-2" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "ap-northeast-1" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "sa-east-1" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "us-east-1" = list(endpoint = "s3.amazonaws.com", global = FALSE),
  "*" = list(endpoint = "s3.{region}.amazonaws.com", global = FALSE),
  "cn-*" = list(endpoint = "s3.{region}.amazonaws.com.cn", global = FALSE),
  "eu-isoe-*" = list(endpoint = "s3.{region}.cloud.adc-e.uk", global = FALSE),
  "us-iso-*" = list(endpoint = "s3.{region}.c2s.ic.gov", global = FALSE),
  "us-isob-*" = list(endpoint = "s3.{region}.sc2s.sgov.gov", global = FALSE),
  "us-isof-*" = list(endpoint = "s3.{region}.csp.hci.ic.gov", global = FALSE)
)
payam-delfi commented 4 months ago

Thanks for explaining that. Right now this is my workaround:

s3 <- paws.storage::s3()

# If default region is set to "us-east-1" or is empty, we need to re-initialize the
# S3 service and provide a region. This step is necessary or else the `get_object()` call
# gives us this error: Expecting a single string value: [type=list; extent=1].
if (get_region() == "us-east-1" || get_region() == "") {
  bucket_region <- s3$get_bucket_location("my-example-bucket")[["LocationConstraint"]]
  s3 <- paws.storage::s3(config = list(region = bucket_region))
}

The get_region() being:

get_region <- function() {
  paws.common::locate_credentials()[["region"]]
}
tyner commented 3 months ago

I wondered about this as well!

DyfanJones commented 3 months ago

Currently on holiday really sorry about my slow replies. Will investigate this properly when I get back. However in the meantime please feel free to raise any PR if you believe you have a solution for this. I do appreciate PRs as paws is a beast of a SDK package.

payam-delfi commented 3 months ago

No worries at all, enjoy your time off.

The workaround I posted above works pretty well for us, if other want to try it out for now.

I haven't dug deep into the code base, but I will try to take a look, see if I can make a PR.

DyfanJones commented 3 months ago

@payam-delfi just finished my holiday. What paws.common version do you have?

payam-delfi commented 3 months ago

Welcome back!

I installed the 3 packages below individually:

DyfanJones commented 3 months ago

@payam-delfi I believe I have a fix for this issue. Please try the dev version and let me know.

remotes::install_github("dyfanjones/paws/paws.common", ref = "fix-redirect")
DyfanJones commented 3 months ago

I will leave this ticket open until paws.common 0.7.4 is released to the cran

payam-delfi commented 3 months ago

Thank you very much! I can confirm it's working now.

Really appreciate your help and support on this package. Amazing work.

DyfanJones commented 2 months ago

Closing as paws.common 0.7.4 has been released to the cran