paws-r / paws

Paws, a package for Amazon Web Services in R
https://www.paws-r-sdk.com

Can't get s3 bucket object when using SSE [minio] [s3 object] [encryption] #718

Closed odysseu closed 9 months ago

odysseu commented 10 months ago

Hi, I am trying to copy a file encrypted with SSE-C to a MinIO S3 bucket and read it back, but I can't find a way to do this with paws' configuration. Can anyone help?

Works in python

When I use boto3 in Python, it works fine:

endpoint_url='https://my.endpoint.fr'
aws_access_key_id='myaccesskey'
aws_secret_access_key='mysecretkey'
bucket_name = 'mybucketname'
object_key = 'myobject/key.txt'
encryption_key = 'thekeyiused32bits'
sse_c_algorithm = 'AES256'

import boto3

s3 = boto3.client('s3',
                  endpoint_url=endpoint_url,
                  aws_access_key_id=aws_access_key_id,
                  aws_secret_access_key=aws_secret_access_key,
                  config=boto3.session.Config(signature_version='s3v4'),
                  #addressing_style='path',
                  region_name='',
                  use_ssl=True)

response = s3.get_object(Bucket=bucket_name, Key=object_key, SSECustomerKey=encryption_key, SSECustomerAlgorithm=sse_c_algorithm)

content = response['Body'].read()
print(content.decode('utf-8'))

which outputs the content of the file 👍

Does not work in R

From @DyfanJones in https://github.com/cloudyr/aws.s3/issues/433#issuecomment-1805612553, I understand it should be pretty easy to do the same in R with paws, but here's what I get:

library(paws)

Sys.setenv("AWS_ACCESS_KEY_ID" = "...",
           "AWS_SECRET_ACCESS_KEY" = "...",
           "AWS_DEFAULT_REGION" = "us-east-1",
           "AWS_S3_ENDPOINT"= "my.endpoint.fr")

access_key <- "..."
secret_key <- "..."
bucket_name <- "projet-test"
object_key <- "myobject/key.txt"
sse_c_key <- "..."
ssec_algorithm <- "AES256"
### 

minio <- paws::s3(
  config = list(
    credentials = list(
      creds = list(
        access_key_id = Sys.getenv("AWS_ACCESS_KEY_ID"),
        secret_access_key = Sys.getenv("AWS_SECRET_ACCESS_KEY"),
        session_token = Sys.getenv("AWS_SESSION_TOKEN")
        )
      ),
    endpoint = paste0("https://", Sys.getenv("AWS_S3_ENDPOINT")),
    region = Sys.getenv("AWS_DEFAULT_REGION")
    )
  ) # OK

minio$list_buckets() # OK

I can't find a way to use the SSE options:

minio$get_object(SSECustomerAlgorithm = "AES256",
                 Bucket="mybucketname", 
                 Key = "myobject/key.txt",
                 SSECustomerKey = paste0("projet-test/myobject/=",sse_c_key)
                 )
# Error in file(what, "rb") : cannot open the connection
# In addition: Warning message:
#   In file(what, "rb") :
#   cannot open file 'projet-test/open_data/=...sseckey...': No such file or directory

I also tried without the bucket file path:

minio$get_object(SSECustomerAlgorithm = "AES256",
                 Bucket="mybucketname",
                 Key = "myobject/key.txt",
                 SSECustomerKey = sse_c_key
                 )
# Error in file(what, "rb") : cannot open the connection
# In addition: Warning message:
#   In file(what, "rb") :
#   cannot open file 'sse_c_key': No such file or directory

However, I get a different error, which I also don't understand, when I write the SSE key to a local file localfile/key and try to put the file myobject/key.txt instead of getting it:


minio$put_object("localfile/hello.txt",
                 Bucket = "mybucketname",
                 Key = "myobject/key.txt",
                 SSECustomerAlgorithm = "AES256",
                 SSECustomerKey = "localfile/key")
# Error: InvalidArgument (HTTP 400). Requests specifying Server Side Encryption with 
# Customer provided keys must provide the client calculated MD5 of the secret key.

I would appreciate any help :)

DyfanJones commented 10 months ago

Hi @odysseu, I believe this is similar to https://github.com/paws-r/paws/issues/611. This functionality is not fully supported in paws yet. Sorry about that; I will look into how to support it in the next release of paws.common.

DyfanJones commented 10 months ago

Reference links for future implementation:

DyfanJones commented 10 months ago

Found out what is going on. It looks like we don't build the SSECustomerKeyMD5. However, if you add it yourself, it works fine:

library(paws)

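# Base64-encoded MD5 digest of the raw bytes in `body`
# (digest::digest() defaults to algo = "md5").
content_md5 <- function(body) {
  hash <- digest::digest(body, serialize = FALSE, raw = TRUE)
  base64enc::base64encode(hash)
}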

KEY <- openssl::rand_bytes(32)
BUCKET <- 'myBucket'

client <- s3(config(credentials(profile = "paws")))
client$put_object(
  Bucket=BUCKET,
  Key='encrypt-key-2',
  Body=charToRaw('foobar'),
  SSECustomerKey= KEY,
  SSECustomerAlgorithm='AES256',
  SSECustomerKeyMD5 = content_md5(KEY)
)
#> $Expiration
#> character(0)
#> 
#> $ETag
#> [1] "\"9ffc7a4fe7d4ffcfa38645707a78eeac\""
#> 
#> $ChecksumCRC32
#> character(0)
#> 
#> $ChecksumCRC32C
#> character(0)
#> 
#> $ChecksumSHA1
#> character(0)
#> 
#> $ChecksumSHA256
#> character(0)
#> 
#> $ServerSideEncryption
#> character(0)
#> 
#> $VersionId
#> character(0)
#> 
#> $SSECustomerAlgorithm
#> [1] "AES256"
#> 
#> $SSECustomerKeyMD5
#> [1] "GY/BEgOsrX+MI2ybGMR7sQ=="
#> 
#> $SSEKMSKeyId
#> character(0)
#> 
#> $SSEKMSEncryptionContext
#> character(0)
#> 
#> $BucketKeyEnabled
#> logical(0)
#> 
#> $RequestCharged
#> character(0)

resp <- client$get_object(
  Bucket=BUCKET,
  Key='encrypt-key-2',
  SSECustomerKey= KEY,
  SSECustomerAlgorithm='AES256',
  SSECustomerKeyMD5 = content_md5(KEY)
)

rawToChar(resp$Body)
#> [1] "foobar"

Created on 2023-12-01 with reprex v2.0.2

DyfanJones commented 10 months ago

I will check out other SDKs to see how they handle this, but I think adding the MD5 builder to the custom s3 methods should fix it.

DyfanJones commented 10 months ago

(Implementation Note) Something to consider: https://github.com/boto/botocore/blob/54a09c7d025181b8221d0046eb6dd6def9ace338/botocore/handlers.py#L287-L294C36

https://github.com/aws/aws-sdk-go/blob/1371ed99dade3fe52505d9bdcc945f7adecf9810/service/s3/sse.go#L62
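
For readers following along, here is a minimal Python sketch of the general idea behind such an "MD5 builder": when a customer key is supplied without its MD5, base64-encode the key and add the base64-encoded MD5 of the raw key bytes. It is an illustration only, not the botocore, aws-sdk-go, or paws.common code. The function name `fill_sse_customer_md5` and the parameter values are hypothetical.

```python
import base64
import hashlib


def fill_sse_customer_md5(params, prefix="SSECustomer"):
    """Return a copy of params with the SSE-C key base64-encoded and its MD5 added.

    Hypothetical helper for illustration; not part of boto3, botocore, or paws.
    """
    key_member = prefix + "Key"
    md5_member = prefix + "KeyMD5"
    # Nothing to do when no key is given or the caller already supplied the MD5.
    if key_member not in params or md5_member in params:
        return dict(params)

    key = params[key_member]
    if isinstance(key, str):
        key = key.encode("utf-8")

    filled = dict(params)
    # The key travels as base64 of the raw key bytes.
    filled[key_member] = base64.b64encode(key).decode("ascii")
    # The MD5 is taken over the raw key bytes (not the base64 text), then base64-encoded.
    filled[md5_member] = base64.b64encode(hashlib.md5(key).digest()).decode("ascii")
    return filled


params = fill_sse_customer_md5(
    {
        "Bucket": "mybucketname",
        "Key": "myobject/key.txt",
        "SSECustomerAlgorithm": "AES256",
        "SSECustomerKey": bytes(range(32)),  # placeholder 32-byte key, not a real secret
    }
)
print(sorted(params))
```

Leaving a caller-supplied MD5 untouched is a design choice in this sketch, so that the explicit `SSECustomerKeyMD5` workaround shown earlier in the thread would keep working unchanged.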

DyfanJones commented 10 months ago

Hi @odysseu, I believe I have fixed this issue, please feel free to try out the dev version:

remotes::install_github("DyfanJones/paws/paws.common", ref = "sse_md5")
library(paws)

KEY <- openssl::rand_bytes(32)
BUCKET <- 'mybucket'

client <- s3(config(credentials(profile = "paws")))
resp1 <- client$put_object(
  Bucket=BUCKET,
  Key='encrypt-key-1',
  Body=charToRaw('foobar'),
  SSECustomerKey= KEY,
  SSECustomerAlgorithm='AES256'
)

resp2 <- client$get_object(
  Bucket=BUCKET,
  Key='encrypt-key-1',
  SSECustomerKey=KEY,
  SSECustomerAlgorithm='AES256'
)
resp2$Body |> rawToChar()
#> [1] "foobar"

# saving key to file for later use (writeBin keeps the raw bytes intact,
# including any 0x00 byte, which rawToChar() would reject):
temp_file <- tempfile()
writeBin(KEY, temp_file)

resp3 <- client$put_object(
  Bucket=BUCKET,
  Key='encrypt-key-2',
  Body=charToRaw('did it work?'),
  SSECustomerKey=readBin(temp_file, "raw", n = file.size(temp_file)),
  SSECustomerAlgorithm='AES256'
)

resp4 <- client$get_object(
  Bucket=BUCKET,
  Key='encrypt-key-2',
  SSECustomerKey=readBin(temp_file, "raw", n = file.size(temp_file)),
  SSECustomerAlgorithm='AES256'
)
resp4$Body |> rawToChar()
#> [1] "did it work?"

Created on 2023-12-01 with reprex v2.0.2
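
A note on storing the key: a random SSE-C key can contain any byte value, so it has to round-trip through the file byte for byte. Below is a minimal sketch of that round trip in Python (the language of the boto3 example earlier in the thread), using only the standard library. The temporary file location and variable names are illustrative assumptions.

```python
import os
import secrets
import tempfile

# 32 random bytes, i.e. a 256-bit key as AES256 expects.
key = secrets.token_bytes(32)

# Write in binary mode so no byte is re-encoded or dropped.
fd, key_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as fh:
    fh.write(key)

# Read back in binary mode and confirm the key survived unchanged.
with open(key_path, "rb") as fh:
    restored = fh.read()

round_trip_ok = restored == key and len(restored) == 32
print("key round-trips intact:", round_trip_ok)

# Clean up the illustrative temp file.
os.remove(key_path)
```

In a real setup the key file would live somewhere persistent and access-restricted rather than in a temp directory.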

DyfanJones commented 9 months ago

Closing this ticket, as paws.common 0.7.0 has been released to CRAN.