cloudyr / aws.s3

Amazon Simple Storage Service (S3) API Client
https://cloud.r-project.org/package=aws.s3

parse_aws_s3_response: print(out) only if isTRUE(verbose) #395

Open cstepper opened 3 years ago

cstepper commented 3 years ago

Hi,

thanks for this great package!

I ran into the following issue when bulk-downloading data from a bucket with save_object() when some of the keys do not exist (related to #228 and #306).

Because parse_aws_s3_response() currently calls print() to write the HTTP response to the terminal whenever an HTTP error occurs, options such as try(..., silent = TRUE) (used in the example below) do not suppress it.

When trying to download a lot of objects, the terminal is cluttered with all the printed statements.

library(aws.s3)
library(dplyr)

bucket = "dataforgood-fb-data"
prefix = "csv/month=2019-06/country=BEN"

content = aws.s3::get_bucket_df(bucket, prefix = prefix)
content = content %>% 
  mutate(
    Size = R.utils::hsize(as.numeric(Size)),
    Path = file.path(tempdir(), "BEN", basename(Key))
  ) %>% 
  select(Key, Size, Bucket, Path)

# keep only the first two entries, then break the 2nd key
content_sub = content %>% 
  slice_head(n = 2)

content_sub$Key[2] = file.path(dirname(content_sub$Key[2]), "abc.csv.gz")

content_sub
#>                                                                                     Key    Size              Bucket                                               Path
#> 1 csv/month=2019-06/country=BEN/type=children_under_five/BEN_children_under_five.csv.gz 6.3 MiB dataforgood-fb-data /tmp/Rtmp1z0wTX/BEN/BEN_children_under_five.csv.gz
#> 2                         csv/month=2019-06/country=BEN/type=elderly_60_plus/abc.csv.gz 6.3 MiB dataforgood-fb-data     /tmp/Rtmp1z0wTX/BEN/BEN_elderly_60_plus.csv.gz

# save objects (try)
res = purrr::map2(
  .x = content_sub$Key, 
  .y = content_sub$Path, 
  .f = ~try(
    aws.s3::save_object(object = .x, bucket = bucket, file = .y),
    silent = TRUE
  )
)
#> List of 5
#>  $ Code     : chr "NoSuchKey"
#>  $ Message  : chr "The specified key does not exist."
#>  $ Key      : chr "csv/month=2019-06/country=BEN/type=elderly_60_plus/abc.csv.gz"
#>  $ RequestId: chr "JBWBAGGY2235FR52"
#>  $ HostId   : chr "VzlvI8tyT0dMDhzCyQhnepo3aGQ3KKSc/MELh4fvAkVnGaDVgbLuxI1f8MRUdFmw9bmqLLDOxx0="
#>  - attr(*, "headers")=List of 6
#>   ..$ x-amz-request-id : chr "JBWBAGGY2235FR52"
#>   ..$ x-amz-id-2       : chr "VzlvI8tyT0dMDhzCyQhnepo3aGQ3KKSc/MELh4fvAkVnGaDVgbLuxI1f8MRUdFmw9bmqLLDOxx0="
#>   ..$ content-type     : chr "application/xml"
#>   ..$ transfer-encoding: chr "chunked"
#>   ..$ date             : chr "Mon, 13 Sep 2021 09:42:26 GMT"
#>   ..$ server           : chr "AmazonS3"
#>   ..- attr(*, "class")= chr [1:2] "insensitive" "list"
#>  - attr(*, "class")= chr "aws_error"
#>  - attr(*, "request_canonical")= chr "GET\n/dataforgood-fb-data/csv/month%3D2019-06/country%3DBEN/type%3Delderly_60_plus/abc.csv.gz\n\nhost:s3.amazon"| __truncated__
#>  - attr(*, "request_string_to_sign")= chr "AWS4-HMAC-SHA256\n20210913T094227Z\n20210913/us-east-1/s3/aws4_request\n38938aaacd4e153aa3d60ad428825a535958c7b"| __truncated__
#>  - attr(*, "request_signature")= chr "AWS4-HMAC-SHA256 Credential=AKIA3ATRX55B3CXRVM35/20210913/us-east-1/s3/aws4_request,SignedHeaders=host;x-amz-da"| __truncated__
#> NULL

Created on 2021-09-13 by the reprex package (v2.0.1)

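One option on the user side would be to wrap the call, e.g. capture.output(aws.s3::save_object(object = key, bucket = bucket), file = 'NUL'), but then the call returns the result of capture.output() (a character vector, or NULL when file is given) instead of the value of save_object(). Also, 'NUL' is the Windows null device; nullfile() is the portable equivalent in R >= 3.6.0.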
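A user-side workaround that keeps the return value could look roughly like the sketch below. It assumes R >= 3.6.0 for nullfile(). quietly_value() and noisy_download() are hypothetical names invented for illustration; they are not part of aws.s3, and noisy_download() only imitates a function that prints a diagnostic before returning.

```r
# Hypothetical helper (not part of aws.s3): evaluate an expression while
# discarding everything it prints to stdout, and return its value unchanged.
quietly_value <- function(expr) {
  sink(nullfile())               # divert stdout to the null device
  on.exit(sink(), add = TRUE)    # restore stdout, even if expr fails
  force(expr)                    # evaluate the lazily passed expression
}

# Stand-in for a call that prints before returning (assumption for the demo).
noisy_download <- function(path) {
  print("diagnostic output")
  path
}

res <- quietly_value(noisy_download("BEN/abc.csv.gz"))
res
```

Because the diversion is undone in on.exit(), an error raised inside the expression still propagates, so the helper can be combined with try(..., silent = TRUE) as in the reprex above.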

What about calling print(out) only if isTRUE(verbose)?
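To make the proposal concrete, here is a minimal sketch of the idea. It is a simplified stand-in and not the actual aws.s3 source: the function name report_s3_error(), its verbose argument and default, and the shape of the error object are assumptions made for illustration.

```r
# Simplified stand-in for the error branch (NOT the real package code):
# print the parsed error only when verbose is TRUE, then signal the error.
report_s3_error <- function(out, verbose = getOption("verbose", FALSE)) {
  if (isTRUE(verbose)) {
    print(out)
  }
  stop(paste0(out$Code, ": ", out$Message), call. = FALSE)
}

# Hypothetical parsed error, mirroring the fields shown in the reprex above.
err <- list(
  Code    = "NoSuchKey",
  Message = "The specified key does not exist."
)

# With verbose = FALSE nothing is printed, and try() still captures the error.
res <- try(report_s3_error(err, verbose = FALSE), silent = TRUE)
inherits(res, "try-error")
```

With this pattern, users who want the full response can still opt in via verbose = TRUE, while bulk downloads wrapped in try(..., silent = TRUE) stay quiet.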