paws-r / paws

Paws, a package for Amazon Web Services in R
https://www.paws-r-sdk.com

problem with s3 delete_objects #597

Closed dleopold closed 12 months ago

dleopold commented 1 year ago

I am unable to use delete_objects to batch delete objects from an s3 bucket. Copying the example from the documentation exactly:

svc$delete_objects(
  Bucket = "...",
  Delete = list(
    Objects = list(
      list(
        Key = "..."
      ),
      list(
        Key = "..."
      )
    ),
    Quiet = FALSE
  )
)

returns an error:

Error: MalformedXML (HTTP 400). The XML you provided was not well-formed or did not validate against our published schema

I have the correct permissions and can delete objects individually using delete_object.

Also, FYI, I noticed that when the bucket is in a different region, both delete_object and delete_objects fail with an unhelpful error message:

Error in enc2utf8(data) : argument is not a character vector

It took me a while to figure that one out.
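For anyone hitting the same cross-region error, one possible workaround is to create the client for the bucket's own region. The following is only a sketch: it assumes the region can be read from the `LocationConstraint` field returned by `get_bucket_location` (S3 reports an empty value for us-east-1 and the legacy value "EU" for eu-west-1), and the helper name `region_from_location` is hypothetical, not part of paws.

```r
# Hypothetical helper (not part of paws): map an S3 LocationConstraint
# value to a region name that can be passed to the client config.
region_from_location <- function(location_constraint) {
  # S3 returns an empty LocationConstraint for buckets in us-east-1.
  if (is.null(location_constraint) || length(location_constraint) == 0 ||
      !nzchar(location_constraint)) {
    return("us-east-1")
  }
  # "EU" is a legacy alias for eu-west-1.
  if (identical(location_constraint, "EU")) {
    return("eu-west-1")
  }
  location_constraint
}

# Usage (needs AWS credentials, so it is left commented out):
# loc <- paws.storage::s3()$get_bucket_location(Bucket = "mybucket")
# svc <- paws.storage::s3(
#   config = list(region = region_from_location(loc$LocationConstraint))
# )
# svc$delete_object(Bucket = "mybucket", Key = "removable.txt")
```

If the region is already known, it can be passed straight to `config = list(region = ...)` without the lookup.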

Here is my current session info (though I have also tried with the dev version of paws):

> sessionInfo()
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Pop!_OS 22.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] furrr_0.3.1     future_1.29.0   forcats_0.5.2   stringr_1.5.0   dplyr_1.0.10    purrr_1.0.1     readr_2.1.3    
 [8] tidyr_1.2.1     tibble_3.1.8    ggplot2_3.4.0   tidyverse_1.3.2 paws_0.1.12    

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9          lubridate_1.9.0     listenv_0.8.0       assertthat_0.2.1    digest_0.6.31       utf8_1.2.2         
 [7] parallelly_1.33.0   R6_2.5.1            paws.common_0.5.3   cellranger_1.1.0    backports_1.4.1     reprex_2.0.2       
[13] httr_1.4.4          pillar_1.8.1        rlang_1.0.6         googlesheets4_1.0.1 curl_5.0.0          readxl_1.4.1       
[19] rstudioapi_0.14     googledrive_2.0.0   munsell_0.5.0       broom_1.0.1         compiler_4.2.2      modelr_0.1.10      
[25] base64enc_0.1-3     pkgconfig_2.0.3     globals_0.16.2      tidyselect_1.2.0    codetools_0.2-18    fansi_1.0.3        
[31] crayon_1.5.2        tzdb_0.3.0          dbplyr_2.2.1        withr_2.5.0         grid_4.2.2          jsonlite_1.8.4     
[37] gtable_0.3.1        lifecycle_1.0.3     DBI_1.1.3           magrittr_2.0.3      scales_1.2.1        cli_3.6.0          
[43] stringi_1.7.8       fs_1.5.2            xml2_1.3.3          ellipsis_0.3.2      generics_0.1.3      vctrs_0.5.1        
[49] tools_4.2.2         glue_1.6.2          paws.storage_0.1.12 hms_1.1.2           rsconnect_0.8.28    parallel_4.2.2     
[55] timechange_0.1.1    colorspace_2.0-3    gargle_1.2.1        rvest_1.0.3         haven_2.5.1

DyfanJones commented 1 year ago

Hi @dleopold, sorry about that. Have you tried deleting the objects without the Quiet parameter?

svc$delete_objects(
  Bucket = "...",
  Delete = list(
    Objects = list(
      list(
        Key = "..."
      ),
      list(
        Key = "..."
      )
    )
  )
)

Here is an example of it working in the upcoming s3fs package: https://github.com/DyfanJones/s3fs/blob/main/R/s3filesystem_class.R#L349-L362

Interesting, I didn't realise this:

Also, FYI, I noticed that when the bucket is in a different region, both delete_object and delete_objects fail with an unhelpful error message:

Error in enc2utf8(data) : argument is not a character vector

I will have to look into why this is the case 🤔

dleopold commented 1 year ago

It does seem to work with the Quiet parameter removed. Not sure how I failed to try that. Thank you. I will close the issue since it is resolved for me, though maybe a quick update to the docs would prevent others from running into the same issue. Also, s3fs looks promising!
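For readers on an affected version, the workaround above can be wrapped in a small helper that builds the `Delete` argument from a character vector of keys and leaves `Quiet` out unless it is asked for. This is a sketch only; `build_delete_arg` is a hypothetical name, not a paws function.

```r
# Hypothetical helper (not part of paws): build the `Delete` argument for
# delete_objects(). With quiet = NULL the Quiet element is omitted entirely,
# which is the form reported to work in this thread.
build_delete_arg <- function(keys, quiet = NULL) {
  stopifnot(is.character(keys), length(keys) > 0)
  delete <- list(
    Objects = lapply(unname(keys), function(k) list(Key = k))
  )
  if (!is.null(quiet)) {
    delete$Quiet <- isTRUE(quiet)
  }
  delete
}

# Usage (needs AWS credentials, so it is left commented out):
# svc <- paws.storage::s3()
# svc$delete_objects(
#   Bucket = "mybucket",
#   Delete = build_delete_arg(c("a.txt", "b.txt"))
# )
```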

wlandau commented 1 year ago

I am hitting this issue too (working on https://github.com/ropensci/targets/issues/1171) and it would be great to get Quiet = TRUE to work. Quiet mode sends a smaller HTTP response, so it could perform faster.
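As a rough illustration of the bulk-delete use case, the sketch below splits a long vector of keys into batches before calling `delete_objects` with `Quiet = TRUE`. It assumes a paws.common version in which `Quiet = TRUE` is accepted, and it relies on the S3 limit of 1000 keys per DeleteObjects request, which is not mentioned in this thread. The helper name `split_keys` is hypothetical.

```r
# Hypothetical helper (not part of paws): split keys into batches of at
# most `size` keys. S3 accepts up to 1000 keys per DeleteObjects request.
split_keys <- function(keys, size = 1000L) {
  stopifnot(is.character(keys), size >= 1)
  if (length(keys) == 0) {
    return(list())
  }
  # Assign each key a batch number, then drop the batch names.
  unname(split(keys, ceiling(seq_along(keys) / size)))
}

# Usage (needs AWS credentials, so it is left commented out):
# svc <- paws.storage::s3()
# for (batch in split_keys(keys)) {
#   svc$delete_objects(
#     Bucket = "mybucket",
#     Delete = list(
#       Objects = lapply(batch, function(k) list(Key = k)),
#       Quiet = TRUE
#     )
#   )
# }
```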

DyfanJones commented 12 months ago

Happy to re-open this ticket.

DyfanJones commented 12 months ago

Ok I think I have fixed it:

remotes::install_github("dyfanjones/paws/paws.common", ref = "xml_build_flatten")
client <- paws.storage::s3()

bucket <- "mybucket"
key <- "removable.txt"

resp <- client$put_object(
  Bucket = bucket,
  Key = key,
  Body = charToRaw("dummy")
)

client$delete_objects(
  Bucket = bucket,
  Delete = list(
    Objects = list(
      list(Key = key)
    ),
    Quiet = TRUE
  )
)
#> $Deleted
#> list()
#> 
#> $RequestCharged
#> character(0)
#> 
#> $Errors
#> list()

Created on 2023-11-08 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31)
#>  os       macOS Sonoma 14.0
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/London
#>  date     2023-11-08
#>  pandoc   3.1.9 @ /opt/homebrew/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  base64enc      0.1-3      2015-07-28 [1] CRAN (R 4.3.0)
#>  cli            3.6.1      2023-03-23 [1] CRAN (R 4.3.0)
#>  crayon         1.5.2      2022-09-29 [1] CRAN (R 4.3.0)
#>  curl           5.1.0      2023-10-02 [1] CRAN (R 4.3.1)
#>  digest         0.6.33     2023-07-07 [1] CRAN (R 4.3.0)
#>  evaluate       0.23       2023-11-01 [1] CRAN (R 4.3.1)
#>  fastmap        1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
#>  fs             1.6.3      2023-07-20 [1] CRAN (R 4.3.0)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
#>  htmltools      0.5.7      2023-11-03 [1] CRAN (R 4.3.1)
#>  httr           1.4.7      2023-08-15 [1] CRAN (R 4.3.0)
#>  knitr          1.45       2023-10-30 [1] CRAN (R 4.3.1)
#>  lifecycle      1.0.3      2022-10-07 [1] CRAN (R 4.3.0)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  mime           0.12       2021-09-28 [1] CRAN (R 4.3.0)
#>  paws.common    0.6.3.9000 2023-11-08 [1] Github (dyfanjones/paws@a7aaf4d)
#>  paws.storage   0.5.0      2023-11-02 [1] local (/Users/dyfanjones/Documents/Packages/paws/cran/paws.storage)
#>  purrr          1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo           1.25.0     2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils        2.12.2     2022-11-11 [1] CRAN (R 4.3.0)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp           1.0.11     2023-07-06 [1] CRAN (R 4.3.0)
#>  reprex         2.0.2      2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang          1.1.1      2023-04-28 [1] CRAN (R 4.3.0)
#>  rmarkdown      2.25       2023-09-18 [1] CRAN (R 4.3.1)
#>  rstudioapi     0.15.0     2023-07-07 [1] CRAN (R 4.3.0)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  styler         1.10.2     2023-08-29 [1] CRAN (R 4.3.0)
#>  vctrs          0.6.4      2023-10-12 [1] CRAN (R 4.3.1)
#>  withr          2.5.2      2023-10-30 [1] CRAN (R 4.3.1)
#>  xfun           0.41       2023-11-01 [1] CRAN (R 4.3.1)
#>  xml2           1.3.5      2023-07-06 [1] CRAN (R 4.3.0)
#>  yaml           2.3.7      2023-01-23 [1] CRAN (R 4.3.0)
#>
#>  [1] /Users/dyfanjones/Library/R/arm64/4.3/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

@wlandau @dleopold Please try it out and let me know :)

wlandau commented 12 months ago

Thanks so much, @DyfanJones! Works for me now!

dleopold commented 12 months ago

Works here. Thanks.

DyfanJones commented 12 months ago

paws.common 0.6.4 has been released to CRAN.
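A quick way to check whether an installation is new enough is to compare the installed version against 0.6.4. This sketch assumes that 0.6.4 is the release carrying the Quiet fix, as the comment above suggests; `has_quiet_fix` is a hypothetical helper name.

```r
# Hypothetical helper (not part of paws): TRUE when the given paws.common
# version is 0.6.4 or later. The default looks up the installed version and
# errors if paws.common is not installed.
has_quiet_fix <- function(version = utils::packageVersion("paws.common")) {
  package_version(as.character(version)) >= package_version("0.6.4")
}

# Usage:
# if (!has_quiet_fix()) install.packages("paws.common")
```

Development builds such as 0.6.3.9000 compare as older than 0.6.4, so a branch install made before the release would be flagged for an update even if it already contains the fix.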