cloudyr / aws.s3

Amazon Simple Storage Service (S3) API Client
https://cloud.r-project.org/package=aws.s3

get_bucket() pagination not working with DigitalOcean Spaces #393

Open pieterprovoost opened 3 years ago

pieterprovoost commented 3 years ago

get_bucket() does not paginate correctly with DigitalOcean Spaces when the listing contains more than 1000 objects (around 4000 in my case). Setting max to a value above 1000 returns duplicate objects, and setting max to Inf results in an infinite loop.

library("aws.s3")

## reproduces the problem: max above 1000 returns duplicate objects
objects <- get_bucket(
  bucket = "my-bucket",
  prefix = "pipeline",
  max = 10000
)
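
A possible workaround, sketched under assumptions: page through the listing manually with the marker argument of get_bucket(), drop any keys already seen, and stop as soon as a page adds nothing new. That guards against both symptoms described above (duplicates and the endless loop). The names list_all_keys, fetch_page, and spaces_page are hypothetical helpers, not part of aws.s3, and the issue does not confirm whether Spaces honors marker, so treat this as an illustration rather than a verified fix. Endpoint and credential configuration is assumed to be the same as in the report.

```r
# Collect object keys page by page.
# fetch_page(marker, page_size) must return a character vector of keys
# that sort after `marker` (or from the start when marker is NULL).
list_all_keys <- function(fetch_page, page_size = 1000) {
  keys <- character(0)
  marker <- NULL
  repeat {
    page <- fetch_page(marker, page_size)
    new_keys <- setdiff(page, keys)       # ignore keys already collected
    if (length(new_keys) == 0) break      # no progress: stop instead of looping forever
    keys <- c(keys, new_keys)
    if (length(page) < page_size) break   # short page: end of the listing
    marker <- page[length(page)]          # continue after the last key seen
  }
  keys
}

# Hypothetical page fetcher built on aws.s3::get_bucket().
# Bucket and prefix are the placeholders from the report.
spaces_page <- function(marker, page_size) {
  objs <- aws.s3::get_bucket(
    bucket = "my-bucket",
    prefix = "pipeline",
    max = page_size,
    marker = marker
  )
  vapply(objs, function(o) o$Key, character(1), USE.NAMES = FALSE)
}
```

Calling list_all_keys(spaces_page) would then return the de-duplicated keys. If the endpoint ignores marker and keeps returning the first page, the loop ends after that page instead of running indefinitely, which at least makes the failure visible.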

## session info for your system
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS  10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] aws.s3_0.3.22

loaded via a namespace (and not attached):
[1] httr_1.4.2          compiler_4.0.2      R6_2.5.0            tools_4.0.2         base64enc_0.1-3     curl_4.3            aws.signature_0.6.0 xml2_1.3.2          digest_0.6.27