Please don't file two issues in one. Note that `files` is the list of files you want to synchronize, but you specified a directory - this is likely the cause of your second issue (you probably intended `dir("localdir", recursive = TRUE)`). Also, `bucket` is the name of the bucket, not a URL, so the bucket doesn't exist; hence `s3sync` tries to create a bucket with the name "s3://landsat-pds/test", which fails due to a region mismatch (for which `s3sync` should pass through `...`, which it doesn't). `s3sync` currently doesn't support syncing into a different subdirectory of the bucket.
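In other words, under the old documented interface a download would have had to look roughly like this (a sketch: `files` and `bucket` follow the old docs described above, and a `direction = "download"` argument is assumed from the discussion - note the old interface could not target the `test/` subdirectory at all):

```r
library(aws.s3)

# Sketch of a call matching the old documented interface: an explicit vector
# of file names plus a bare bucket name (no "s3://" prefix, no subdirectory).
s3sync(
  files     = dir("localdir", recursive = TRUE),
  bucket    = "landsat-pds",
  direction = "download"   # assumed argument; see the old documentation
)
```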
I think one of us might be misunderstanding. I'm doing a download, not an upload, so it's never creating any buckets. It seems like most of your response thinks I'm doing an upload, or a two-way sync?

Also, it does seem to read all the proper data from S3, so I felt like the `bucket` argument was supplied correctly, though of course the URL is not a "bucket".

Taking a step back, though - to sync two directories, I have to enumerate all the files in the directories myself? Doesn't this defeat the point of syncing? How would I know what all the files are before I sync?

Or maybe this function isn't doing a similar thing to `aws s3 sync`? If not, I wonder whether a different name might be better.
It doesn't matter - the old code created a bucket if it didn't exist, regardless of whether you used download or upload. And like I said, the docs clearly stated that you supply the list of files and a bucket name, not a URL. That doesn't mean it made sense; it's how it was written. I re-wrote it yesterday, so check the new docs and code and feel free to file new issues, but please make sure you file issues against documented behavior, not your expectations. You can also file enhancement requests if you think there is a better way.
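For reference, the rewritten `s3sync()` takes a local path plus a bucket/prefix pair instead of an explicit file list. A minimal sketch, assuming the post-rewrite argument names (`path`, `bucket`, `prefix`, `direction`) - verify against the new docs:

```r
library(aws.s3)

# Hypothetical post-rewrite usage - argument names assumed, check the manual.
s3sync(
  path      = "localdir",      # local directory to mirror
  bucket    = "landsat-pds",   # still a bucket name, not an s3:// URL
  prefix    = "test/",         # remote key prefix ("subdirectory") to sync
  direction = "download"
)
```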
I'm trying to implement the equivalent of a command like `aws s3 sync s3://landsat-pds/test test` using the `aws.s3` package.
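Roughly, the call looked like this (a reconstruction for illustration only - the exact snippet isn't reproduced here; per the discussion above, a directory went in as `files` and the full S3 URL as `bucket`, which is what triggers the behavior described next):

```r
library(aws.s3)

# Hypothetical reconstruction of the failing call (the exact original code is
# not shown in this thread): a directory as `files`, an s3:// URL as `bucket`.
s3sync(
  files     = "localdir",
  bucket    = "s3://landsat-pds/test",
  direction = "download"
)
```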
First issue

The first problem is some kind of protocol issue:
Notice that it's trying to do a `PUT` request - that seems bad, when I'm only trying to download?

This is a publicly available dataset (which I did not create; I found it at https://registry.opendata.aws/landsat-8/), so I think it should work without credentials.
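For comparison, listing and fetching the objects directly only issues GET requests. A rough sketch using `get_bucket()` and `save_object()`; whether anonymous access to the public bucket works without credentials depends on how `aws.s3` resolves credentials:

```r
library(aws.s3)

# Manual one-way download for comparison: list the objects under the prefix,
# then GET each one into localdir/ (no PUT, no bucket creation involved).
objs <- get_bucket(bucket = "landsat-pds", prefix = "test/")
for (o in objs) {
  dest <- file.path("localdir", o$Key)
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  save_object(object = o$Key, bucket = "landsat-pds", file = dest)
}
```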
Second issue
Secondly, when I solve the above issue (by using a private bucket and specifying credentials explicitly), there seem to be path problems:
Notice that it did connect successfully to the bucket and read the data from it, but it's trying to write in a local directory called `ForecastDataSets/`, when I'm trying to sync with the `localdir/` directory.

Workaround
Can do

`system("aws --profile dev-creds s3 sync s3://my-dev-bucket/test/2020-03-08_Test localdir")`

and avoid using the `aws.s3` package for syncing.

Session info: