brokamp-group / s3

More efficient downloading and usage of files hosted on AWS S3
http://brokamp-group.github.io/s3/
Other
3 stars 2 forks source link

better checking for file existance, access to file, etc; closes #11 #13

Closed erikarasnick closed 3 years ago

erikarasnick commented 3 years ago

accidentally commited some changes to master before this

cole-brokamp commented 3 years ago

I like the s3_check_file function, but I've separated it into two different functions so that we can use them more easily within the other functions. They each return a logical, but will also emit messages if wanted too.

I haven't yet updated the s3_get_files function to utilize these functions (but I did update s3_get. Wanted to get it up here in case you want to work on it. I want to work on it later today and hopefully will get back to it soon.

cole-brokamp commented 3 years ago

Also, I'm still having problems with testing for the private downloads 🤷‍♂️ . We will get back to that soon.

erikarasnick commented 3 years ago

If I have fake credentials set, I can't even download a public file. When I run boto$client("s3")$head_object, I get this error for both public and private:

Error in py_call_impl(callable, dots$args, dots$keywords) : ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Detailed traceback: File "/Users/RASV5G/Library/r-miniconda/envs/r-s3/lib/python3.9/site-packages/botocore/client.py", line 357, in _api_call return self._make_api_call(operation_name, kwargs) File "/Users/RASV5G/Library/r-miniconda/envs/r-s3/lib/python3.9/site-packages/botocore/client.py", line 676, in _make_api_call raise error_class(parsed_response, operation_name)

I don't think boto head_object function works at all if you don't have credentials.

cole-brokamp commented 3 years ago

You're right. We should never be using the boto methods without credentials. I think public files still require valid aws credentials to download through boto, so we use httr stuff instead. I think it would be right to get an error when trying to download a public file if you have invalid credentials.

We could write a fallback so that if a file errors to download when using credentials, it also attempts to download it via the public / "httr" stuff, but I'm not sure how often that case would come up.

erikarasnick commented 3 years ago

Oh, right. Got too deep in checking scenarios haha. I was mostly thinking if a user was trying to download private files but their aws account hadn't been given access to the files for some reason.

I also don't get an error now when I run the last test with the "fake" credentials.

cole-brokamp commented 3 years ago

That's a good point. I'm not sure if there is a way to differentiate between "this file doesn't exist" and "you don't have permission to access this file". We might have to be very careful when we do use private data in a DeGAUSS container by sending the user a message saying that it could fail if they don't have proper AWS permissions set.

erikarasnick commented 3 years ago

That would work. Another maybe not very probable scenario... Do you think we would ever have the case where someone without credentials would be trying to download multiple files with s3_get_files and some were private and some not?

cole-brokamp commented 3 years ago

What command do you use to render the README.Rmd into a README.md? I'm getting some weird output with the cli icons when I try to do this.

erikarasnick commented 3 years ago

I usually just use the knit button in Rstudio.

cole-brokamp commented 3 years ago

I'm not in R Studio, but its showing weird unicode characters for me when I try rmarkdown::render() or knitr::knit(). Can you push up the rendered README here and I will merge and do a new release?