Open FlorisCalkoen opened 9 months ago
@FlorisCalkoen I just made a custom Downloader like this, but for Google Cloud Storage. If it's useful to you, I linked it in a comment on a similar Issue thread (#363).
Hi @remrama @FlorisCalkoen @WesleyTheGeolien, I have zero experience with cloud containers, but since multiple people have requested this, we can look into it.
As @remrama said, this would be best implemented as a downloader. It could take the token as input but could also take a name of an environment variable and do the reading for you.
From what I gather, each cloud would have their own API for fetching the data so they'd need separate implementations. Since Pooch is supposed to be a very lightweight dependency for other projects, any downloader that requires a new dependency would have to make that dependency optional. We already do this for SFTP for example.
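To make the optional-dependency idea concrete, here is a minimal sketch of one way to do it. The helper name `require_optional` is made up for illustration and is not part of Pooch; the point is only that the third-party SDK gets imported when the downloader is actually used, so installing Pooch alone never pulls it in.

```python
import importlib


def require_optional(module_name, feature):
    """
    Import an optional dependency by name, or fail with a helpful message.

    'require_optional' is a hypothetical helper, not an existing Pooch function.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package ends up here as well.
        raise ImportError(
            f"Module '{module_name}' is required for {feature}. "
            "Please install it to use this downloader."
        ) from error
```

A cloud downloader could call something like `require_optional("azure.storage.blob", "Azure downloads")` inside its `__init__` or `__call__`, so the error only shows up for people who actually try to use that downloader.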
I'll edit this issue and #363 to make them explicitly about AWS and Azure. @remrama would you mind opening a new one for Google Cloud Storage and including the link to your code?
If either of you would like to implement this, that would be great! We'd need:
- A new downloader class (`GCSDownloader`, `AWSDownloader`, `AzureDownloader`) in `pooch/downloaders.py` (see https://www.fatiando.org/pooch/latest/downloaders.html and the existing downloaders). Make sure to add it to the `choose_downloader` function so that Pooch can automatically find it based on the prefix (`az:` etc).
- Some test data from the `data` folder uploaded to the storage so we can test that it works.
- Tests in `pooch/tests/test_downloaders.py` that check if the download works and that any errors that should be raised are actually raised.

Not sure what the pricing model is for these providers (which is why I never bothered with them), but if it's not possible to host our test data on them so that we can verify the functionality, then I think it's best to leave the downloader outside of Pooch itself.
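For the prefix-based lookup mentioned above, a rough standalone sketch could look like the following. This is not the actual `choose_downloader` code from Pooch. The classes are empty placeholders, and the `s3` and `gs` prefixes are assumptions (only `az:` is mentioned in this thread).

```python
from urllib.parse import urlsplit


class AzureDownloader:
    """Placeholder standing in for a real implementation."""


class AWSDownloader:
    """Placeholder standing in for a real implementation."""


class GCSDownloader:
    """Placeholder standing in for a real implementation."""


# Hypothetical prefix-to-class mapping. Only "az" comes from the discussion;
# "s3" and "gs" are assumed here for illustration.
CLOUD_DOWNLOADERS = {
    "az": AzureDownloader,
    "s3": AWSDownloader,
    "gs": GCSDownloader,
}


def choose_cloud_downloader(url):
    """Return a downloader instance based on the URL prefix (scheme)."""
    scheme = urlsplit(url).scheme
    if scheme not in CLOUD_DOWNLOADERS:
        raise ValueError(
            f"Unrecognized URL protocol '{scheme}' in '{url}'. "
            f"Must be one of {sorted(CLOUD_DOWNLOADERS)}."
        )
    return CLOUD_DOWNLOADERS[scheme]()
```

The tests would then cover both sides of this: that a known prefix returns the right downloader and that an unknown one raises the expected error.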
Add an `AzureDownloader` that can fetch the data from Azure cloud storage. It should support an authentication token, ideally with the option to read it from an environment variable.
Description of the desired feature:
Would it be possible to add support for fetching data from private cloud containers?
Are you willing to help implement and maintain this feature? Maybe, yes!
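As a starting point for discussion, here is a minimal sketch of what the requested `AzureDownloader` might look like. It is not an existing Pooch class. It assumes the `azure-storage-blob` package is installed, that the URL is a full `https://` blob URL rather than an `az:` shorthand, and that the token is something `BlobClient` accepts as a credential (for example a SAS token). The `token` and `token_env` argument names are made up for illustration.

```python
import os


class AzureDownloader:
    """
    Hypothetical downloader for private Azure Blob Storage (not part of Pooch).

    Follows the Pooch custom downloader convention of being a callable with
    the signature (url, output_file, pooch).
    """

    def __init__(self, token=None, token_env=None):
        if token is not None and token_env is not None:
            raise ValueError("Pass either 'token' or 'token_env', not both.")
        self.token = token
        self.token_env = token_env

    def get_token(self):
        """Return the explicit token, or read it from the environment."""
        if self.token is not None:
            return self.token
        if self.token_env is not None:
            if self.token_env not in os.environ:
                raise ValueError(
                    f"Environment variable '{self.token_env}' is not set."
                )
            return os.environ[self.token_env]
        # No credential given: only works for publicly readable blobs.
        return None

    def __call__(self, url, output_file, pooch):
        # Imported here so that azure-storage-blob stays an optional dependency.
        from azure.storage.blob import BlobClient

        client = BlobClient.from_blob_url(url, credential=self.get_token())
        # output_file may be a path or an already open file-like object.
        ispath = not hasattr(output_file, "write")
        stream = open(output_file, "wb") if ispath else output_file
        try:
            client.download_blob().readinto(stream)
        finally:
            if ispath:
                stream.close()
```

It could then be passed through the existing `downloader` argument, e.g. `pooch.retrieve(url, known_hash=..., downloader=AzureDownloader(token_env="AZURE_STORAGE_TOKEN"))`, where the environment variable name is just an example. Reading the variable at download time rather than in `__init__` means the token never has to be stored in the script.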