MetPX / sarracenia

https://MetPX.github.io/sarracenia
GNU General Public License v2.0

Provide getHttpsUrl() for Transfer Classes, option to use them in retrievePath messages. #1016

Open petersilva opened 5 months ago

petersilva commented 5 months ago

When publishing a product to a cloud resource, we often want to make the product as easy to download as possible. A cloud-provider-specific URL scheme for the download requires the client to install drivers specific to that provider. That can be fine, but it is an additional burden. Cloud providers generally also provide raw https endpoints, which can be used by cloud-oblivious tools such as wget.

It would be nice to have an option... something like:

retrieveRawHTTPs True|False.

If set to false, the retrieval URLs are "azure://...". If set to true, the retrieval URLs are "https://...".

So when the option is set, the consumer of the messages can download using wget or any browser (anything that can consume https) rather than needing an Azure (or S3) specific client.
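To make the intended behaviour concrete, here is a minimal sketch of what the option could do for Azure. Everything in it is an assumption for illustration: the function name `retrieval_url`, the `retrieve_raw_https` parameter, and especially the `azure://<account>/<container>/<blob>` layout, which may not match what the transfer class actually uses. The only part taken as given is the standard public Blob Storage endpoint form, `https://<account>.blob.core.windows.net/<container>/<blob>`.

```python
from urllib.parse import urlsplit


def retrieval_url(url: str, retrieve_raw_https: bool) -> str:
    """Return the URL to advertise in a message (hypothetical helper).

    ASSUMPTION: azure:// URLs look like azure://<account>/<container>/<blob>.
    The real scheme layout in the Azure transfer class may differ.
    """
    parts = urlsplit(url)
    # Leave the URL alone when the option is off or the scheme is not azure.
    if not retrieve_raw_https or parts.scheme != "azure":
        return url
    # Standard Blob Storage endpoint: the account name becomes the host prefix,
    # the container and blob name stay in the path.
    return f"https://{parts.netloc}.blob.core.windows.net{parts.path}"


if __name__ == "__main__":
    src = "azure://myaccount/data/obs/file.txt"
    print(retrieval_url(src, False))
    print(retrieval_url(src, True))
```

A consumer could then fetch the second form with wget, provided the blob is publicly readable or the URL carries whatever access token the provider requires.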

One thing that I'm vague on... in all other contexts we split the URL into baseUrl and the rest. I'm wondering whether a gethttpsUrl() entry point should also return a tuple of baseUrl and the rest, or the more natural complete URL. If it returns the complete one, then the calling logic will need to split it up to obtain the baseUrl ...
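If the method returns the complete URL, the split on the caller's side is small. A rough sketch, assuming baseUrl means scheme plus host with a trailing slash and "the rest" means the remaining path plus any query string; the helper name `split_https_url` and the example host are hypothetical, not existing sr3 code:

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_https_url(full_url: str) -> Tuple[str, str]:
    """Split a complete URL into (baseUrl, rest). Hypothetical helper.

    ASSUMPTION: baseUrl is "<scheme>://<host>/" and the rest is the path
    without its leading slash, plus the query string if there is one.
    """
    parts = urlsplit(full_url)
    base_url = f"{parts.scheme}://{parts.netloc}/"
    rest = parts.path.lstrip("/")
    if parts.query:
        # Keep the query (for example a signed-access token) with the path.
        rest += "?" + parts.query
    return base_url, rest


if __name__ == "__main__":
    print(split_https_url("https://myaccount.blob.core.windows.net/data/obs/file.txt"))
```

Returning the complete URL keeps the transfer class method simple, and the tuple form can be derived from it this way, so either return type could work.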

I have thought of two ways of implementing this so far:

Either method will work. Other suggestions welcome...

@gcglinton has agreed to work on implementing the transformation to https. @petersilva can then use that to modify the internal sr3 logic.

This would be great/useful for S3 and Azure.

gcglinton commented 5 months ago

Implemented this in the new Azure transfer class from #1015.

Specifically, the gethttpsUrl method is here, and the ls() properties are set here.

It's likely not pretty, but it might work?

Would need to implement the desired functionality/method in the S3 driver as well.