TkTech / ckanext-cloudstorage

Implements support for resource storage against multiple popular providers via apache-libcloud (S3, Azure Storage, etc...)
MIT License
35 stars 55 forks

Add Cloudfront signed URL support for S3 #8

Closed jqnatividad closed 7 years ago

jqnatividad commented 7 years ago

Since boto is already included (#1), consider adding CloudFront signed URL support.

Using boto's create_signed_url, the publisher can ensure that users visit the data portal to download the file, rather than just saving the S3 link:

TkTech commented 7 years ago

Hello @jqnatividad, I might be missing something here, but ckanext-cloudstorage has supported secure URLs for AWS and Azure since the first commit.

MrkGrgsn commented 7 years ago

@TkTech has a point, but I think the key difference in what @jqnatividad is suggesting is serving downloads through CloudFront rather than directly from S3. Whether that is actually a bonus will depend on a repository's download volumes and the geographic distribution of downloads.

jqnatividad commented 7 years ago

Hi @TkTech, as @MrkGrgsn suspected, the context of this request is that we have a user with a lot of large, high-value files that they plan to distribute on their CKAN site.

They don't want to be penalized for success with bandwidth overages when they release these files, so we've formulated a multi-pronged strategy to give them a better handle on their bandwidth costs. This includes:

TkTech commented 7 years ago

Hello @jqnatividad,

If you want to expose this through CloudFront instead, you can subclass the Storage class to add your custom get_url_from_filename() logic after setting up your Distribution on the AWS CloudFront console. [configuration, client-specific]
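
For illustration, the subclassing approach could look roughly like the sketch below. It is not code from ckanext-cloudstorage: the `Storage` base class here is a hypothetical stand-in for the extension's real storage class, the `resources/<rid>/<filename>` object path and the `domain`, `key_pair_id`, `sign` and `ttl` parameters are assumptions, and signing is left as a pluggable callable so the example runs on the standard library alone. The URL layout follows CloudFront's canned-policy format (`Expires`, `Signature` and `Key-Pair-Id` query parameters, with `+`, `=` and `/` in the base64 signature replaced by `-`, `_` and `~`).

```python
import base64
import json
import time
from urllib.parse import quote


def cloudfront_b64(data: bytes) -> str:
    """Base64-encode with CloudFront's URL-safe character substitutions."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def build_signed_url(url, key_pair_id, expires, sign):
    """Assemble a canned-policy CloudFront signed URL.

    `sign` is a callable (an assumption of this sketch) that takes the
    policy bytes and returns the raw RSA-SHA1 signature bytes produced
    with the CloudFront private key.
    """
    # The canned policy must be serialized without any whitespace.
    policy = json.dumps(
        {"Statement": [{"Resource": url,
                        "Condition": {"DateLessThan": {"AWS:EpochTime": expires}}}]},
        separators=(",", ":"),
    ).encode("utf-8")
    signature = cloudfront_b64(sign(policy))
    sep = "&" if "?" in url else "?"
    return f"{url}{sep}Expires={expires}&Signature={signature}&Key-Pair-Id={key_pair_id}"


class Storage:
    """Hypothetical stand-in for the extension's storage class."""

    def get_url_from_filename(self, rid, filename):
        raise NotImplementedError


class CloudFrontStorage(Storage):
    """Returns CloudFront signed URLs instead of S3 URLs."""

    def __init__(self, domain, key_pair_id, sign, ttl=3600):
        self.domain = domain            # e.g. the distribution's domain name
        self.key_pair_id = key_pair_id  # ID of the CloudFront key pair
        self.sign = sign                # callable: policy bytes -> signature bytes
        self.ttl = ttl                  # link lifetime in seconds

    def get_url_from_filename(self, rid, filename):
        # Assumed object layout: resources/<resource id>/<filename>.
        url = f"https://{self.domain}/resources/{rid}/{quote(filename)}"
        expires = int(time.time()) + self.ttl
        return build_signed_url(url, self.key_pair_id, expires, self.sign)
```

In a real deployment the `sign` callable would wrap an RSA-SHA1 signature made with the CloudFront private key, for example via the third-party `rsa` package (`rsa.sign(policy, private_key, "SHA-1")`), or the whole helper could be replaced by boto's `create_signed_url` mentioned earlier in the thread. Because the links expire after `ttl` seconds, a saved link stops working and users have to return to the portal for a fresh one.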

If you want to expose torrent links instead, you just need to update your template (e.g. <a href="{{ storage.get_url_from_filename(...) }}.torrent">...</a>), since appending .torrent to the URL will give you a torrent link. [template]

9 is handled via a one-time configuration of your S3 bucket from the AWS console, not by ckanext-cloudstorage. If you were thinking of having it replicated to two buckets (one internal, one requester-pays), this is best handled by a custom background job, not the uploader. [configuration, client-specific]

--

Just to summarize, everything you want to do is best done through your client-specific extension and the AWS console, not directly in ckanext-cloudstorage.

ckanext-cloudstorage is intended to support basic CRUD functionality across a significant number of providers in a portable manner, rather than to expose every possible S3/AWS feature. Provider-specific features are best left to your client extension.

jqnatividad commented 7 years ago

Thanks for your expert guidance @TkTech!

We'll be sure to implement it per your comment above in our client-specific extension, and we'll share our findings with you and the community.