okfn / ckanext-s3filestore

Use Amazon S3 as a filestore for CKAN
GNU Affero General Public License v3.0

No tests for private datasets? #1

Open pwalsh opened 9 years ago

pwalsh commented 9 years ago

Does this work with private datasets? That is, are resource URLs inaccessible when the dataset they belong to is private?

brew commented 9 years ago

All resource files hosted on S3 require authorisation to be downloaded, regardless of whether the dataset is private or public on CKAN. When a user requests a file (e.g. from a resource page), the CKAN app checks that the requesting user is authorised to view the dataset and its resources, then requests the file from S3 using its own credentials.

The end user never sees the S3 URL, but if they did, access would be denied without the correct authorisation in the request.
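
A minimal sketch of the check-then-fetch flow described above, not the extension's actual code. The names `can_view`, `download_resource`, `fetch_from_store` and the dataset dict layout are all hypothetical, and the access rule is a stand-in for CKAN's real authorisation check.

```python
class NotAuthorized(Exception):
    """Raised when the requesting user may not view the dataset."""


def can_view(user, dataset):
    # Hypothetical rule: public datasets are open to everyone,
    # private ones only to listed members.
    return not dataset["private"] or user in dataset["members"]


def download_resource(user, dataset, key, fetch_from_store):
    """Check access in the app first, then fetch on the user's behalf."""
    if not can_view(user, dataset):
        raise NotAuthorized(key)
    # Only the app talks to the store; the end user never sees the store URL.
    return fetch_from_store(key)
```

Here `fetch_from_store` stands for whatever authenticated call the app makes to S3.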

Group image files are different. They are hosted publicly and linked directly to S3.

MrkGrgsn commented 8 years ago

@brew I understand that, to support access control, the file is essentially proxied by CKAN to the user. I wonder whether doing so has sacrificed the ability to properly handle large files, e.g. GB+ (as well as hobbling S3's distributed network).

My initial reading of the code is that the files are retrieved from S3 into memory by CKAN and then served to the user, which would limit max filesize by available memory. Can you provide any details about this?
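
For illustration only, here is the alternative to buffering a whole file: a rough sketch of reading a response body in fixed-size chunks, so memory use is bounded by the chunk size and not by the file size. `iter_chunks` and `CHUNK_SIZE` are hypothetical names, and the `BytesIO` object stands in for whatever file-like body the S3 client returns.

```python
import io

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for the deployment


def iter_chunks(body, chunk_size=CHUNK_SIZE):
    """Yield successive chunks from a file-like object until it is exhausted."""
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a response body coming back from the store.
body = io.BytesIO(b"x" * 200_000)
sizes = [len(chunk) for chunk in iter_chunks(body)]
```

A WSGI app can return a generator like this as the response body, so each chunk is passed on as it is read.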

How best to utilise S3 for resource storage with access control on large resources is a topic we have discussed a few times at Link, and our ideas revolved around using temporary URLs for private datasets or some more integrated access control via AWS's IAM. Any thoughts?
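
To make the temporary-URL idea concrete, here is a generic sketch of an expiring, HMAC-signed link. It is not AWS's signing scheme; for S3 itself, boto3 clients expose `generate_presigned_url` for this purpose. `SECRET`, `make_temp_url` and `is_valid` are hypothetical names, and the `now` parameter exists only to make the behaviour easy to check.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET = b"demo-secret"  # hypothetical signing key held only by the server


def _sign(path, expires):
    msg = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_temp_url(path, ttl_seconds, now=None):
    """Return the path plus an expiry timestamp and a signature over both."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign(path, expires)})
    return f"{path}?{query}"


def is_valid(path, expires, signature, now=None):
    """Accept only an untampered signature whose expiry is still in the future."""
    now = time.time() if now is None else now
    expected = _sign(path, int(expires))
    return hmac.compare_digest(expected, signature) and now < int(expires)
```

The app would check the user's access to the dataset once, hand back such a link, and let the client download directly from the store until the link expires.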

brew commented 8 years ago

@MrkGrgsn

Check out the work @TkTech has been doing on an alternative cloud extension that implements secure urls, redirecting requests to the cloud repository: https://github.com/open-data/ckanext-cloudstorage

TkTech commented 8 years ago

@MrkGrgsn Feel free to reach out to me anytime if you have any questions regarding ckanext-cloudstorage.

MrkGrgsn commented 8 years ago

Is development on ckanext-s3filestore being stopped in favour of ckanext-cloudstorage then?

MrkGrgsn commented 8 years ago

... and thanks :)

brew commented 8 years ago

@MrkGrgsn s3filestore was originally developed for our own internal needs for our cloud offering. We may end up switching to ckanext-cloudstorage at some point, so development efforts can be focussed on one project, but no decision has been made yet. I wouldn't say development for s3filestore has been discontinued.