datopian / ckanext-blob-storage

CKAN extension to offload blob storage to cloud storage providers (S3, GCS, Azure etc).
http://tech.datopian.com/blob-storage/
MIT License

Move to storage layout of {dataset-uuid}/{sha256} #48

Closed rufuspollock closed 3 years ago

rufuspollock commented 3 years ago

Follow-up to #45: the current blob storage approach breaks when a dataset is moved from one organization to another (or is renamed). This is because we store data in blob storage at {org}/{dataset-name}/ and use that path when performing scope validation in giftless.

To avoid this, it is proposed to use <static-prefix>/<dataset-UUID> as the LFS prefix when storing new resources.
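As a rough illustration of the proposed layout, here is a minimal sketch of how such a prefix could be built. The function names and the `ckan` static prefix are hypothetical and not taken from ckanext-blob-storage; the only idea carried over from the proposal is that the path depends on the dataset UUID rather than on the organization or dataset name.

```python
import uuid

# Hypothetical value; the issue does not say what the static prefix is.
STATIC_PREFIX = "ckan"


def lfs_prefix(dataset_id: str, static_prefix: str = STATIC_PREFIX) -> str:
    """Build <static-prefix>/<dataset-UUID> for a dataset.

    The UUID is parsed and re-serialized so that the same dataset always
    maps to the same prefix regardless of letter case in the input.
    Raises ValueError if dataset_id is not a valid UUID.
    """
    canonical = str(uuid.UUID(dataset_id))
    return f"{static_prefix}/{canonical}"


def blob_path(dataset_id: str, sha256: str) -> str:
    """Full storage path for one blob: <static-prefix>/<dataset-UUID>/<sha256>."""
    return f"{lfs_prefix(dataset_id)}/{sha256}"
```

Because neither the organization nor the dataset name appears in the path, moving or renaming the dataset would leave the stored location unchanged.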

Tasks

Analysis

What's the problem

Imagine I want to download the blob related to a resource ...

Where this goes wrong is if I have moved the dataset ... because the giftless location still points at the old dataset, whilst the scope is issued for the new dataset ...
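The mismatch can be sketched as follows. This is a deliberately simplified model, not giftless's actual validation logic: the helper names are hypothetical, and the check is reduced to an exact string comparison between the granted scope and the stored location.

```python
def scope_for(org: str, dataset_name: str, sha256: str, action: str = "read") -> str:
    """Scope string granted for the dataset as it is named right now."""
    return f"obj:{org}/{dataset_name}/{sha256}:{action}"


def scope_allows(scope: str, stored_location: str, sha256: str, action: str) -> bool:
    """Simplified check: the scope must match the stored blob location exactly."""
    return scope == f"obj:{stored_location}/{sha256}:{action}"


# The blob was uploaded while the dataset lived at old-org/my-data.
stored_location = "old-org/my-data"
sha = "abc123"

# Before the move, the granted scope matches the stored location.
before_move = scope_allows(scope_for("old-org", "my-data", sha), stored_location, sha, "read")

# After moving the dataset to new-org, the scope is issued for the new
# path, but the blob is still stored under the old one, so access fails.
after_move = scope_allows(scope_for("new-org", "my-data", sha), stored_location, sha, "read")
```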

Options

Quick fix: scope-normalizer-based fix - DONE in #47

Change from obj:myorg/myrepo/sha256:read to obj:*/*/sha256:read or even obj:sha256:read

Assumption: a scope normalizer function registered in ckanext-blob-storage for obj scopes can mangle requests for res:<org>/<dataset>/<sha256>:read to something like res:*/*/<sha256>.
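A minimal sketch of what such a normalizer might look like, assuming scopes have the shape `<type>:<org>/<dataset>/<sha256>[:<action>]` and that the sha256 itself contains no colon. The function name is hypothetical, and how it would be registered with the authorization layer is not shown, since the issue does not describe that API. The sketch uses the `obj:` prefix; the same approach would apply to `res:`.

```python
def normalize_object_scope(scope: str) -> str:
    """Replace the org and dataset parts of an obj scope with wildcards.

    Scopes that are not obj scopes, or that do not have exactly three
    path segments, are returned unchanged.
    """
    entity_type, _, rest = scope.partition(":")
    if entity_type != "obj":
        return scope

    # Split off the optional trailing action (e.g. ":read").
    ref, sep, action = rest.rpartition(":")
    if not sep:
        ref, action = rest, ""

    segments = ref.split("/")
    if len(segments) != 3:
        return scope

    normalized = f"obj:*/*/{segments[2]}"
    return f"{normalized}:{action}" if action else normalized
```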

If this is true, we can: