datopian / ckanext-blob-storage

CKAN extension to offload blob storage to cloud storage providers (S3, GCS, Azure etc).
http://tech.datopian.com/blob-storage/
MIT License

Migration script from ckanext-cloudstorage to this storage #37

rufuspollock closed this issue 3 years ago

rufuspollock commented 3 years ago

A script to migrate resources can be written against the CKAN API (plus the Python SDK, since it will need access to Git LFS as well as CKAN), or by accessing the DB and Azure Blob Storage directly. The former is most likely preferable and easier, though the latter may be faster.

Acceptance

Tasks

This can easily be parallelized if needed (e.g. to finish quickly while the system is taken down) or run slowly in the background. It can also be restarted if needed and will continue from where it stopped.
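A minimal sketch of how the restart and parallelization behaviour could look, not the actual script. It assumes a plain-text checkpoint file of migrated resource IDs; `migrate_all`, `load_done`, and the `migrate_one` callable are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable, List, Set


def load_done(checkpoint: Path) -> Set[str]:
    """Return the resource IDs already recorded in the checkpoint file."""
    if not checkpoint.exists():
        return set()
    lines = checkpoint.read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def migrate_all(
    resource_ids: Iterable[str],
    migrate_one: Callable[[str], None],  # hypothetical per-resource migration
    checkpoint: Path,
    workers: int = 4,
) -> List[str]:
    """Migrate every resource not yet in the checkpoint; return the new IDs."""
    done = load_done(checkpoint)
    pending = [rid for rid in resource_ids if rid not in done]
    migrated: List[str] = []
    with ThreadPoolExecutor(max_workers=workers) as pool, checkpoint.open("a") as log:
        futures = {pool.submit(migrate_one, rid): rid for rid in pending}
        for future in as_completed(futures):
            rid = futures[future]
            try:
                future.result()
            except Exception as exc:
                # Failed resources are not checkpointed, so a rerun retries them.
                print(f"failed {rid}: {exc}")
                continue
            # Only the main thread writes here, so no lock is needed.
            log.write(rid + "\n")
            log.flush()
            migrated.append(rid)
    return migrated
```

Setting `workers=1` gives the slow background mode, and a higher value gives the fast mode. Rerunning with the same checkpoint file skips everything already migrated.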

shevron commented 3 years ago

The script has been merged in #40. Note that it downloads and re-uploads each file rather than using any Azure APIs, because we don't want anything Azure-specific. We use CKAN's fallback download mechanism to download files, so whatever storage mechanism is behind it is abstracted away. The same goes for upload: ckanext-blob-storage has no knowledge of, or dependency on, Azure.
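As a rough illustration of that storage-agnostic shape (a sketch, not the code merged in #40): the download and upload steps are passed in as callables, so nothing in the function knows which backend sits behind either one. `migrate_resource`, `open_download`, and `upload` are hypothetical names; the only assumptions are that the resource dict carries a `url` served by CKAN and that the target identifies blobs by SHA-256 digest and size, as Git LFS does.

```python
import hashlib
import io
from typing import BinaryIO, Callable, Dict


def migrate_resource(
    resource: Dict[str, str],
    open_download: Callable[[str], BinaryIO],     # hypothetical: fetches via CKAN
    upload: Callable[[str, int, BinaryIO], None],  # hypothetical: pushes to new storage
) -> Dict[str, object]:
    """Download one resource through CKAN and re-upload it to the new storage."""
    # CKAN serves the file, so the old backend (Azure or otherwise) stays hidden.
    with open_download(resource["url"]) as stream:
        data = stream.read()  # simplification: the whole file is held in memory

    # Git LFS addresses objects by SHA-256 digest and size.
    sha256 = hashlib.sha256(data).hexdigest()
    upload(sha256, len(data), io.BytesIO(data))

    # Values a caller could write back to the resource metadata afterwards.
    return {"sha256": sha256, "size": len(data)}
```

For large files, a real script would likely stream to a temporary file instead of reading everything into memory.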