ckan / ideas

[DEPRECATED] Use the main CKAN repo Discussions instead:
https://github.com/ckan/ckan/discussions

Version control for datasets #229

Open hayley-leblanc opened 4 years ago

hayley-leblanc commented 4 years ago

Is there any work being done on some kind of version control for datasets? I know that there's an extension that provides some of this functionality, and a way to view differences between metadata of different versions was added to the CKAN core a few months ago by @davidread. However, there isn't currently a way to view old versions of resources, or to compare different versions of resource files, or to revert to old versions of datasets, all of which could be really useful.

davidread commented 4 years ago

Storing old copies of the CSV files (and other formats) would be nice.

ckanext-archiver goes some way towards this, regularly downloading from the resource URL, but the saved copies are neither indexed nor made available. So this might be a reasonable place to start if you're going to work on it.
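To make "indexed" a bit more concrete, here is a rough sketch of what an index over saved copies could look like. It is only an illustration, not how ckanext-archiver works today: the `ResourceVersionIndex` class, its method names, and the example URL are all hypothetical, and the copies are kept in memory rather than on disk. The idea is to hash each downloaded copy and store it as a new version only when the content has changed.

```python
import hashlib
from datetime import datetime, timezone


class ResourceVersionIndex:
    """Hypothetical in-memory index of saved copies, keyed by resource URL."""

    def __init__(self):
        # url -> list of version entries, oldest first
        self._versions = {}

    def add_copy(self, url, content, saved_at=None):
        """Record a downloaded copy; return True if it became a new version."""
        digest = hashlib.sha256(content).hexdigest()
        history = self._versions.setdefault(url, [])
        # Skip the copy if the file has not changed since the last download.
        if history and history[-1]["sha256"] == digest:
            return False
        history.append({
            "sha256": digest,
            "saved_at": saved_at or datetime.now(timezone.utc),
            "content": content,
        })
        return True

    def list_versions(self, url):
        """Return (saved_at, sha256) pairs for a resource, oldest first."""
        return [(v["saved_at"], v["sha256"]) for v in self._versions.get(url, [])]

    def get_version(self, url, number):
        """Return the bytes of version `number` (1 = oldest)."""
        return self._versions[url][number - 1]["content"]


index = ResourceVersionIndex()
url = "https://example.org/data.csv"  # hypothetical resource URL
index.add_copy(url, b"id,value\n1,10\n")
index.add_copy(url, b"id,value\n1,10\n")  # unchanged, so not stored again
index.add_copy(url, b"id,value\n1,12\n")
print(len(index.list_versions(url)))  # prints 2
```

A real implementation would need persistent storage and a way to expose the version list through the API or UI, but the hash-and-compare step is what keeps regular downloads from piling up identical copies.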

pwalsh commented 4 years ago

We are working on version control at @datopian for two major clients right now. The code is currently at the proof-of-concept stage, but it will of course be open source. It would be awesome to talk about your use cases, @hayley-leblanc. There are lots of details across what we might refer to as "versioning" and what we might refer to as "revisioning", as well as the types of interactions users want to have with versioned datasets.