JJediny opened 8 years ago
This also points to the need for a canonical source field, to identify whether a record on JKAN is externally or internally maintained.
Interesting idea @JJediny. To be honest, I don't have much experience doing that (I couldn't get the CKAN harvester to work). I agree we should document it, though. Would you be open to working on a page in the wiki about it?
(Also, regarding #62, just waiting to hear from you on the dataset slug question)
Part of the appeal of today's distributed content generation is not having to separately maintain metadata/datasets when they are better maintained elsewhere by others. It's safe to assume that many of the use cases JKAN is aimed at will need/want a hybrid approach: cataloging a collection of datasets maintained on JKAN alongside datasets pulled in from remote sources.
Something like

```
ckan db simple-dump-json FILE_PATH
```

can be used to export those records, as in this example.
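As a rough sketch of the fetch/transform/import step (none of this is an existing JKAN feature): the script below reads a JSON dump of CKAN packages and writes one JKAN-style dataset file per package into `_datasets/`. The path `ckan-dump.json`, the field mapping, and the `source`/`source_url` front-matter fields (echoing the canonical-source-field idea above) are all assumptions to adapt to the actual JKAN schema.

```python
#!/usr/bin/env python3
"""Sketch: turn a CKAN JSON dump into JKAN dataset files.

Assumes the dump is a JSON array of package dicts, and that JKAN
datasets live in _datasets/ as markdown files with YAML front matter.
Check field names against your JKAN site's schema before relying on this.
"""
import json
import pathlib

import yaml  # pip install pyyaml

DUMP = "ckan-dump.json"     # hypothetical path to the exported dump
OUT = pathlib.Path("_datasets")


def to_jkan(pkg):
    """Map a CKAN package dict onto JKAN-ish front matter."""
    return {
        "schema": "default",
        "title": pkg.get("title") or pkg.get("name"),
        "organization": (pkg.get("organization") or {}).get("title", ""),
        "notes": pkg.get("notes", ""),
        # Hypothetical canonical-source fields, per the suggestion above:
        # mark the record as externally maintained and keep its origin.
        "source": "external",
        "source_url": pkg.get("url", ""),
        "resources": [
            {"name": r.get("name", ""), "url": r.get("url", ""),
             "format": r.get("format", "")}
            for r in pkg.get("resources", [])
        ],
    }


def main():
    OUT.mkdir(exist_ok=True)
    packages = json.loads(pathlib.Path(DUMP).read_text())
    for pkg in packages:
        front = yaml.safe_dump(to_jkan(pkg), sort_keys=False,
                               allow_unicode=True)
        # JKAN reads the YAML front matter; the markdown body can stay empty.
        (OUT / f"{pkg['name']}.md").write_text(f"---\n{front}---\n")


if __name__ == "__main__":
    main()
```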
Harvesting/snapshotting datasets can be a version-control nightmare, but it's arguably better than recreating them entirely by hand. However, the documentation could/should cover a few of the best options/processes out there for getting as close as possible to syncing across multiple remote services.

As an alternative/complementary approach, it would also be good to include methods for integrating push notifications or webhooks, for example to run a build and a gulp process that refreshes a remote source, giving a repeatable process for managing the fetch/ingest/transform/import cycle. This could then rebuild a new Docker container with JKAN, or run locally and commit the bulk updates.
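And a minimal sketch of the webhook half, assuming a plain Python HTTP endpoint rather than any particular CI or gulp setup: on each POST it re-runs the (hypothetical) harvest script above and commits whatever changed, which a Docker build or Jekyll rebuild could then pick up.

```python
#!/usr/bin/env python3
"""Minimal sketch of a webhook-triggered refresh. The port, paths, and
harvest_ckan.py (the converter sketched above) are illustrative only."""
from http.server import BaseHTTPRequestHandler, HTTPServer
import subprocess


class HarvestHook(BaseHTTPRequestHandler):
    def do_POST(self):
        # Drain the payload; its contents are unused in this sketch.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        # NOTE: a real deployment would verify a shared secret or
        # signature header here before doing anything.
        subprocess.run(["python", "harvest_ckan.py"], check=True)
        subprocess.run(["git", "add", "_datasets"], check=True)
        # git commit exits nonzero when nothing changed, so don't check it.
        subprocess.run(["git", "commit", "-m", "Refresh harvested datasets"],
                       check=False)
        self.send_response(204)
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("", 8000), HarvestHook).serve_forever()
```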