Open stefandesu opened 3 years ago
There are two use cases of data dumps:
Maybe split the use cases and first solve use case 1 by providing simple dump files.
At the backend, versioning could be supported in JSKOS-Server itself, but this might be too complex. Just using Git might be a good option, and one file per record would allow for better querying.
The script bin/dump.js contains some versioning capabilities, but only at the command line.
I see that we are doing daily dumps of all the data in our main instance, but as far as I can see, the data is not easily browsable and only the latest dump is linked on the site (https://bartoc.org/data/dumps/latest.ndjson).
Should we maybe have a separate Git repository that tracks the latest dump so that we can use the Git history to refer to older versions of the dump? Not sure if it should be all vocabularies in one file like the current dump or one file per vocabulary (which would allow more granular tracking of changes, but we'd have a ton of files).
What do you think?
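
To make the "one file per vocabulary" option more concrete, here is a minimal Python sketch of how an NDJSON dump could be split into one file per record before committing to a Git repository. It is not based on bin/dump.js. The function name `split_dump`, the output layout, and the assumption that every record carries a unique `uri` field are illustrative only.

```python
import json
from pathlib import Path
from urllib.parse import quote


def split_dump(lines, out_dir):
    """Write each NDJSON record to its own file and return the written paths.

    Hypothetical helper: assumes every record has a unique "uri" field.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        record = json.loads(line)
        # Percent-encode the URI so it is safe to use as a file name.
        name = quote(record["uri"], safe="") + ".json"
        path = out_dir / name
        # Sorted keys and fixed indentation keep diffs stable between dumps.
        path.write_text(
            json.dumps(record, indent=2, sort_keys=True, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )
        written.append(path)
    return written
```

If the output directory were committed after each daily dump, `git log -- <file>` would show the history of a single vocabulary, at the cost of the large number of files mentioned above.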