An experiment in gathering together sources of information about digital preservation practices
The initial plan is to experiment with using DVC to gather useful information sources, starting with iPRES, and then see whether the result can usefully be transformed into something searchable using Datasette or Datasette Lite.
Why DVC? TBA, but I like the way it handles checking data dependencies. Very #DigiPres... It can also use remote storage such as Google Drive, e.g. https://dvc.org/doc/user-guide/data-management/remote-storage/google-drive
Clone this repo. Set up a Python 3 virtual env, e.g.
python3 -m venv .venv
source .venv/bin/activate
Install dependencies:
pip install .
Pull the derived data:
dvc pull
Run the repro chain:
dvc repro
Try the Datasette:
datasette serve practice.db
You should then be able to go to e.g. http://127.0.0.1:8001/practice/publications?_facet=type&_searchmode=raw&_facet=year&_facet_array=creators&_facet_array=institutions&_facet_size=10&_sort=year
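The same kind of facet summary can also be pulled straight from the SQLite file without starting Datasette. The following is a minimal sketch, assuming that practice.db contains a publications table with plain type and year columns (inferred from the URL above, not confirmed against the actual schema). The helper name facet_counts is hypothetical.

```python
import sqlite3


def facet_counts(conn, column, table="publications"):
    """Count rows per distinct value of `column`, most frequent first.

    Assumption: `table` and `column` exist in the database; the names
    used here are guesses based on the Datasette URL in this README.
    """
    # Identifiers cannot be bound as SQL parameters, so quote them explicitly.
    quoted_col = '"' + column.replace('"', '""') + '"'
    quoted_table = '"' + table.replace('"', '""') + '"'
    sql = (
        f"SELECT {quoted_col} AS value, COUNT(*) AS n "
        f"FROM {quoted_table} "
        f"GROUP BY {quoted_col} "
        f"ORDER BY n DESC, value"
    )
    return conn.execute(sql).fetchall()
```

For example, facet_counts(sqlite3.connect("practice.db"), "year") would return a list of (year, count) tuples, provided the table and column really exist under those names.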
Where are the papers and metadata? The links on https://iPRES-conference.org/ are not complete.
It may make more sense to store this data as JSON, and use JSON Schema in VSCode to make it easier to edit. The JSON can then be consumed by the gathering scripts as well as being used to generate tabular forms like this.
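As a rough illustration of the JSON-to-table idea, here is a minimal sketch using only the Python standard library. The field names (title, year, type, creators, institutions) are assumptions borrowed from the Datasette facets above, not an agreed schema, and records_to_csv is a hypothetical helper rather than one of the existing gathering scripts.

```python
import csv
import io
import json

# Assumed field names; the real JSON Schema may differ.
FIELDS = ["title", "year", "type", "creators", "institutions"]


def records_to_csv(json_text):
    """Turn a JSON array of publication records into CSV text."""
    records = json.loads(json_text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        row = {}
        for field in FIELDS:
            value = rec.get(field, "")
            # Flatten list-valued fields (e.g. creators) into one cell.
            if isinstance(value, list):
                value = "; ".join(str(v) for v in value)
            row[field] = value
        writer.writerow(row)
    return out.getvalue()
```

Missing fields become empty cells, so partially complete records can still be tabulated while the metadata is being gathered.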
The information about each iPRES conference is now stored as a set of Markdown+metadata files in the publications
repository, and is summarised at http://www.digipres.org/publications/ipres/