digipres / digipres-practice-index

An experiment in gathering together sources of information about digital preservation practices
GNU Affero General Public License v3.0

The initial plan is to experiment with using DVC to gather useful information sources, starting with iPRES, and then to see whether the result can usefully be transformed into something searchable using Datasette or Datasette Lite.

Why DVC? TBA, but I like the way it handles checking data dependencies. Very #DigiPres... It also supports remote storage backends such as Google Drive, e.g. https://dvc.org/doc/user-guide/data-management/remote-storage/google-drive
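As a rough illustration of the dependency-checking idea (this is not DVC's actual implementation), the sketch below records a content hash for each input file and treats a stage as stale when any hash no longer matches. The function names, the `recorded` dictionary and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record(deps: list[Path]) -> dict[str, str]:
    """Snapshot the current hash of every dependency (hypothetical 'lock' data)."""
    return {str(p): file_hash(p) for p in deps}


def is_stale(deps: list[Path], recorded: dict[str, str]) -> bool:
    """A stage is stale if any dependency's hash differs from the recorded one."""
    return any(recorded.get(str(p)) != file_hash(p) for p in deps)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source.csv"
        source.write_text("title,year\n")
        snapshot = record([source])
        print(is_stale([source], snapshot))  # unchanged input
        source.write_text("title,year,type\n")
        print(is_stale([source], snapshot))  # input changed since the snapshot
```

The same check is what makes a pipeline re-run only the steps whose inputs have changed.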

Development Setup

Clone this repo. Set up a Python 3 virtual env, e.g.

python3 -m venv .venv
source .venv/bin/activate

Install dependencies:

pip install .

Pull the derived data:

dvc pull

Local Usage

Run the repro chain:

dvc repro

Try the Datasette:

datasette serve practice.db

After which you should be able to go to e.g. http://127.0.0.1:8001/practice/publications?_facet=type&_searchmode=raw&_facet=year&_facet_array=creators&_facet_array=institutions&_facet_size=10&_sort=year
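The query string in that URL is just a list of repeated parameters, so it can be assembled programmatically. A minimal sketch using only the Python standard library, assuming Datasette is running locally on port 8001 as above:

```python
from urllib.parse import urlencode

# Base path: database "practice", table "publications", as in the URL above.
BASE = "http://127.0.0.1:8001/practice/publications"

# A list of pairs (rather than a dict) so that keys such as _facet can repeat.
params = [
    ("_facet", "type"),
    ("_searchmode", "raw"),
    ("_facet", "year"),
    ("_facet_array", "creators"),
    ("_facet_array", "institutions"),
    ("_facet_size", "10"),
    ("_sort", "year"),
]

url = f"{BASE}?{urlencode(params)}"
print(url)
```

Datasette also returns the same table as JSON if `.json` is appended to the table path, which may be handy for scripting.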

Sources of Practice

iPRES

Where are the papers and metadata? The links on https://iPRES-conference.org/ are not complete.

It may make more sense to store this data as JSON, and use JSON Schema in VSCode to make it easier to edit. The JSON can then be consumed by the gathering scripts as well as being used to generate tabular forms like this.
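A minimal sketch of that idea, assuming a hypothetical record layout: the field names (`year`, `location`, `proceedings_url`) and the sample values are placeholders, not the project's real schema or data, and the hand-written `validate` function stands in for proper JSON Schema validation.

```python
import json

# Placeholder records; field names and values are illustrative only.
CONFERENCES_JSON = """
[
  {"year": 2001, "location": "Example City A", "proceedings_url": null},
  {"year": 2002, "location": "Example City B", "proceedings_url": "https://example.org/2002"}
]
"""

# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {"year": int, "location": str}


def validate(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty if it looks fine)."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems


def to_markdown_table(records: list[dict], columns: list[str]) -> str:
    """Render the records as a simple pipe-delimited table."""
    def cell(value) -> str:
        return "" if value is None else str(value)

    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    rows = [
        "| " + " | ".join(cell(r.get(c)) for c in columns) + " |"
        for r in records
    ]
    return "\n".join([header, divider, *rows])


if __name__ == "__main__":
    records = json.loads(CONFERENCES_JSON)
    for r in records:
        print(r["year"], validate(r))
    print(to_markdown_table(records, ["year", "location", "proceedings_url"]))
```

Keeping the JSON as the single source means the gathering scripts and the generated table cannot drift apart.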

The information about each iPRES conference is now stored as a set of Markdown+metadata files in the publications repository, and is summarised at http://www.digipres.org/publications/ipres/

PHAIDRA

OSF