CLARIAH / clariah-plus

This is the project planning repository for the CLARIAH-PLUS project. It groups all technical documents and discussions pertaining to CLARIAH-PLUS in a central place and should facilitate findability, transparency and project planning, for the project as a whole.

Preserve full version history in tool discovery #131

Open proycon opened 1 year ago

proycon commented 1 year ago

Currently, our tool discovery harvesting pipeline and the resulting tool store present only one version of each tool: the latest stable release or, in the absence of any formal releases, the latest git master/main version. This latest version always receives the URI https://tools.clariah.nl/$identifier, so the metadata behind that URI can change at any harvesting round. Any previous versions are currently 'lost' to the tool store.

I'm now considering whether we should explicitly harvest and keep all old versions of the software metadata. This is in line with what solutions such as eScience's Research Software Directory are doing, and it is important from a reproducibility/citability perspective. Each version would get a more persistent identifier, https://tools.clariah.nl/$identifier-$version, and the full version history could then be viewed in the tool store if explicitly requested; the default index would still present only the latest stable release.
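To make the proposed scheme concrete, here is a minimal Python sketch of how the two kinds of URI could be derived. Only the base URL and the `$identifier-$version` pattern come from the proposal above; the function names and the example identifier and version are hypothetical and not taken from the harvester code.

```python
# Base URL as given in the proposal.
BASE_URI = "https://tools.clariah.nl"


def latest_uri(identifier: str) -> str:
    """Generic URI; its metadata may change at every harvesting round."""
    return f"{BASE_URI}/{identifier}"


def versioned_uri(identifier: str, version: str) -> str:
    """More persistent URI that pins one specific version of a tool."""
    return f"{BASE_URI}/{identifier}-{version}"


# Hypothetical tool identifier and version, for illustration only.
print(latest_uri("mytool"))
print(versioned_uri("mytool", "1.2.0"))
```

One thing such a scheme would probably need to decide is how to handle identifiers that themselves contain hyphens, since the boundary between identifier and version is then not recoverable from the URI alone.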

We could either actively harvest all historic versions, or take the easier route and simply retain and archive previous harvests.
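As a rough illustration of the easier route, the sketch below keeps every harvested record instead of overwriting the previous one. It is an in-memory toy under stated assumptions: the `HarvestArchive` class, its method names, and the metadata fields are all hypothetical, and "latest" here simply means the version that was first seen most recently, which may not match how the real pipeline determines the latest stable release.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HarvestArchive:
    # identifier -> {version -> metadata}; dicts keep insertion order,
    # so versions are listed in the order they were first harvested.
    records: Dict[str, Dict[str, dict]] = field(default_factory=dict)

    def store(self, identifier: str, version: str, metadata: dict) -> None:
        """Add or refresh one version without discarding older versions."""
        self.records.setdefault(identifier, {})[version] = metadata

    def history(self, identifier: str) -> List[str]:
        """All retained versions of a tool, oldest harvest first."""
        return list(self.records.get(identifier, {}))

    def latest(self, identifier: str) -> dict:
        """Metadata of the most recently first-seen version."""
        versions = self.records[identifier]
        return versions[next(reversed(versions))]


# Two harvesting rounds for a hypothetical tool.
archive = HarvestArchive()
archive.store("mytool", "1.0", {"name": "My Tool", "license": "GPL-3.0"})
archive.store("mytool", "1.1", {"name": "My Tool", "license": "MIT"})
print(archive.history("mytool"))
print(archive.latest("mytool"))
```

This only captures versions that happened to be current during a harvesting round; versions released and superseded between two rounds would still be missed, which is the gap that actively harvesting all historic versions would close.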

Ideas and comments welcome.

proycon commented 1 year ago

This has now been partially implemented: all tools receive a version-specific URI (and a generic one that points to the latest version). Keeping historic versions has not been implemented yet, though.