CatalogueOfLife / backend

Complete backend of COL ChecklistBank
Apache License 2.0

Consider Zenodo as a checklist data archive #1084

Open mdoering opened 2 years ago

mdoering commented 2 years ago

To make ChecklistBank a real repository for checklist data, it needs to support data archival. Some users want to deposit their data in CLB as the primary copy. An archive of all data, optionally with additional binary files (e.g. to bundle ColDP or DwC archives with the original Excel spreadsheets or PDF documents), should be stored immutably and be accessible via a stable identifier. We can use DOIs for this (following Zenodo's versioning practice) and keep a copy of all imported archives at GBIF, similar to how GBIF handles crawled archives.

But we could also consider pushing data archives to Zenodo, using their DOIs, and always importing the latest copy from there into ChecklistBank. CERN/Zenodo has a strong archival reputation and is a highly trusted organisation that should last for decades.

Data archival (at GBIF/COL or Zenodo) should be available for public datasets, both external and released ones.

mdoering commented 2 years ago

https://about.zenodo.org/policies/

Retention period: Items will be retained for the lifetime of the repository. This is currently the lifetime of the host laboratory CERN, which currently has an experimental programme defined for the next 20 years at least.

mdoering commented 2 years ago

https://developers.zenodo.org/#quickstart-upload
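
For orientation, here is a minimal sketch of what pushing one archive to Zenodo could look like, following the deposit flow described in that quickstart: create a deposition, upload the file to its bucket, set the metadata, publish, and read back the DOI. The function names (`_call`, `archive_to_zenodo`) and the injectable `opener` parameter are hypothetical helpers for illustration, not part of ChecklistBank or of any Zenodo client library. The sketch assumes a personal access token with deposit scopes and ignores error handling, retries and large-file streaming.

```python
import json
import urllib.request
from pathlib import Path
from urllib.parse import quote

# Assumption: production endpoint; use https://sandbox.zenodo.org/api for testing.
ZENODO_API = "https://zenodo.org/api"


def _call(method, url, token, body=None,
          content_type="application/json", opener=urllib.request.urlopen):
    """Send one authenticated request and return the decoded JSON response."""
    req = urllib.request.Request(url, data=body, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if body is not None:
        req.add_header("Content-Type", content_type)
    with opener(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else {}


def archive_to_zenodo(archive, metadata, token, opener=urllib.request.urlopen):
    """Deposit a single archive file on Zenodo and return the DOI of the record."""
    archive = Path(archive)

    # 1. Create an empty deposition; the response carries the file bucket link.
    dep = _call("POST", f"{ZENODO_API}/deposit/depositions", token,
                b"{}", opener=opener)
    base = f"{ZENODO_API}/deposit/depositions/{dep['id']}"

    # 2. Upload the archive to the bucket (reads the whole file into memory).
    _call("PUT", f"{dep['links']['bucket']}/{quote(archive.name)}", token,
          archive.read_bytes(), "application/octet-stream", opener=opener)

    # 3. Attach the descriptive metadata (title, upload_type, creators, ...).
    _call("PUT", base, token,
          json.dumps({"metadata": metadata}).encode(), opener=opener)

    # 4. Publish; a published record is immutable and has a DOI.
    published = _call("POST", f"{base}/actions/publish", token, opener=opener)
    return published["doi"]
```

A call might look like `archive_to_zenodo("dataset.zip", {"title": "My checklist", "upload_type": "dataset", "description": "ColDP archive", "creators": [{"name": "Doe, Jane"}]}, token)`, where the file name and metadata values are placeholders. Later releases of the same dataset would go through Zenodo's new-version action rather than a fresh deposition, so that all versions stay linked under one concept DOI.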