This repository acts as the central management point for a set of repositories that are used to generate digital object identifiers (DOIs) for datasets in the Neotoma Paleoecology Database.
DOIs are generated at the level of a dataset, which in Neotoma consists of all measurements of a given data type for a single collection unit at a site (e.g., all vertebrate fossils from a bone pile in a cave, or all fossil pollen samples from a core in a lake). All DOIs are associated with a landing page.
Linked repositories include:
This project is currently under development. All participants are expected to follow the code of conduct for this project.
NOTE: The DataCite XML validation files in the `data/` folder (and `include` subfolder) were obtained from the DataCite GitHub Schema repository.
For any single dataset, the DOI provides access to three related elements:
The live record lives as the relationship between elements in the database, linked to the `datasets` table. Thus, the live record can change over time, as taxonomies or linked chronologies change.
The frozen record is generated within a week of dataset submission. It represents the state of the record at the time of upload. This version supports journal requirements for data submissions and aligns with data-management best practices. The frozen record lives in the `doi` schema of the database and is stored as a (Postgres) `jsonb` data type, along with the `datasetid`, the date created and the date modified (if necessary).
The DOI metadata is stored with DataCite and is generated from a script in this repository. When a new DOI is minted, the DOI and its related `datasetid` are added to the `datasetdoi` table.
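The difference between the live and frozen records can be illustrated with a minimal sketch. It is not the code used in this repository: `FrozenRecord` and `freeze()` are hypothetical names, the sample record is invented, and a JSON string stands in for the Postgres `jsonb` column.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class FrozenRecord:
    """Hypothetical in-memory stand-in for one row of the frozen-record table."""
    datasetid: int
    record: str                          # serialized JSON, analogous to a jsonb column
    created: datetime
    modified: Optional[datetime] = None  # only set if the snapshot is later updated


def freeze(datasetid: int, live_record: dict,
           now: Optional[datetime] = None) -> FrozenRecord:
    """Serialize the live record so later edits to it do not alter the snapshot."""
    now = now or datetime.now(timezone.utc)
    return FrozenRecord(datasetid, json.dumps(live_record, sort_keys=True), now)


# Invented example data; a real record is assembled from the database.
live = {"datasetid": 42, "taxa": ["Picea", "Pinus"]}
frozen = freeze(42, live)

# A later taxonomy revision changes the live record only.
live["taxa"][0] = "Picea glauca"

assert json.loads(frozen.record)["taxa"] == ["Picea", "Pinus"]
```

Serializing at freeze time is what keeps the snapshot stable: the live record keeps following taxonomy and chronology changes, while the frozen copy stays as it was at upload.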
A Neotoma data steward uploads a dataset to Neotoma (Tilia -> Tilia API -> NeotomaDB)
A cron job running in `data-dev` checks for all records generated at least one week ago that have no "frozen" version (query in the neotoma_doi repository).
`doi.frozen` in the database.
The PI of record can contact the steward to update the metadata (or a token can be generated to allow the PI to update things?)
The same cron job in #2 will identify records where the `ndb.dataset` entry is older than 14 days, the dataset has an entry in `doi.frozen`, and there is no entry in `ndb.datasetdoi`. This assumes that PIs and stewards have had an opportunity to revise their datasets.
`UPDATE` the frozen dataset using `doi.doifreeze()`.
`assign_doi()` to build the DataCite XML file, and post the DOI metadata.
This work has been supported by grants from the National Science Foundation: NSF-1541002, NSF-1550855 and NSF-1550707.
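The two selection rules applied by the scheduled job in the workflow above can be sketched in plain Python. This is an illustration under stated assumptions, not the actual query (which lives in the neotoma_doi repository): the function names `to_freeze()` and `to_mint()`, the in-memory dictionaries and sets, and the sample dates are all hypothetical.

```python
from datetime import date, timedelta

FREEZE_AFTER = timedelta(days=7)    # "generated at least one week ago"
MINT_AFTER = timedelta(days=14)     # "older than 14 days"


def to_freeze(created, frozen_ids, today):
    """Dataset ids at least a week old that have no frozen version yet."""
    return sorted(
        dsid for dsid, made in created.items()
        if today - made >= FREEZE_AFTER and dsid not in frozen_ids
    )


def to_mint(created, frozen_ids, doi_ids, today):
    """Dataset ids older than 14 days, already frozen, with no DOI yet."""
    return sorted(
        dsid for dsid, made in created.items()
        if today - made > MINT_AFTER
        and dsid in frozen_ids
        and dsid not in doi_ids
    )


# Invented sample state standing in for the dataset, frozen and DOI tables.
today = date(2024, 1, 31)
created = {
    1: date(2024, 1, 30),   # 1 day old: too new to freeze
    2: date(2024, 1, 20),   # 11 days old, not frozen: freeze now
    3: date(2024, 1, 10),   # 21 days old, frozen, no DOI: mint a DOI
    4: date(2024, 1, 1),    # 30 days old, frozen, DOI already assigned
}
frozen_ids = {3, 4}
doi_ids = {4}

print(to_freeze(created, frozen_ids, today))          # [2]
print(to_mint(created, frozen_ids, doi_ids, today))   # [3]
```

The one-week gap between the two thresholds is the window in which PIs and stewards can revise a dataset before its DOI is minted.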