This is a harvester that automatically collects works and submits them to an InvenioRDM repository. It currently works with the CaltechAUTHORS repository and harvests from CrossRef and ORCID.
Currently harvesting:
- CrossRef by ROR
- ORCID
- CrossRef DOIs
The harvests are typically run through GitHub Actions but can also be run on the command line.
You need to have a CaltechAUTHORS token available in the environment variable RDMTOK. For a CrossRef ROR harvest, type:
python harvest.py crossref
You can harvest a specific DOI with
python harvest.py -doi 10.7717/peerj-cs.1023
For an ORCID harvest, type:
python harvest.py orcid -orcid 0000-0001-9266-5146
For all harvests there is an -actor flag, which gets included in the message when the record is added to the queue.
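Because every harvest depends on the RDMTOK token, it can help to check for it before starting a run. The following is a minimal sketch, not part of this repository: `require_token` and `run_harvest` are hypothetical helper names, and the wrapper simply passes its arguments through to harvest.py unchanged.

```python
import os
import subprocess
import sys


def require_token(env=None):
    """Return the RDMTOK value, or raise if it is missing or empty."""
    env = os.environ if env is None else env
    token = env.get("RDMTOK", "").strip()
    if not token:
        raise RuntimeError(
            "RDMTOK is not set; export your CaltechAUTHORS token first"
        )
    return token


def run_harvest(*args):
    """Check the token, then run harvest.py with the given arguments.

    Hypothetical wrapper: arguments are forwarded as-is, and harvest.py
    is assumed to be in the current working directory.
    """
    require_token()
    command = [sys.executable, "harvest.py", *args]
    return subprocess.run(command, check=True)
```

With this in place, `run_harvest("crossref")` would be equivalent to running `python harvest.py crossref`, except that a missing token fails early with a clear message.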
For command line use you need the latest version of irdmtools installed:
curl https://caltechlibrary.github.io/irdmtools/installer.sh | sh
Then install the python requirements with
pip install -r requirements.txt
While this approach should work for any InvenioRDM repository, it has only been tested on CaltechAUTHORS. If you're interested in using this with a different repository, reach out; we would be happy to make it a bit more flexible.
Publishers use a wide variety of URLs for licenses. We are currently adding variants to the license.csv file, a custom file that maps license URLs to InvenioRDM license names. It is almost certainly incomplete.
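To illustrate the kind of lookup this mapping supports, here is a rough sketch. It is not the code used by the harvester: the column names `url` and `license`, the sample rows, and the normalization rules are all assumptions made for the example, and the real license.csv may be laid out differently.

```python
import csv
import io

# Illustrative rows only; the real license.csv may use other columns and values.
SAMPLE_CSV = """url,license
https://creativecommons.org/licenses/by/4.0/,cc-by-4.0
http://creativecommons.org/licenses/by-nc/4.0,cc-by-nc-4.0
"""


def normalize_url(url):
    """Reduce common variants (scheme, www., case, trailing slash) to one key."""
    key = url.strip().lower()
    for prefix in ("https://", "http://"):
        if key.startswith(prefix):
            key = key[len(prefix):]
            break
    if key.startswith("www."):
        key = key[4:]
    return key.rstrip("/")


def load_license_map(handle):
    """Build a {normalized url: license name} dict from a CSV file handle."""
    return {
        normalize_url(row["url"]): row["license"]
        for row in csv.DictReader(handle)
    }


def lookup_license(url, license_map):
    """Return the license name for a URL, or None if it is not yet mapped."""
    return license_map.get(normalize_url(url))


license_map = load_license_map(io.StringIO(SAMPLE_CSV))
```

Normalizing before lookup means one row can cover several spellings of the same URL, but publisher-specific license pages would still each need their own entry. A `None` result marks a URL that is a candidate for adding to the file.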
Open an issue in the issue tab.
Pull requests are appreciated.
Software produced by the Caltech Library is Copyright © 2022 California Institute of Technology. This software is freely distributed under a BSD-style license. Please see the LICENSE file for more information.
GitHub action created by Tom Morrell. Robert Doiel and Tom Morrell wrote the source irdmtools package.
This work was funded by the California Institute of Technology Library.