caltechlibrary / cold

Controlled Object Lists Daemon

Support journal, publisher, and ISSN vocabulary #30

Open tmorrell opened 1 year ago

tmorrell commented 1 year ago

We need to store a vocabulary of journal names, publishers, and ISSNs. The current automation for getting this data into InvenioRDM is https://github.com/caltechlibrary/irdmtools/blob/main/processjournals.py. The raw source is https://github.com/caltechlibrary/ames/blob/main/journal-names.tsv, and the cleaned vocabulary it produces is https://github.com/caltechlibrary/caltechauthors/blob/main/app_data/vocabularies/caltech_journals.jsonl

We need to be able to clean up journals that have multiple names or multiple publishers. COLD should have some way of marking records as incomplete, so that users know which records need manual intervention and so that incomplete records don't make their way to InvenioRDM.
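To make the "incomplete" idea concrete, here is a rough Python sketch, not a proposal for COLD's actual schema. The `JournalRecord` class, its field names, and the `exportable` helper are all hypothetical. It assumes a record is complete only when it has a valid ISSN, exactly one name, and exactly one publisher; the ISSN check uses the standard modulus 11 check digit.

```python
import re
from dataclasses import dataclass, field

# An ISSN is written as NNNN-NNNC, where C is a check digit (0-9 or X).
ISSN_PATTERN = re.compile(r"^\d{4}-\d{3}[\dX]$")


def issn_is_valid(issn: str) -> bool:
    """Check the ISSN format and its modulus 11 check digit."""
    if not ISSN_PATTERN.match(issn):
        return False
    chars = issn.replace("-", "")
    # The first seven digits are weighted 8, 7, 6, 5, 4, 3, 2.
    total = sum(int(c) * w for c, w in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected


@dataclass
class JournalRecord:
    """Hypothetical record shape; the field names are assumptions."""
    issn: str
    names: list[str] = field(default_factory=list)
    publishers: list[str] = field(default_factory=list)

    @property
    def incomplete(self) -> bool:
        # Assumed rule: anything other than one valid ISSN, one distinct
        # name, and one distinct publisher needs manual intervention.
        return not (
            issn_is_valid(self.issn)
            and len(set(self.names)) == 1
            and len(set(self.publishers)) == 1
        )


def exportable(records: list[JournalRecord]) -> list[JournalRecord]:
    """Keep only the records that would be allowed into InvenioRDM."""
    return [r for r in records if not r.incomplete]
```

With this rule, a journal that shows up under two names is flagged as incomplete and filtered out of the export until someone picks the preferred name.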

When a new journal and ISSN are added to CaltechAUTHORS, they should be added to COLD as an incomplete record, and library users should receive an email.
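As a possible shape for that workflow, the sketch below uses a plain dict keyed by ISSN as a stand-in for whatever storage COLD ends up using. The function names, record fields, and addresses are hypothetical. It only builds the notification with the standard library's `email.message.EmailMessage`; actually sending it (for example via `smtplib`) is left out.

```python
from email.message import EmailMessage


def add_incomplete_journal(store: dict, issn: str, name: str) -> bool:
    """Add a newly seen journal as an incomplete record.

    Returns True if a record was added, False if the ISSN was already known.
    The dict store and the record fields are placeholders, not COLD's schema.
    """
    if issn in store:
        return False
    store[issn] = {
        "issn": issn,
        "name": name,
        "publisher": None,   # unknown until someone reviews the record
        "incomplete": True,  # keeps the record out of InvenioRDM for now
    }
    return True


def build_notification(issn: str, name: str,
                       recipients: list[str], sender: str) -> EmailMessage:
    """Build (but do not send) the email to library users."""
    msg = EmailMessage()
    msg["Subject"] = f"New incomplete journal record in COLD: {name} ({issn})"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"A new journal was added to CaltechAUTHORS.\n\n"
        f"Name: {name}\nISSN: {issn}\n\n"
        f"The record is marked incomplete and needs manual review."
    )
    return msg
```

Returning a boolean from `add_incomplete_journal` lets the caller send the email only when a record was really created, so repeat deposits for a known ISSN would not trigger repeat notifications.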

rsdoiel commented 1 year ago

I looked at where I left off with cold on Friday. It is time to re-implement my original prototype. Newt, Postgres, PostgREST and Pandoc will let me define rich data models so this should be no problem moving forward.