pulibrary / dspace-osti

Preparing PPPL dataset metadata for ingestion by OSTI

Fix Poster failure with withdrawn records #51

Closed astrochun closed 2 years ago

astrochun commented 2 years ago

Closes #49

This PR does two things:

  1. It adds a new method to Scraper, update_form_input, that updates or creates form_input.tsv. First, it removes DataSpace records that have been withdrawn, which is the main cause of #49. Second, it adds new entries. Handling both kinds of record update removes the need to edit the TSV file manually, particularly for new entries.
  2. For new records, it automatically fills in the DOE metadata (sponsoring organizations, funding, and datatype) with default settings. This ensures that Poster runs successfully.
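
For illustration, the two steps above could be sketched roughly as follows. This is not the code in this PR: the column names (`handle`, `sponsoring_organizations`, `funding`, `datatype`), the placeholder default values, and the helper names `sketch_update_form_input` and `to_tsv` are all assumptions made for the example, and the real `Scraper.update_form_input` may be organized differently.

```python
import csv
import io

# Hypothetical column names and placeholder defaults (assumptions, not the
# actual layout of form_input.tsv or the defaults chosen in this PR).
ID_COLUMN = "handle"
DOE_DEFAULTS = {
    "sponsoring_organizations": "<default sponsor>",
    "funding": "<default funding>",
    "datatype": "<default datatype>",
}
FIELDNAMES = [ID_COLUMN, *DOE_DEFAULTS]


def sketch_update_form_input(existing_rows, active_handles):
    """Drop rows for withdrawn records and add rows for new ones.

    existing_rows: list of dicts already present in the TSV.
    active_handles: identifiers of records currently available in DataSpace.
    """
    # De-duplicate the scraped identifiers while keeping their order.
    active = list(dict.fromkeys(active_handles))
    active_set = set(active)

    # Step 1: keep only rows whose record has not been withdrawn.
    kept = [row for row in existing_rows if row[ID_COLUMN] in active_set]

    # Step 2: add new records, filling DOE metadata with default settings.
    known = {row[ID_COLUMN] for row in kept}
    added = [{ID_COLUMN: h, **DOE_DEFAULTS} for h in active if h not in known]
    return kept + added


def to_tsv(rows):
    """Serialize rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDNAMES, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In this sketch, existing rows keep whatever metadata they already carry, so manual edits to the TSV survive an update; only new rows receive the defaults.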

Note that this is a step toward letting CI scrape these data and update records, minimizing the need for manual involvement.