CDLUC3 / dmsp_aws_prototype

Sceptre CloudFormation templates for DMPHub v2
MIT License

Update Citation Lambda to call DataCite/Crossref #66

Open briri opened 9 months ago

briri commented 9 months ago

Update the Citation Lambda so that it first contacts the DataCite and Crossref APIs in an attempt to fetch metadata. If that is unsuccessful, parse the BibTeX citation XML.
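The lookup order described above could be sketched roughly as follows. This is only an illustration, not the Lambda's actual code: `resolve_citation`, the `fetchers` list, and `parse_bibtex` are hypothetical names, and the real DataCite/Crossref calls and the BibTeX parser are passed in as plain callables so that no API details are assumed here.

```python
from typing import Callable, Dict, Iterable, Optional

Metadata = Dict[str, object]
# Hypothetical shape: a fetcher takes a DOI and returns metadata, or None if nothing was found.
Fetcher = Callable[[str], Optional[Metadata]]


def resolve_citation(
    doi: str,
    bibtex: str,
    fetchers: Iterable[Fetcher],
    parse_bibtex: Callable[[str], Metadata],
) -> Metadata:
    """Try each API fetcher in order; fall back to parsing the BibTeX citation."""
    for fetch in fetchers:
        try:
            metadata = fetch(doi)
        except Exception:
            # Assumption: a failed or timed-out API call counts as "no result".
            metadata = None
        if metadata:
            return metadata
    # No API returned metadata, so use the citation text we already have.
    return parse_bibtex(bibtex)
```

Passing the fetchers as a list (for example `[fetch_from_datacite, fetch_from_crossref]`, both hypothetical) keeps the order explicit and makes the fallback easy to test with stubs.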

We want to return:

briri commented 6 months ago

Built out a Lambda that calls DataCite and processes the results to find related works for a given DMP. The overall workflow is flawed and inefficient though.

For each relevant DMP we query DataCite by funder and year (for example, NIH for 2022). This approach results in too many calls to DataCite, which is slow to respond.

One interesting observation that will impact our final implementation: the GraphQL API would time out on the DataCite end if we included too many ORCIDs in the query.
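One way to work around the timeout would be to split the ORCIDs into smaller batches and issue one query per batch. A minimal sketch, assuming a hypothetical helper `chunk_orcids`; the default batch size of 25 is an arbitrary placeholder, since the threshold at which DataCite times out has not been measured here.

```python
from typing import Iterator, List, Sequence


def chunk_orcids(orcids: Sequence[str], batch_size: int = 25) -> Iterator[List[str]]:
    """Yield successive batches of ORCIDs, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(orcids), batch_size):
        # Each batch would become the ORCID list for a single GraphQL query.
        yield list(orcids[start:start + batch_size])
```

The trade-off is more requests per DMP, so the batch size would need tuning against how slowly DataCite responds.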

It would be more efficient to follow the pattern we use for the COKI Observatory Network, which we will run once a week (see issue #77).

briri commented 6 months ago

We should update the Dynamo table records so that all augmentations sit within a record adjacent to the DMP ID record. Something like:

"PK": "DMP#doi.org/11.22222/A12B3c4D",
"SK": "AUGMENTATIONS"

"AUGMENTERS#datacite": {
  "last_checked": "(Timestamp of the last time the augmenter looked for matches for this DMP)",
  "modifications": "(The JSON array that we currently are storing on the DMP ID record in the 'dmphub_modifications' field)"
}
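As a rough illustration of the record shape above, the item could be assembled like this before being written to the table. `build_augmentation_item` is a hypothetical helper, and the ISO 8601 timestamp format is an assumption; the thread does not say how `last_checked` should be encoded.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def build_augmentation_item(
    dmp_pk: str,
    augmenter: str,
    modifications: List[Dict[str, Any]],
    checked_at: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build the AUGMENTATIONS record that sits next to the DMP ID record."""
    checked_at = checked_at or datetime.now(timezone.utc)
    return {
        "PK": dmp_pk,            # same partition key as the DMP ID record
        "SK": "AUGMENTATIONS",   # fixed sort key for the augmentation record
        f"AUGMENTERS#{augmenter}": {
            # Assumption: timestamps are stored as ISO 8601 strings in UTC.
            "last_checked": checked_at.isoformat(),
            "modifications": modifications,
        },
    }
```

For example, `build_augmentation_item("DMP#doi.org/11.22222/A12B3c4D", "datacite", [])` would produce a record with an `AUGMENTERS#datacite` attribute and an empty `modifications` array.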