CDLUC3 / dmsp_aws_prototype

Sceptre CloudFormation templates for DMPHub v2
MIT License

Build scheduled Lambda function to search NIH/DOE for related works #60

Open briri opened 9 months ago

briri commented 9 months ago

Need to build a fairly generic Lambda (aka ApiScheduler) that can be used to schedule runs for our different API searches.

Consider having this Lambda fetch the DMP IDs that are scan candidates.

Then have this Lambda use EventBridge messaging to kick off scans asynchronously in a 2nd Lambda function (aka ApiScanner). This one will perform the following:

All of the above changes should be placed into the dmphub_modifications section of the record!
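
A minimal sketch of the ApiScheduler hand-off described above, assuming Python and a boto3 EventBridge client passed in as `client`. The source name, detail type, `dmp_id` payload key and bus name are hypothetical placeholders, not names taken from this repo. It batches because EventBridge `put_events` accepts at most 10 entries per call.

```python
import json
from typing import Any, Iterable

# Hypothetical placeholders: the real source and detail-type names are not given in this issue.
EVENT_SOURCE = "dmphub.api_scheduler"
DETAIL_TYPE = "ApiScan"
MAX_ENTRIES_PER_CALL = 10  # EventBridge PutEvents limit per request


def build_entries(dmp_ids: Iterable[str], bus_name: str) -> list[dict[str, Any]]:
    """Build one EventBridge entry per DMP ID that is a scan candidate."""
    return [
        {
            "Source": EVENT_SOURCE,
            "DetailType": DETAIL_TYPE,
            "Detail": json.dumps({"dmp_id": dmp_id}),  # assumed payload shape
            "EventBusName": bus_name,
        }
        for dmp_id in dmp_ids
    ]


def publish_scans(client: Any, dmp_ids: Iterable[str], bus_name: str) -> int:
    """Send the entries in batches of 10 and return how many entries failed."""
    entries = build_entries(dmp_ids, bus_name)
    failed = 0
    for start in range(0, len(entries), MAX_ENTRIES_PER_CALL):
        batch = entries[start:start + MAX_ENTRIES_PER_CALL]
        response = client.put_events(Entries=batch)
        failed += response.get("FailedEntryCount", 0)
    return failed
```

In the scheduled Lambda, `client` would presumably be `boto3.client("events")`, and an EventBridge rule matching the source and detail type would trigger the ApiScanner.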

briri commented 9 months ago

Update the dmphub_modifications array so that it includes a confidence score as well as a note about how the match was determined (e.g. 'matched on grant_id and PI names')
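
To make that concrete, here is a rough sketch of what appending such an entry could look like. Only `dmphub_modifications` comes from this issue; the field names (`provenance`, `confidence`, `note`, `timestamp`) are assumptions for illustration and may not match the actual record schema.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any

CONFIDENCE_LEVELS = {"Auto", "High", "Med", "Low"}


@dataclass
class Modification:
    provenance: str  # hypothetical: which external system produced the match
    confidence: str  # one of CONFIDENCE_LEVELS
    note: str        # how the match was determined
    timestamp: str   # ISO 8601, UTC


def add_modification(record: dict[str, Any], provenance: str,
                     confidence: str, note: str) -> dict[str, Any]:
    """Append a modification entry to the record's dmphub_modifications array."""
    if confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"unknown confidence level: {confidence}")
    entry = Modification(
        provenance=provenance,
        confidence=confidence,
        note=note,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    record.setdefault("dmphub_modifications", []).append(asdict(entry))
    return record
```

For example, `add_modification(record, "openalex", "High", "matched on grant_id and PI names")` would add one entry carrying both the score and the explanation.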

@mariapraetzellis let me know what you think about the following

Here is preliminary logic for generating a confidence score:

Confidence levels:

Related Works (e.g. detected by DataCite EventData, OpenAlex, etc.):

Confidence         Match type
--------------------------------------------------------------------------------------------------
Auto               Grant ID match (in a scenario where the external system had record of the grant ID)
Auto               DMP ID match (in a scenario where the external system had record of the DMP ID)

High               1+ PI ORCID matches, funders match and title/abstract keywords match
High               1+ PI ROR matches, funders match and title/abstract keywords match
High               Funders match, repository match and title/abstract keywords match
High               1+ PI names match, repository match, output type match and title/abstract keywords match

Med                1+ PI names match and title/abstract keywords match
Med                Funders match and title/abstract keywords match

Low                1+ PI names match and title/abstract keywords match
Low                Funders match, repository match and title/abstract keywords match
Low                Funders match and title/abstract keywords match
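
One way the Related Works table could be turned into code is a rule list evaluated top to bottom, where the first rule whose requirements are all present wins. This is only a sketch: the signal names are made up for illustration, and the Low rows are left out because each one repeats the requirements of a High or Med row, so with first-match-wins they could never be reached as written.

```python
from typing import Optional

# Each rule: (confidence, set of signals that must all be present).
# Signal names are hypothetical labels for the match types in the table.
RELATED_WORK_RULES = [
    ("Auto", {"grant_id"}),
    ("Auto", {"dmp_id"}),
    ("High", {"pi_orcid", "funder", "keywords"}),
    ("High", {"pi_ror", "funder", "keywords"}),
    ("High", {"funder", "repository", "keywords"}),
    ("High", {"pi_name", "repository", "output_type", "keywords"}),
    ("Med", {"pi_name", "keywords"}),
    ("Med", {"funder", "keywords"}),
]


def related_work_confidence(signals: set[str]) -> Optional[tuple[str, str]]:
    """Return (confidence, note) for the first rule satisfied, or None."""
    for level, required in RELATED_WORK_RULES:
        if required <= signals:  # every required signal was detected
            note = "matched on " + ", ".join(sorted(required))
            return level, note
    return None
```

The returned note doubles as the "how the match was determined" text for the dmphub_modifications entry.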

Grant Information (e.g. detected by NIH Awards API, etc.):

Confidence         Match type
--------------------------------------------------------------------------------------------------
High               Funder match, opportunity number match and 1+ PI names match
High               Funder match, title exact match, 1+ PI names match
High               Funder match, 1+ PI names match, project start/end match and title/abstract keywords match

Med                Funder match, 1+ PI names match and title/abstract keywords match
Med                Funder match, 1+ PI names match and project start/end match
Med                Funder match, opportunity number match and title/abstract keywords match

Low                Funder match, 1+ PI names match and project start/end match
Low                Funder match, project start/end match and title/abstract keywords match
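
For the Grant Information table, a sketch of how the individual match signals might be derived before scoring. All dictionary keys (`funder_id`, `opportunity_number`, `title`, `pi_names`, `start`, `end`) are hypothetical, the name and title comparison is a naive case and whitespace normalisation, and keyword matching is passed in as a flag because this issue does not say how it would be computed. The Low row that repeats a Med row is omitted since first-match-wins would never reach it.

```python
from typing import Any, Optional

GRANT_RULES = [
    ("High", {"funder", "opportunity", "pi_name"}),
    ("High", {"funder", "title_exact", "pi_name"}),
    ("High", {"funder", "pi_name", "dates", "keywords"}),
    ("Med", {"funder", "pi_name", "keywords"}),
    ("Med", {"funder", "pi_name", "dates"}),
    ("Med", {"funder", "opportunity", "keywords"}),
    ("Low", {"funder", "dates", "keywords"}),
]


def _norm(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def _same(dmp: dict[str, Any], award: dict[str, Any], key: str) -> bool:
    """True when both sides have a non-empty value for key and the values are equal."""
    return bool(dmp.get(key)) and dmp.get(key) == award.get(key)


def grant_signals(dmp: dict[str, Any], award: dict[str, Any],
                  keywords_match: bool = False) -> set[str]:
    """Compare a DMP with an award record and collect the signals that matched."""
    signals = set()
    if _same(dmp, award, "funder_id"):
        signals.add("funder")
    if _same(dmp, award, "opportunity_number"):
        signals.add("opportunity")
    title = _norm(dmp.get("title", ""))
    if title and title == _norm(award.get("title", "")):
        signals.add("title_exact")
    dmp_pis = {_norm(n) for n in dmp.get("pi_names", [])}
    award_pis = {_norm(n) for n in award.get("pi_names", [])}
    if dmp_pis & award_pis:  # 1+ PI names in common
        signals.add("pi_name")
    if _same(dmp, award, "start") and _same(dmp, award, "end"):
        signals.add("dates")
    if keywords_match:
        signals.add("keywords")
    return signals


def grant_confidence(signals: set[str]) -> Optional[str]:
    """Return the confidence of the first rule satisfied, or None."""
    for level, required in GRANT_RULES:
        if required <= signals:
            return level
    return None
```

Real PI name matching would likely need something more forgiving than exact comparison (initials, name order), so treat `pi_name` here as a stand-in.
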

briri commented 6 months ago

The API functionality is there and is being used by the new React UI. We just need to build a schedulable Lambda that will call those APIs to search for info. Will do that once we've identified a scanning pattern that reduces the burden on the external APIs (see issues #66 and #77).