sul-dlss-deprecated / rialto

RIALTO - Stanford Libraries' Research Intelligence System
https://library.stanford.edu/projects/rialto

SeRA grant data doesn't match the MAP. #305

Closed jcoyne closed 6 years ago

jcoyne commented 6 years ago

We have 21,847 grants imported into Neptune, but only 30 of them have Principal Investigators. The data model says a grant will have at least one PI: https://github.com/sul-dlss/rialto/wiki/RIALTO-Data-Models-&-Profiles#grants

Should we change the data model (and rialto-derivatives) to not require at least one PI?

cc @cmh2166 @peetucket

jcoyne commented 6 years ago

It looks like this commit https://github.com/sul-dlss-labs/rialto-entity-resolver/commit/52923f67a98abaa5120fd788601c312ad6d04938 didn't get deployed until after we transformed grants.

peetucket commented 6 years ago

So does this mean they should resolve? I'm looking at the spreadsheet of SeRA data I was given last year (presumably similar to the data coming from the API), and I'm seeing pi_employee_id and pi_sunet_id in nearly every row.
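A rough sketch of how that spreadsheet check could be scripted, assuming the SeRA data is exported as CSV with pi_employee_id and pi_sunet_id columns. The grant_id column, the sample rows, and the count_rows_with_pi helper are hypothetical and only there for illustration; the real export may be shaped differently.

```python
import csv
import io


def count_rows_with_pi(rows, pi_fields=("pi_employee_id", "pi_sunet_id")):
    """Return (rows_with_pi, total_rows) for an iterable of dict rows.

    A row counts as having a PI if at least one of the PI identifier
    fields is present and non-blank. The field names come from the
    spreadsheet mentioned above; everything else is an assumption.
    """
    total = 0
    with_pi = 0
    for row in rows:
        total += 1
        if any((row.get(field) or "").strip() for field in pi_fields):
            with_pi += 1
    return with_pi, total


# Made-up sample data, NOT real SeRA records.
sample = io.StringIO(
    "grant_id,pi_employee_id,pi_sunet_id\n"
    "G1,001,asmith\n"
    "G2,,\n"
    "G3,,bjones\n"
)

with_pi, total = count_rows_with_pi(csv.DictReader(sample))
print(f"{with_pi} of {total} rows have a PI identifier")
```

If the share of rows with a PI identifier in the source data is high while only a handful of grants in Neptune have PIs, that would point at the resolution or loading step rather than at the source data.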

jcoyne commented 6 years ago

Yeah, I think the problem here is with the entity resolver and ETL loading. I'll open a ticket to that effect.