Closed by briri 5 months ago
We should make a pass to look up the DMP ID itself in the DataCite system and check whether it has relatedIdentifiers. There is no need for an admin to verify those connections.
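A minimal sketch of that pass, assuming the DataCite REST API's `GET /dois/{doi}` endpoint and its JSON:API response shape (`data.attributes.relatedIdentifiers`). The function names are hypothetical and not taken from the harvester code:

```python
import json
import urllib.parse
import urllib.request

DATACITE_API = "https://api.datacite.org"


def extract_related_identifiers(payload):
    """Return the relatedIdentifiers list from a DataCite DOI response, or []."""
    attributes = (payload.get("data") or {}).get("attributes") or {}
    return attributes.get("relatedIdentifiers") or []


def fetch_related_identifiers(dmp_doi):
    """Look up the DMP ID in DataCite and return its related identifiers."""
    # Keep the slash between DOI prefix and suffix, escape everything else.
    url = f"{DATACITE_API}/dois/{urllib.parse.quote(dmp_doi, safe='/')}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return extract_related_identifiers(json.load(response))
```

Keeping the parsing separate from the HTTP call means the extraction logic can be checked against saved responses without hitting the API.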
Sent an email to DataCite support: their pagination cursor is not working. The API is able to provide a start and end cursor as well as the total count of works, but it errors when a request includes a cursor (the after argument in the query below). For now I have updated it to pull the first 500 works.
query affiliationQuery {
  organization(id: "https://ror.org/01an7q238") {
    id
    name
    alternateName
    works(query: "created: [2023-10-01 TO 2023-11-01]", first: 635, after: "MTY5NjE5NzQ0NDAwMCwxMC43OTIyL2cyMW43emdw") {
      totalCount
      pageInfo {
        startCursor
        endCursor
        hasNextPage
      }
      edges {
        cursor
      }
      nodes {
        id
        doi
        type
      }
    }
  }
}
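The workaround described above (follow the cursor while it works, otherwise keep what was already fetched) could be sketched roughly as follows. `fetch_page` is a hypothetical callable that runs the GraphQL query with the given `first` and `after` values and returns the `works` object; it is an assumption for illustration, not part of the existing harvester:

```python
def collect_works(fetch_page, page_size=500):
    """Gather work nodes page by page, stopping early if a cursor request fails.

    fetch_page(first, after) is a hypothetical callable that runs the GraphQL
    query and returns the `works` object (nodes, pageInfo, totalCount).
    """
    nodes = []
    cursor = None
    while True:
        try:
            works = fetch_page(page_size, cursor)
        except Exception:
            if cursor is None:
                raise  # even the first page failed, nothing to fall back on
            break  # cursor request failed: keep what earlier pages returned
        nodes.extend(works.get("nodes") or [])
        page_info = works.get("pageInfo") or {}
        cursor = page_info.get("endCursor")
        # Stop when there is no further page or no cursor to continue from.
        if not page_info.get("hasNextPage") or cursor is None:
            break
    return nodes
```

With the cursor currently erroring, this degrades to returning only the first page, which matches the "first 500 works" behaviour, and it would pick up the remaining pages once the cursor is fixed.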
Had to switch to the DataCite REST API. We are working with older data, so we do not have useful entry points in the GraphQL API (e.g. we have a ROR, but the records we are after do not).
We found some success with the REST API. It found related works for about half of the uploaded DMPs. The solution is problematic though and we will need to keep working at it to find a balance.
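For illustration, a name-based search against the REST API might be built like this. It assumes the `/dois` endpoint accepts a `query` parameter in query-string syntax and that `creators.name` is a searchable field; the helper name and the "Family, Given" name format are assumptions, not the harvester's actual code:

```python
import urllib.parse

DATACITE_DOIS = "https://api.datacite.org/dois"


def build_pi_search_url(pi_name, page_size=100):
    """Build a DataCite REST search URL for works by a single creator name."""
    # Escape backslashes and quotes so the name stays one quoted phrase.
    escaped = pi_name.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "query": f'creators.name:"{escaped}"',  # assumed field name
        "page[size]": page_size,
    }
    return DATACITE_DOIS + "?" + urllib.parse.urlencode(params)
```

For example, `build_pi_search_url("Smith, Jane")` produces a URL that searches on the quoted phrase only, which is exactly why a common name would return far more works than belong to one DMP.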
What's wrong?
This all means that we are stuck with matching on a single PI name. It worked for this first round of testing but what about when we have a "Jane Smith"? We will end up with too many matches.
Need to update the old DataCite harvester code so that it: