soilwise-he / harvesters

MIT License

Fetch metadata for DOIs which are not in DataCite #8

Open pvgenuchten opened 3 months ago

pvgenuchten commented 3 months ago

CORDIS can return a large number of DOIs, which are project deliverables: mostly reports and scientific articles, plus some datasets. For those deliverables that have a DOI, it is usually possible to fetch metadata for that DOI from the DataCite API. While testing the combination of CORDIS and DataCite, I noticed that many DOIs are not available in DataCite. These are mostly DOIs from scientific publishers, such as Elsevier, Springer and Nature.
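A minimal sketch of the DataCite lookup described above, assuming the public DataCite REST endpoint `https://api.datacite.org/dois/{doi}` and that a DOI unknown to DataCite is answered with HTTP 404. The function names and the injectable `opener` argument are hypothetical, not taken from the harvester code.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

DATACITE_API = "https://api.datacite.org/dois/"  # public DataCite REST endpoint


def datacite_url(doi: str) -> str:
    """Build the DataCite lookup URL for a DOI (the slash in the DOI is kept)."""
    return DATACITE_API + urllib.parse.quote(doi)


def fetch_datacite(doi: str, opener=urllib.request.urlopen):
    """Return the DataCite 'attributes' dict, or None if DataCite does not know the DOI."""
    try:
        with opener(datacite_url(doi), timeout=30) as response:
            return json.load(response)["data"]["attributes"]
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumed to mean: DOI registered with another agency
            return None
        raise
```

A `None` result is the signal to try another source for that DOI.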

Which made me wonder: what is the best way to retrieve metadata for these DOIs?

Maybe OpenAIRE is the answer here: query the OpenAIRE API for the specific DOI? Alternatively, a library service such as Scopus?

It is interesting to see that most commercial library providers implement Google Tag Manager, which exposes the most elaborate metadata via window.dataLayer.

Some questions:

pvgenuchten commented 3 months ago

OpenAIRE does not seem to include some of the articles returned by CORDIS:

But others, which are not in DataCite, are actually in OpenAIRE.

An advantage of the OpenAIRE API is that you can query a list of DOIs in a single request.
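A rough sketch of batching DOIs into OpenAIRE requests. It assumes the OpenAIRE search endpoint `https://api.openaire.eu/search/researchProducts` with a comma-separated `doi` parameter and `format=json`. The batch size of 20 is an arbitrary choice to keep URLs short, and the helper names are hypothetical. Parsing the response is left out because its structure is not described in this thread.

```python
import urllib.parse

# Assumed endpoint of the OpenAIRE search API
OPENAIRE_SEARCH = "https://api.openaire.eu/search/researchProducts"


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def openaire_urls(dois, batch_size=20):
    """Return one request URL per batch, each carrying several DOIs."""
    urls = []
    for batch in chunked(list(dois), batch_size):
        query = urllib.parse.urlencode({"doi": ",".join(batch), "format": "json"})
        urls.append(f"{OPENAIRE_SEARCH}?{query}")
    return urls
```

For example, `openaire_urls(["10.1002/hyp.11434", "10.1016/j.catena.2020.104511"])` yields a single URL covering both DOIs.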

pvgenuchten commented 1 month ago

If a resource is not in OpenAIRE, minimal metadata can be retrieved from the DOI registry itself:

Send a request to the DOI URL using the Accept header ''
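A sketch of such a content-negotiation request. The header value is missing from the comment above, so the media type used here is an assumption: `application/vnd.citationstyles.csl+json` (CSL JSON) is one of the types that doi.org content negotiation supports. The function names and the `opener` argument are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the original comment does not state the header value.
# CSL JSON is one media type supported by doi.org content negotiation.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_doi_request(doi: str, accept: str = CSL_JSON) -> urllib.request.Request:
    """Build a request to the DOI resolver that asks for metadata, not the landing page."""
    url = "https://doi.org/" + urllib.parse.quote(doi)
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_minimal_metadata(doi: str, opener=urllib.request.urlopen) -> dict:
    """Follow the resolver's redirect and decode the JSON metadata it returns."""
    with opener(build_doi_request(doi), timeout=30) as response:
        return json.load(response)
```

Passing a different `accept` value lets the same request ask for another representation, if the registration agency offers one.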

pvgenuchten commented 1 month ago

I found another initiative: crossref.org lists many more resources than DataCite. For example, the failed URLs above actually work fine on Crossref:

https://api.crossref.org/works/10.1002/hyp.11434
https://api.crossref.org/works/10.1016/j.catena.2020.104511

Crossref actually provides a full listing of citations... very interesting for understanding how resources link together.
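To illustrate the citation listing, here is a small sketch that pulls the cited DOIs out of a Crossref record. It assumes the record is the `message` object of a `/works/{doi}` response and that references sit in its `reference` list, where only some entries carry a `DOI` field. The helper names are hypothetical.

```python
import urllib.parse

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the Crossref lookup URL for a DOI."""
    return CROSSREF_WORKS + urllib.parse.quote(doi)


def cited_dois(message: dict) -> list:
    """Return the DOIs cited by a work, skipping references that have no DOI.

    DOIs are case-insensitive, so they are lower-cased to ease later matching.
    """
    return [ref["DOI"].lower() for ref in message.get("reference", []) if "DOI" in ref]
```

Matching these cited DOIs against the DOIs harvested from CORDIS would show which deliverables cite each other.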

pvgenuchten commented 2 weeks ago

DOI metadata is imported from OpenAIRE into a Postgres database for every DOI extracted from CORDIS. Some DOI records do not resolve in OpenAIRE.
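One possible way to handle the records that do not resolve is to try the sources discussed in this thread in order and keep a list of what is left over. This is only a sketch of that idea, not the harvester's actual import logic. `resolve_all` and the `(name, lookup)` pairs are hypothetical, and each `lookup` is assumed to return a metadata dict or `None`.

```python
def resolve_all(dois, sources):
    """Try each source in order for every DOI.

    `sources` is an ordered list of (name, lookup) pairs, where lookup(doi)
    returns a metadata dict or None. Returns (resolved, unresolved).
    """
    resolved, unresolved = {}, []
    for doi in dois:
        for name, lookup in sources:
            record = lookup(doi)
            if record is not None:
                # Remember which source answered, for later inspection
                resolved[doi] = {"source": name, "metadata": record}
                break
        else:
            # No source knew this DOI
            unresolved.append(doi)
    return resolved, unresolved
```

The `unresolved` list can then be logged or retried later, so that DOIs missing from OpenAIRE are not silently dropped.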