inspirehep / hepcrawl

Scrapy project for feeds into INSPIRE-HEP
http://inspirehep.net

CrossRef spider #105

Open kaplun opened 7 years ago

kaplun commented 7 years ago

All publishers are pushing bibliographic data to CrossRef, and this data is freely available through the CrossRef API: https://github.com/CrossRef/rest-api-doc/blob/master/rest_api.md

We should write a spider that transforms the CrossRef JSON format into our format. The JSON format is documented here: https://github.com/CrossRef/rest-api-doc/blob/master/api_format.md

Additionally, we should import ORCIDs and references when they are available.
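
As a rough sketch of what the transformation could look like (not a hepcrawl implementation): the function below maps one CrossRef "work" JSON object to a plain dict. The input keys (`DOI`, `title`, `author`, `ORCID`, `reference`, `unstructured`) follow the CrossRef JSON format as I understand it; the function name `crossref_to_record` and the output keys are hypothetical and only stand in for our real record format.

```python
def crossref_to_record(work):
    """Map one CrossRef 'work' JSON object to a simplified record dict.

    The output layout is an illustrative assumption, not our actual schema.
    """
    authors = []
    for author in work.get("author", []):
        # Build "Family, Given", skipping whichever part is missing.
        full_name = ", ".join(
            part for part in (author.get("family"), author.get("given")) if part
        )
        # ORCIDs usually arrive as URLs; keep only the bare identifier.
        orcid = author.get("ORCID", "").rsplit("/", 1)[-1] or None
        authors.append({"full_name": full_name, "orcid": orcid})

    # References are only present when the publisher deposited them.
    references = [
        {"doi": ref.get("DOI"), "raw": ref.get("unstructured")}
        for ref in work.get("reference", [])
    ]

    # CrossRef delivers the title as a list of strings.
    titles = work.get("title") or [None]

    return {
        "doi": work.get("DOI"),
        "title": titles[0],
        "authors": authors,
        "references": references,
    }
```

Missing ORCIDs and references simply come out as `None` and an empty list, so the same function works whether or not the publisher deposited them. Authors that are given only as an organization name are not handled here.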

kaplun commented 7 years ago

cc @michamos, @StellaCh, @salmele

kaplun commented 7 years ago

Actually, the most complete format is the XML one, which is more reliable than the JSON one: https://support.crossref.org/hc/en-us/articles/213531266-UNIXSD-query-output-format

salmele commented 6 years ago

Meanwhile, there is now an API Plus service at CrossRef:

https://www.crossref.org/services/metadata-delivery/plus-service/

We should analyze this and discuss offline whether it is of interest.