greenelab / crossref

Download metadata for all DOIs using the Crossref API
https://doi.org/b48h
Creative Commons Zero v1.0 Universal

Question on keyword coverage #10

Open jb44 opened 6 years ago

jb44 commented 6 years ago

Many thanks for this work. We used it for a time before Crossref started offering its own data dumps.

We did notice a discrepancy in the keywords between this dump and the Crossref dump. The dump here appears to have much better keyword coverage for papers (over 90%), while the current Crossref dump has less than 5% coverage.
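For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of how a coverage figure like those quoted above could be computed. It assumes each dump can be read as an iterable of per-work dictionaries and that keywords live under a field named `keyword`; that field name, the `keyword_coverage` helper, and the sample records are all hypothetical and may not match the actual schema of either dump.

```python
from typing import Any, Iterable, Mapping


def keyword_coverage(records: Iterable[Mapping[str, Any]], field: str = "keyword") -> float:
    """Return the fraction of records with a non-empty value under `field`.

    `field` is an assumed name; pass whatever key the dump actually uses.
    """
    total = 0
    with_keywords = 0
    for record in records:
        total += 1
        value = record.get(field)
        # Treat whitespace-only strings the same as missing values.
        if isinstance(value, str):
            value = value.strip()
        # Empty lists, empty strings, and None all count as "no keywords".
        if value:
            with_keywords += 1
    return with_keywords / total if total else 0.0


# Hypothetical sample records, for illustration only.
sample = [
    {"DOI": "10.1000/a", "keyword": ["genomics"]},
    {"DOI": "10.1000/b"},
    {"DOI": "10.1000/c", "keyword": []},
    {"DOI": "10.1000/d", "keyword": "  "},
]
print(f"coverage: {keyword_coverage(sample):.0%}")
```

Running the same function over both dumps with the same definition of "non-empty" would make the two percentages directly comparable.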

Where could I look to understand the methodology you used to gather keywords for this dump? We may need to use that methodology to augment what the crossref dump offers.

Thanks

dhimmel commented 6 years ago

Hey @jb44, glad you've found this repo useful, and thanks for noting the discrepancy. Sorry for the slow response; I was hiking in the High Sierras.

We did not do any processing of the keyword field. In fact, I was not even aware there was a keyword field. So I am guessing something at Crossref changed and resulted in fewer articles having keywords? Perhaps @gbilder or someone else from @crossref knows.