Open jb44 opened 6 years ago
Hey @jb44, glad you've found this repo useful, and thanks for noting the discrepancy. Sorry for the slow response, I was hiking in the High Sierras.
We did not do any processing of the keyword field. In fact, I was not even aware there was a keyword field. So my guess is that something changed at Crossref and resulted in fewer articles having keywords. Perhaps @gbilder or someone else from @crossref may know.
Many thanks for this work - we used it for a time before Crossref started offering their own data dumps.
We did notice a small discrepancy in the keywords between this dump and the Crossref dump. The dump here appears to have much better keyword coverage for papers (over 90%), while the current Crossref dump has less than 5% coverage.
Where could I look to understand the methodology you used to gather keywords for this dump? We may need to use that methodology to augment what the Crossref dump offers.
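For anyone wanting to reproduce a coverage figure like this, here is a rough sketch of how it could be measured. It assumes the dump is available as JSON lines (one work record per line) and that keywords sit in a top-level field named `keyword`. Both the layout and the field name are assumptions for illustration, and `keyword_coverage` is a hypothetical helper, not something from either dump's tooling.

```python
import json
from typing import Iterable


def keyword_coverage(lines: Iterable[str], field: str = "keyword") -> float:
    """Return the fraction of records whose `field` is present and non-empty.

    `lines` is any iterable of JSON strings, one record per line.
    The default field name "keyword" is an assumption; adjust it to
    match the actual dump schema.
    """
    total = 0
    with_keywords = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        total += 1
        # Missing, null, empty-list, and empty-string values all count as "no keywords".
        if record.get(field):
            with_keywords += 1
    return with_keywords / total if total else 0.0
```

Running the same function over both dumps (for example by passing an open file handle for each) would give directly comparable numbers, provided the same field name applies to both.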
Thanks