amchagas / open-hardware-supply

Taking a closer look at how OSH papers are evolving over time
MIT License

keywords found in references #15

Closed amchagas closed 1 year ago

amchagas commented 1 year ago

For some of the articles returned by the Google Scholar search, the keywords appear only in the reference list. This of course increases the number of false hits and lengthens the list of articles that need to be checked.

A possible solution would be to take the current list of articles and use their URLs to get the papers themselves... although the URLs normally do not point to the PDF, but rather to the journal's page about the article. Maybe an API from something like Unpaywall could help, at least with the open access ones?
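To make the Unpaywall idea concrete, here is a minimal sketch of what such a lookup could look like. It assumes each article has a DOI (the Google Scholar results only give URLs, so the DOIs would have to be obtained some other way) and uses Unpaywall's v2 REST endpoint, which expects a contact email as a query parameter. The function names are made up for illustration and are not code from this repository.

```python
import json
import urllib.parse
import urllib.request

# Unpaywall v2 endpoint: one GET request per DOI, with a contact email.
UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi, email):
    """Build the lookup URL for one DOI (hypothetical helper)."""
    query = urllib.parse.urlencode({"email": email})
    return UNPAYWALL_BASE + urllib.parse.quote(doi) + "?" + query


def pdf_url_from_record(record):
    """Return the PDF URL of the best open-access copy, or None.

    `best_oa_location` is null for papers without an open-access copy,
    and `url_for_pdf` can be null when only a landing page is known.
    """
    location = record.get("best_oa_location") or {}
    return location.get("url_for_pdf")


def fetch_pdf_url(doi, email):
    """Query Unpaywall over the network and return a PDF URL or None."""
    with urllib.request.urlopen(unpaywall_url(doi, email), timeout=30) as response:
        return pdf_url_from_record(json.load(response))
```

Papers for which `fetch_pdf_url` returns `None` (closed access, or no direct PDF link known) would still have to be checked by hand.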

solstag commented 1 year ago

Off the top of my head, I'm not sure we'd want to designate those cases as false positives to be automatically removed. I remember seeing papers that use a term for "OSH" that is not in our query, but reference an article that makes it clear they're thinking about OSH (in many cases it was your PLOS "have and have nots" paper).

amchagas commented 1 year ago

Closing this, as we now have code in place that systematically scans the PDFs and shows where in the text the keywords come from, allowing us to decide manually what to do with each paper.
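For readers who land here, a rough sketch of the general idea (not the code actually used in the repository): given the plain text of a paper, already extracted from its PDF by some other tool, split it at the last "References"-style heading and count keyword hits on each side. The heading pattern, function names, and keywords below are illustrative assumptions.

```python
import re

# Assumed heading names; real papers vary, so this list is only a guess.
REFERENCES_HEADING = re.compile(
    r"^[ \t]*(references|bibliography|works cited)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_body_and_references(text):
    """Split text at the last references-style heading, if there is one."""
    matches = list(REFERENCES_HEADING.finditer(text))
    if not matches:
        return text, ""
    cut = matches[-1].start()
    return text[:cut], text[cut:]


def locate_keywords(text, keywords):
    """Count case-insensitive hits of each keyword in body and references."""
    body, references = split_body_and_references(text)
    report = {}
    for keyword in keywords:
        pattern = re.compile(re.escape(keyword), re.IGNORECASE)
        report[keyword] = {
            "body": len(pattern.findall(body)),
            "references": len(pattern.findall(references)),
        }
    return report


def only_in_references(report):
    """True when at least one keyword hit exists and none is in the body."""
    counts = list(report.values())
    return all(c["body"] == 0 for c in counts) and any(
        c["references"] > 0 for c in counts
    )
```

In line with the comment above about not removing such papers automatically, `only_in_references` is meant to flag a paper for manual review, not to discard it.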