CWWhitney / hosyana_review

Review of decision supporting holistic modeling methods
MIT License

Google Scholar and our own ScholarSearchAnalyzer?? #21

CWWhitney closed this issue 9 months ago

CWWhitney commented 9 months ago

This is quite the wrangling task...

How about creating a GitHub repo for a project - something like "ScholarSearchAnalyzer"? This repository could include scripts for scraping and cleaning the papers, analyzing relevance, and addressing the issues (see below).
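As a rough idea of what a cleaning script in such a repository might contain, here is a minimal sketch that removes duplicate records by normalized title. Everything in it is an assumption for illustration: the record layout (a dict with a `title` key), the function names `normalize_title` and `deduplicate`, and the sample titles are all hypothetical and not taken from the spreadsheet.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def deduplicate(records):
    """Keep the first record seen for each normalized title.

    Records whose title normalizes to an empty string are skipped.
    """
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key and key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical sample records, for illustration only.
records = [
    {"title": "Decision Analysis in Agriculture", "year": 2020},
    {"title": "decision analysis in agriculture.", "year": 2020},
    {"title": "Holistic Modelling of Farms", "year": 2020},
]
unique_records = deduplicate(records)
```

Matching on a normalized title is a deliberately simple choice; near-duplicates with different wording would need fuzzy matching or DOI comparison instead.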

For 2020, the Scholar search reported 1,320 results; I retrieved 1,300 and found that just 965 were relevant.
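To put those counts in proportion, a quick sketch of the arithmetic follows. The variable names are my own, and "coverage" and "relevant share" are just convenient labels, not terms used in the spreadsheet.

```python
reported = 1320   # results reported by the Scholar search for 2020
retrieved = 1300  # records actually collected
relevant = 965    # records judged relevant

coverage = retrieved / reported        # share of reported results collected
relevant_share = relevant / retrieved  # share of collected records kept

print(f"Retrieved {coverage:.1%} of the reported results")
print(f"Relevant share of retrieved records: {relevant_share:.1%}")
```

So roughly three quarters of what was collected turned out to be relevant.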

https://docs.google.com/spreadsheets/d/1Wz31b_jCWY_bAhFaZoCsERhvjUNxoH9npZ2MNhn0XD8/edit#gid=1791116862

On issues: I made it to 70, and the results seem to have re-ordered themselves. Another issue I ran into is that the results only go to page 100; at 10 results per page that is at most 1,000 results, not the expected 1,300 for 2020.
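The page limit can be checked with a small calculation. This sketch assumes 10 results per page and a hard stop at page 100, as described above. The last step, splitting the search into narrower queries that each stay under the cap, is only one possible workaround and is my assumption, not something tested here.

```python
import math

MAX_PAGES = 100        # last page reachable, as observed above
RESULTS_PER_PAGE = 10  # assumed page size
EXPECTED = 1300        # results expected for 2020

# Largest number of results that can be paged through in one query.
reachable = MAX_PAGES * RESULTS_PER_PAGE

# Results that can never be reached through pagination alone.
shortfall = max(0, EXPECTED - reachable)

# Minimum number of narrower queries needed if each stays under the cap
# (hypothetical workaround; assumes the sub-queries do not overlap).
min_queries = math.ceil(EXPECTED / reachable)

print(f"Reachable: {reachable}, shortfall: {shortfall}, "
      f"minimum sub-queries: {min_queries}")
```

Under these assumptions about 300 of the expected results are out of reach in a single query, so any script would need at least two narrower searches plus a deduplication pass afterwards.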

We can also add documentation for others who might want to use or contribute to our project.

CWWhitney commented 9 months ago

Others have tried.

https://claudiu.psychlab.eu/post/automated-systematic-literature-search-with-r-google-scholar-web-scraping/

https://www.zenrows.com/blog/web-scraping-r

https://serpapi.com/google-scholar-api

We need the PDFs.