ckreibich / scholar.py

A parser for Google Scholar, written in Python
2.1k stars, 777 forks

Looking for a working fork #120

Open johnrsibert opened 5 years ago

johnrsibert commented 5 years ago

I'm having trouble using this software. I have looked through the issues and pull requests and can see where the code has been patched, but the patches don't show up in the repo. As a result, I have been going through the pull requests and attempting to patch my own fork. It is a frustrating and error-prone process. Perhaps I'm not understanding something about git.

Anyhow, I could sure use some advice on how to get a fully functional working version of scholar.py.

By way of background, I'm trying to analyse citations as a way of tracking the adoption of an open source statistical tool by different scientific disciplines. So far, I have had good success with Web of Science, but Google Scholar searches a bit deeper into the literature.

Thanks for helping, John

peterzjx commented 5 years ago

Hi @johnrsibert, I've been patching this repo on my fork. I haven't tested all the functions, but I did fix a couple of issues. Also, a lot of the problems come from Google showing reCAPTCHAs to block bots, so I suggest using a cookie file exported from your browser.
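To illustrate the cookie-file suggestion, here is a minimal sketch using only the Python 3 standard library. It assumes the browser export is in the Netscape `cookies.txt` format; the helper names `build_opener_with_cookies` and `cookie_names` are made up for this example and are not part of scholar.py or of any fork mentioned in this thread.

```python
import http.cookiejar
import urllib.request


def build_opener_with_cookies(cookie_path):
    """Load a Netscape-format cookies.txt and return (opener, jar).

    Hypothetical helper for illustration; not scholar.py's own API.
    """
    jar = http.cookiejar.MozillaCookieJar(cookie_path)
    # Keep session cookies and already-expired entries so the caller can
    # inspect everything that was exported from the browser.
    jar.load(ignore_discard=True, ignore_expires=True)
    # The opener attaches matching cookies to every request it makes.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def cookie_names(jar, domain_suffix):
    """Return the sorted names of cookies whose domain ends with domain_suffix."""
    return sorted(c.name for c in jar if c.domain.endswith(domain_suffix))
```

Calling `opener.open(url)` on the returned opener would then send the exported cookies with the request. Whether that is enough to avoid a reCAPTCHA depends on Google's side, and it is worth checking with `cookie_names(jar, "google.com")` that the export actually contains cookies for the right domain.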

ijmiller2 commented 4 years ago

Hi @peterzjx,

Does your fork handle the reCAPTCHAs? I've been thinking about tinkering with this a bit myself, but don't want to duplicate effort.

From some experience a while back, I think the code in the etudier repo handled the CAPTCHA events fairly well, but I haven't tried it again recently. I seem to recall that the last time I used scholar.py, my IP got blacklisted by Google almost immediately.

Or maybe I'm confusing that experience with scholarly.py. Can't recall at the moment.

Best, Ian