scholarly-python-package / scholarly

Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
https://scholarly.readthedocs.io/

Stuck in Proxy initialization #452

Closed: pannone closed this issue 1 year ago

pannone commented 1 year ago

Describe the bug

I am trying to use scholarly with ScraperAPI (the paid version). After passing the ScraperAPI key, scholarly gets stuck for a few minutes and then the following error is raised:

line 18, in GoogleParser
    scholarly.use_proxy(pg)
  File "D:\Programmi\Anaconda\lib\site-packages\scholarly\_scholarly.py", line 78, in use_proxy
    self.__nav.use_proxy(proxy_generator, secondary_proxy_generator)
  File "D:\Programmi\Anaconda\lib\site-packages\scholarly\_navigator.py", line 67, in use_proxy
    proxy_works = self.pm2.FreeProxies()
  File "D:\Programmi\Anaconda\lib\site-packages\scholarly\_proxy_generator.py", line 524, in FreeProxies
    proxy = self._proxy_gen(None)  # prime the generator
StopIteration

EDIT: I have also tried the free proxy, and I still get the same error.

To Reproduce

I simply do this:


from scholarly import scholarly, ProxyGenerator

pg = ProxyGenerator()
success = pg.ScraperAPI(my_key)  # my_key holds my ScraperAPI key

if success:
    print(f"{success}: Scraper API premium connection established")
    scholarly.use_proxy(pg)

Expected behavior

After the proxy configuration, I should be able to use scholarly.

Desktop (please complete the following information):

Do you plan on contributing?

arunkannawadi commented 1 year ago

scholarly also attempts to set up a free proxy for some queries for which you don't really need ScraperAPI. However, it looks like no free proxies were working when you tried this. It should perhaps work now, but if it doesn't, set

scholarly.use_proxy(pg, pg)

and it should work normally.
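For reference, a minimal sketch of the full setup with that workaround (reusing the my_key placeholder from your snippet) would be:

from scholarly import scholarly, ProxyGenerator

pg = ProxyGenerator()
success = pg.ScraperAPI(my_key)  # my_key is your ScraperAPI key

if success:
    # Passing pg as both the primary and the secondary proxy generator
    # keeps scholarly from trying to set up FreeProxies on its own.
    scholarly.use_proxy(pg, pg)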

pannone commented 1 year ago

Thank you very much, it is working now! I have noticed something strange, though. If I run the code on Google Colab, I do not need to pass the proxy twice. However, if I try to run the example shown here, the citedby function always returns an empty list (both locally and on Colab). Do I need to open another issue for this?
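For context, what I am running is roughly the following (a sketch based on the documented citedby workflow; the author name is just a placeholder):

from scholarly import scholarly

# Look up an author and fill in their profile and publication list
search_query = scholarly.search_author('Author Name')
author = scholarly.fill(next(search_query))

# Fill in the first publication and collect the titles of the works citing it
publication = scholarly.fill(author['publications'][0])
citing_titles = [citation['bib']['title'] for citation in scholarly.citedby(publication)]
print(citing_titles)  # this always comes back as an empty list for me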

Thank you very much for answering!

arunkannawadi commented 1 year ago

I tried it locally with a free proxy and it seemed to work fine. Did you set up any proxy before trying that example code?
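In case it helps to compare, my test setup was roughly this (a sketch using the FreeProxies generator):

from scholarly import scholarly, ProxyGenerator

# Route scholarly's requests through free proxies before running the citedby example
pg = ProxyGenerator()
pg.FreeProxies()
scholarly.use_proxy(pg)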

pannone commented 1 year ago

I have tried both locally and in Google Colab, both with no proxy and with the ScraperAPI startup plan, and it still gives me no citing articles. Are there any tests you would like me to run to help figure out the problem?

arunkannawadi commented 1 year ago

I can reproduce your error now. Could you please open another issue for that? Thanks!