jimmejardine / qiqqa-open-source

The open-sourced version of the award-winning Qiqqa research management tool for Windows
GNU General Public License v3.0

503 error (access denied) for Google Scholar, i.e. how to deal with "Please show you're not a robot" #113

Open GerHobbelt opened 5 years ago

GerHobbelt commented 5 years ago

Possibly relevant info from another tool which uses Scholar as well: https://github.com/ckreibich/scholar.py/issues/66

Related issue: #2

GerHobbelt commented 5 years ago

Related material from others / same type of trouble for Zotero et al:

GerHobbelt commented 5 years ago

Analysis for Qiqqa: this depends on #2, where we should change to using Chrome+CefSharp so that the Google "I am not a bot" Captchas will be working again.

GerHobbelt commented 4 years ago

There's an additional work-around that I didn't know about before: in Scholar, click on the "cite" link (often shown as a very large double-quote icon below the item, next to links to "related articles", etc.). A popup is then shown in the page where you can view a few different citation formats, and below those you'll see a couple of links to 'download' the citation info in various formats, the first of which is BibTeX. 🥳 yay!

Source: https://texblog.org/2014/04/22/using-google-scholar-to-download-bibtex-citations/
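Once the BibTeX text has been copied from that popup, it can help to sanity-check it before pasting it into Qiqqa. The following is a minimal sketch, not Qiqqa code: the function name `parse_bibtex_entry` and the sample entry are made up for illustration, and the parser only handles simple, flat entries (no nested braces).

```python
import re


def parse_bibtex_entry(text: str) -> dict:
    """Parse one simple BibTeX entry into a dict (illustrative sketch only)."""
    # Header looks like: @article{citekey,
    header = re.match(r"\s*@(\w+)\s*\{\s*([^,\s]+)\s*,", text)
    if header is None:
        raise ValueError("not a BibTeX entry")
    entry = {"type": header.group(1).lower(), "key": header.group(2)}
    # Fields look like: name={value} or name="value"; nested braces are NOT handled.
    field_pattern = r'(\w+)\s*=\s*(?:\{([^{}]*)\}|"([^"]*)")'
    for name, braced, quoted in re.findall(field_pattern, text):
        entry[name.lower()] = (braced or quoted).strip()
    return entry


# Hypothetical sample entry, in the shape a citation export typically has.
SAMPLE = """@article{doe2020example,
  title={An example title},
  author={Doe, Jane and Roe, Richard},
  journal={Journal of Examples},
  year={2020}
}"""

entry = parse_bibtex_entry(SAMPLE)
print(entry["key"], entry["year"])
```

A quick check like this catches the case where the "download" link returned an error page instead of a BibTeX record, since `parse_bibtex_entry` raises `ValueError` on anything that does not start with an `@type{key,` header.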

Mocabl3nd commented 4 years ago

I'm also experiencing this issue. I still can't get past the captcha :(

GerHobbelt commented 4 years ago

This makes it a tough nut to crack, as it's dependent on #2, which is still some way away (= adapting Qiqqa to use a totally different embedded browser than the old Firefox (a.k.a. XULRunner) in there).

Have you tried running one of the latest experimental v82* releases? I cannot guarantee that the captcha problem will be gone (I get it myself at the oddest times), but at least I don't suffer from it most of the time; automatic BibTeX access is a severe problem on my own machine (for which the work-around posted above is some help at least), so I wonder what the difference in our setups really is that makes Google even more aggressive on your box.

Please check your Qiqqa version and, if it's not a recent v82, fetch the latest and try that one for a while. If you're not satisfied, you can always "downgrade" back to your old version by re-installing that one: any Qiqqa installer will replace the Qiqqa software already on your machine, while user configuration will be kept intact.

Mocabl3nd commented 4 years ago

Hi! I was able to find a fix for this issue, sort of... haha. Because of the captcha problem in the BibTeX Sniffer, I tried using my default browser (Google Chrome) to search for the BibTeX info for my journal articles. One problem I encountered is that, for some articles, I couldn't access the BibTeX file in Google Scholar from my browser either: it showed error 403 (please see photo below). I didn't know why this occurred.

So I did a bit of research to find out why this happens, and learned that it has something to do with the IP address. Because of this, I thought of using a VPN. Then, IT WORKED! I was able to retrieve the BibTeX file both from my browser and in the BibTeX Sniffer. The downside is that whenever I turn off my VPN, I can't access the BibTeX file from Google Scholar: it shows error 403 again. At least I was able to find a temporary solution. I hope this information helps you fix this issue.

Warm regards, Monica
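The reports in this thread describe two distinct failure modes: a 503 response carrying the "Please show you're not a robot" page, and a plain 403 that appears tied to the client IP address. A tool could tell these apart so it knows whether to ask the user to solve a captcha or to wait and retry from another network. The sketch below is only an illustration under stated assumptions: the function name `classify_response` is hypothetical, and the marker phrase is taken from this issue's title, so real Scholar pages may word it differently.

```python
# Assumed marker phrase, taken from the issue title; real pages may differ.
CAPTCHA_MARKER = "not a robot"


def classify_response(status: int, body: str = "") -> str:
    """Roughly classify an HTTP response from a citation lookup (sketch only)."""
    if CAPTCHA_MARKER in body.lower():
        return "captcha"   # the user has to solve the challenge in a working browser
    if status in (403, 503):
        return "blocked"   # access denied without a captcha page, e.g. IP-based
    if status == 200:
        return "ok"
    return "other"


print(classify_response(503, "Please show you're not a robot"))
print(classify_response(403))
```

Checking the body before the status code is a deliberate choice here: a captcha page can be acted on by the user, while a bare 403 or 503 cannot.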

Leepee commented 4 years ago

As another workaround for this, you can log in within the browser, and Scholar then allows you a few more cracks at it, so you can get some more docs cited. If you have more issues, you can also "star" the document, then go to "My Library" and cite from there. That's what I did, and I managed to cite 50-something articles just now.

Mocabl3nd commented 4 years ago

This is noted. I'll try this one, for sure. Thank you for the tip!

Warm regards, Monica

SimonDedman commented 4 years ago

I've found that a VPN (TunnelBear is good: it's free, and although the allowance is low, you hardly need any for this purpose) plus the cite (quotation marks) links works quite well, and it used to for the original Qiqqa too. However, it only worked for about 50 papers (3 successful VPN nation changes) before giving up. I'll try again tomorrow.

Edit: confirmed working. It's a bit of a grind, but I've completed my ~150-paper backlog using this method. I found better results VPN-ing to western countries (I'm based in the US, and it lasted longer today with Canada, UK, Ireland than yesterday when I included Mexico & Brazil, but that might be random variance). In any case: this is a viable approach. Though, like I say, I manually check each one; I don't know whether others try to let it run automatically for all papers at once.

m0rxy commented 3 years ago

Hi, just wanted to let you know that this is still an issue as one cannot complete the CAPTCHA in the sniffer browser.

Keep up the great work, people :)

GerHobbelt commented 3 years ago

@m0rxy: Yup, still an issue. While the Scholar captcha problem won't go away, the underlying problem that's aggravating it is #2, which will be addressed after I've taken care of the PDF background processes upgrade (PDF rendering, text extraction, OCR). Ergo: it will take a while before this is addressed, unfortunately.

GerHobbelt commented 3 years ago

Additional work-around has been posted in #310.