leoncvlt / blinkist-scraper

📚 Python tool to download book summaries and audio from Blinkist.com, and generate some pretty output

Infinite captcha #51

Closed: birlikov closed this issue 3 years ago

birlikov commented 3 years ago

Hi! Thanks for the amazing repo! When I run `python blinkistscraper email pass`, Chrome asks me to solve a captcha. After I solve the first one, the log prints `INFO Logged into Blinkist. Loading Library...`, but nothing happens, and Chrome keeps asking me to solve captchas again and again. How can I solve this issue?

rocketinventor commented 3 years ago

Hello @birlikov,

The "Infinite captcha" problem you are reporting is a known issue (#46) and has come up before (#31). I was already working on fixes for these two issues (better logging and avoiding captchas), which should have been ready by now...

However, the site changed recently: the main script on the page can now detect the chromedriver instance and repeatedly send the user to a new captcha page, independently of Cloudflare detecting the "bot". Previously this happened in only a few places (i.e. right after login, and from Cloudflare on the first page load when the traffic looked suspicious), but now it happens on every page. There is no solution to this yet.

The quick fix (for now) is to disable JavaScript on blinkist.com pages. I can help you do that, but be aware that it also means audio can no longer be downloaded. You can also try out the changes in my pull request (#50), which adds integration with undetected-chromedriver; it is activated once you update selenium-wire to the latest version and install undetected-chromedriver.
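As a rough illustration of the JavaScript quick fix, here is a minimal sketch using plain Selenium. It is not the scraper's own code: the helper names `build_chrome_prefs` and `make_driver` are hypothetical, and it assumes a browser profile used only for scraping, so it blocks JavaScript for every site through a Chrome profile preference rather than for blinkist.com alone.

```python
def build_chrome_prefs(disable_javascript=True):
    """Return Chrome profile preferences for the scraping session.

    Hypothetical helper, not part of blinkist-scraper.
    """
    prefs = {}
    if disable_javascript:
        # Chrome content-setting values: 1 = allow, 2 = block.
        # Assumption: blocking JavaScript for all sites is acceptable here.
        prefs["profile.managed_default_content_settings.javascript"] = 2
    return prefs


def make_driver(disable_javascript=True):
    """Start Chrome with the preferences above.

    Needs the selenium package and a matching chromedriver.
    """
    # Imported lazily so build_chrome_prefs works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option(
        "prefs", build_chrome_prefs(disable_javascript)
    )
    return webdriver.Chrome(options=options)
```

Calling `make_driver()` should give a Chrome session in which page scripts do not run; as noted above, audio downloads would not be expected to work in that mode.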

Since this issue appears to be a duplicate, I recommend that you close it and follow issue #46, which is an active thread. I will be looking for other solutions and posting updates over there.

birlikov commented 3 years ago

Hi @rocketinventor,

Thanks for the explanation. I will close this issue and follow #46.

Also, your suggested quick fix is fine for me because I only need the text, not the audio, so any instructions are welcome :)