johndoe-dev00 closed 3 years ago
@johndoe-dev00 In your testing (after the changes you made), did you find that the captcha still showed up? If so, did the page actually go away after you solved the captcha?
Also, why did you set the maximum time to solve the captcha to one minute? Is there a specific reason it cannot be longer?
@rocketinventor My changes do not prevent the captcha from showing up. At the beginning of my testing the captcha showed up frequently; after a while it became less frequent. Currently it does not show up at all anymore, even after deleting the cookie file. Maybe Cloudflare whitelisted my IP or something. When the captcha does show up, you need to solve it manually. After solving it, you are redirected back to Blinkist and the scraper continues its work (once the Blinkist logo is detected). As posted by albert in #42, the captcha fails to load correctly and you cannot proceed if uBlock is enabled. Hence the new command line switch '--no-ublock'.
Why 60 sec wait time? 60s should be plenty to solve the captcha. In case someone is not watching the CLI output, I don't want them to wait 10 minutes before timing out.
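For illustration, the "wait up to 60 seconds until the logo is detected" behaviour described above could be sketched roughly like this. This is only a minimal stand-alone sketch, not the scraper's actual code: `wait_until`, `condition` and `poll_interval` are hypothetical names, and the real check for the Blinkist logo would be whatever callable the scraper passes in.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 60.0,
    poll_interval: float = 0.5,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Returns True if the condition was met in time, False on timeout.
    `condition` is a hypothetical callable, e.g. "is the logo visible yet?".
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            # Give up so an unattended run does not hang for long.
            return False
        time.sleep(poll_interval)
```

A shorter timeout means an unattended run fails fast; a longer one gives more time to solve the captcha by hand. Selenium's own `WebDriverWait` covers the same idea when a driver object is available.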
If the only reason that uBlock needs to be disabled is to solve the captcha, then you can easily add it to the whitelist (the captcha was being intentionally blocked before):
At the bottom of the bin/ublock/ublock-settings.txt file, there should be a line like this:
www.blinkist.com hcaptcha.com * block
Change it to look like this:
www.blinkist.com hcaptcha.com * allow
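If you would rather script that edit than do it by hand, something along these lines might work. It is a rough sketch that only assumes the rule appears on its own line exactly as quoted above; `allow_hcaptcha` is a made-up helper name, and it works on the file's text rather than touching the file itself.

```python
BLOCK_RULE = "www.blinkist.com hcaptcha.com * block"
ALLOW_RULE = "www.blinkist.com hcaptcha.com * allow"


def allow_hcaptcha(settings_text: str) -> str:
    """Return the settings text with the hcaptcha block rule set to allow.

    Only a line that consists of exactly the block rule is changed;
    everything else, including line endings, is left untouched.
    """
    updated = []
    for line in settings_text.splitlines(keepends=True):
        if line.strip() == BLOCK_RULE:
            line = line.replace(BLOCK_RULE, ALLOW_RULE)
        updated.append(line)
    return "".join(updated)
```

Reading bin/ublock/ublock-settings.txt, passing its contents through this function, and writing the result back would apply the change; running it twice is harmless because an already allowed rule no longer matches.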
@rocketinventor I changed ublock-settings.txt to allow hcaptcha.com. Seems to work quite well.
I still kept the CLI switch '--no-ublock' in place, as I find it quite useful for troubleshooting.
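For anyone wondering what such a switch looks like, here is a minimal argparse sketch. Only the flag name '--no-ublock' comes from this thread; `build_parser` and `should_load_ublock` are hypothetical names, and the actual wiring in the scraper may differ.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the switch discussed here (illustrative)."""
    parser = argparse.ArgumentParser(description="illustrative scraper CLI")
    parser.add_argument(
        "--no-ublock",
        action="store_true",
        help="start the browser without the uBlock extension",
    )
    return parser


def should_load_ublock(argv: List[str]) -> bool:
    """Return True unless --no-ublock was passed on the command line."""
    args = build_parser().parse_args(argv)
    return not args.no_ublock
```

With `action="store_true"` the extension stays enabled by default and is only skipped when the flag is given explicitly, which fits its use as a troubleshooting aid.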
FYI: Switching between 'from seleniumwire import webdriver' and 'from selenium import webdriver' (= book audio scrape not working) seems to trigger the captchas. Convenient for testing :)
I had some trouble with the login process and the captcha.