berstend / puppeteer-extra

💯 Teach puppeteer new tricks through plugins.
https://extra.community
MIT License
6.38k stars 738 forks

[Bug] solveRecaptchas fails after finding recaptcha info #526

Open crisarji opened 3 years ago

crisarji commented 3 years ago

Describe the bug

Hi! I am trying to crawl a website for bookings of a service I am enrolled in.

Stack: NodeJS, Puppeteer, Heroku and 2captcha

Issue: The web crawler works locally as expected, but when I deploy it to Heroku it fails over and over again. When I ran the page locally while mimicking the Heroku production environment, I found that a reCAPTCHA is getting in the way.

I did a little research and found your library. I added funds to my 2captcha account and tried the demo page, which works perfectly. I then tried to make my own crawler work the same way as the demo, but without success.

Odd: the website I am crawling does not show the reCAPTCHA until after submitting (is that normal?).

[Screenshot: reCaptcha-after-submitting]

Odder: Every single time, the site reports that the verification code failed, even though waitForSelector('iframe[src*="recaptcha/"]') finds 1 iframe and page.mainFrame().childFrames() returns 2 frames.

Oddest: it worked a few times on Heroku before I noticed the reCAPTCHA issue. My assumption is that the server detected it was being crawled and started rejecting the requests. I have tried everything I could think of (waitUntil, waitForTimeout, catching the error and re-submitting), but nothing works.
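For readers following along, the flow described above (submit, wait for the reCAPTCHA iframe, then try to solve it in the page and in its child frames) could be sketched roughly as below. This is not the reporter's code. It assumes `page` was created through puppeteer-extra with puppeteer-extra-plugin-recaptcha registered, so that `solveRecaptchas()` is available on the page and on frames. The helper name `submitAndSolve`, the `submitSelector` argument, and the timeout are hypothetical.

```javascript
// Sketch only. Assumes `page` comes from puppeteer-extra with the recaptcha
// plugin registered, which is what adds solveRecaptchas() to pages and frames.
// `submitSelector` and `timeout` are hypothetical, site-specific values.
async function submitAndSolve(page, submitSelector, timeout = 15000) {
  await page.click(submitSelector);

  // The captcha only appears after submitting, so wait for its iframe first.
  await page.waitForSelector('iframe[src*="recaptcha/"]', { timeout });

  // Try the top-level document first, then every child frame.
  const targets = [page, ...page.mainFrame().childFrames()];
  const results = [];
  for (const target of targets) {
    // Skip frames that did not receive the plugin's helper methods.
    if (typeof target.solveRecaptchas !== 'function') continue;
    const { captchas = [], solved = [], error } = await target.solveRecaptchas();
    results.push({
      found: captchas.length,
      solved: solved.length,
      error: error || null,
    });
  }
  return results;
}
```

Logging the returned array on Heroku would show whether any captcha was detected at all and whether the solver reported an error, which helps narrow down where the flow breaks.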

I would really appreciate any help here. I'd love to fix this issue so I can write a post on DevTo about web crawling with the stack mentioned above.

Thanks in advance!

Versions

"puppeteer": "^10.1.0", "puppeteer-extra": "^3.1.18", "puppeteer-extra-plugin-recaptcha": "^3.4.0"

berstend commented 2 years ago

Does the debug log contain something interesting here? :)

DEBUG=puppeteer-extra-plugin:recaptcha node myscript.js
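On a host like Heroku, where prefixing the start command may be inconvenient, the same namespace can probably be enabled from inside the script. The sketch below relies on the `debug` package reading `process.env.DEBUG` when it is first loaded, so it must run before puppeteer-extra is required. The helper name `enableDebugNamespace` is hypothetical and not part of any library.

```javascript
// Sketch only: call this before the first require('puppeteer-extra'), because
// the `debug` package reads process.env.DEBUG when it is loaded.
// `enableDebugNamespace` is a hypothetical helper, not a library function.
function enableDebugNamespace(namespace, env = process.env) {
  // DEBUG holds a comma-separated list; keep existing entries and avoid duplicates.
  const current = (env.DEBUG || '')
    .split(',')
    .map((s) => s.trim())
    .filter(Boolean);
  if (!current.includes(namespace)) current.push(namespace);
  env.DEBUG = current.join(',');
  return env.DEBUG;
}

enableDebugNamespace('puppeteer-extra-plugin:recaptcha');
```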