Cuadrix / puppeteer-page-proxy

Additional module to use with 'puppeteer' for setting proxies per page basis.
423 stars 99 forks

causes captcha, maybe missing headers #69

Open 123bistami opened 2 years ago

123bistami commented 2 years ago

The site is Cloudflare-protected. I can reach my target page through my proxy server, and the proxy server is not on a blacklist. When I set the proxy in the Puppeteer launch args, like [--proxy-server=127.0.0.1:4444], I can access the target page and no captcha is displayed. But when I use the proxy per page, a captcha is displayed every time. A captcha is also displayed if I send the request with curl. I think Cloudflare detects that the request is not coming from a browser, or that this package does not pass all headers. Can someone help me?
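For reference, the two setups being compared can be sketched roughly as follows. This is a minimal sketch, not code from the report: the function names are made up for illustration, and `puppeteer` and `useProxy` (the function exported by `puppeteer-page-proxy`) are passed in as arguments so the two paths sit side by side.

```javascript
// Proxy address taken from the report above.
const PROXY = '127.0.0.1:4444';

// Setup A: Chromium itself is started with the proxy, so the browser
// sends every request (this is the case where no captcha appears).
async function openWithBrowserProxy(puppeteer, url) {
  const browser = await puppeteer.launch({ args: [`--proxy-server=${PROXY}`] });
  const page = await browser.newPage();
  await page.goto(url);
  return { browser, page };
}

// Setup B: the browser starts without a proxy and puppeteer-page-proxy
// is applied to a single page (this is the case where the captcha appears).
async function openWithPageProxy(puppeteer, useProxy, url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await useProxy(page, `http://${PROXY}`);
  await page.goto(url);
  return { browser, page };
}
```

In a real script the arguments would be `require('puppeteer')` and `require('puppeteer-page-proxy')`.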

mewforest commented 2 years ago

It seems that this library intercepts every request from the browser and re-sends it with its own HTTP client (got.js).

So the captcha isn't surprising here.
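The "intercept and re-send" pattern described in this comment could look roughly like the sketch below. It is not the library's actual source. `proxyPageRequests` and `sendThroughProxy` are hypothetical names, and `sendThroughProxy` stands in for whatever HTTP client performs the proxied request. It is assumed to resolve to an object with `status`, `headers` and `body`.

```javascript
// Hypothetical sketch: answer every page request from an external HTTP client
// instead of letting Chromium send it.
async function proxyPageRequests(page, sendThroughProxy) {
  await page.setRequestInterception(true);
  page.on('request', async (request) => {
    try {
      // Only what Puppeteer exposes on the intercepted request is forwarded.
      const response = await sendThroughProxy({
        url: request.url(),
        method: request.method(),
        headers: request.headers(),
        body: request.postData(),
      });
      // Hand the external client's response back to the page.
      await request.respond({
        status: response.status,
        headers: response.headers,
        body: response.body,
      });
    } catch (err) {
      // If the proxied request fails, fail the page request as well.
      await request.abort();
    }
  });
}
```

Because the request that reaches the server comes from the external client rather than from Chromium, anything the server infers from browser-level request characteristics can differ from a normal page load.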

Cuadrix commented 1 year ago

Yes, this package doesn't send all headers, because Puppeteer doesn't provide all of them in request.headers().

This is because Puppeteer doesn't listen to the 'Network.requestWillBeSentExtraInfo' event, which is fired after the request interception itself is completed, at least according to this guy: https://github.com/puppeteer/puppeteer/issues/6117#issuecomment-652883842

This can be worked around by creating a CDP session manually, but that is tricky to integrate into this package because of the above fact: https://stackoverflow.com/questions/47078655/missing-request-headers-in-puppeteer/62232903#62232903
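A minimal sketch of the manual CDP session approach, assuming a Puppeteer version where `page.target().createCDPSession()` is available. `collectFullRequestHeaders` is a hypothetical helper name, not part of this package or of Puppeteer. It only records the headers Chromium reports for each request. Matching them back to intercepted requests, and getting them before the request is re-sent, is the hard part mentioned above and is not attempted here.

```javascript
// Hypothetical helper: record the full request headers Chromium reports
// through the DevTools Protocol, keyed by CDP request id.
async function collectFullRequestHeaders(page) {
  const headersByRequestId = new Map();
  const client = await page.target().createCDPSession();

  // Register the listener before enabling the Network domain so no event is missed.
  client.on('Network.requestWillBeSentExtraInfo', (event) => {
    headersByRequestId.set(event.requestId, event.headers);
  });
  await client.send('Network.enable');

  return headersByRequestId;
}
```

The returned map fills in as the page loads, so it would typically be read after `page.goto(...)` has resolved.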