Lookyloo / PlaywrightCapture

Capture a URL with Playwright

Improve stealth / bypass bot detection #55

Open Rafiot opened 11 months ago

Rafiot commented 11 months ago

Give https://github.com/AtuboDad/playwright_stealth a try.

Rafiot commented 10 months ago

This one is now implemented, and it helped quite a bit.

The next step is to use the new headless mode, which will be the default soon-ish.

There are two approaches for that:

if self.browser_name == 'chromium':
    # Drop the default --headless flag and request the new headless mode instead.
    launch_settings['ignore_default_args'] = ["--headless"]
    launch_settings['args'] = ["--headless=new"]

Or run the capture script with the following environment variable set: PLAYWRIGHT_CHROMIUM_USE_HEADLESS_NEW=1

Since the new headless mode will become the default relatively soon, plumbing an option all the way from the Python capture code through Lacus to the Lookyloo UI is more trouble than it is worth. The recommendation is therefore to set (or not set) this environment variable when running the capture script.
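For reference, here is a minimal sketch of both approaches as standalone helpers. The function names (new_headless_launch_settings, env_with_new_headless, fetch_title) are made up for illustration and are not part of PlaywrightCapture. It assumes a Playwright version that still honours the environment variable mentioned above.

```python
import os
from typing import Any, Dict, Mapping, Optional


def new_headless_launch_settings(browser_name: str) -> Dict[str, Any]:
    """Build keyword arguments for BrowserType.launch() (first approach)."""
    launch_settings: Dict[str, Any] = {}
    if browser_name == 'chromium':
        # Drop Playwright's default --headless flag and pass the new mode instead.
        launch_settings['ignore_default_args'] = ['--headless']
        launch_settings['args'] = ['--headless=new']
    return launch_settings


def env_with_new_headless(base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the variable set (second approach)."""
    env = dict(os.environ if base_env is None else base_env)
    env['PLAYWRIGHT_CHROMIUM_USE_HEADLESS_NEW'] = '1'
    return env


def fetch_title(url: str, browser_name: str = 'chromium') -> str:
    """Hypothetical usage: open a page with the settings above and return its title."""
    # Imported lazily so the two helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser_type = getattr(p, browser_name)
        browser = browser_type.launch(**new_headless_launch_settings(browser_name))
        try:
            page = browser.new_page()
            page.goto(url)
            return page.title()
        finally:
            browser.close()
```

The dictionary returned by env_with_new_headless() can be passed as env= to subprocess.run() when starting the capture script from another process, which keeps the toggle out of the capture code itself.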

Rafiot commented 10 months ago

Cloudflare bypass is tricky, and almost never works.

Let's try with hover: https://playwright.dev/python/docs/api/class-locator#locator-hover
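A rough sketch of what the hover idea could look like, using the Locator.hover() call linked above. The helper name, the selector and the timeout are placeholders, and whether this actually helps against Cloudflare is untested.

```python
from typing import Any


def hover_then_click(page: Any, selector: str, timeout_ms: float = 5000) -> None:
    """Move the mouse over an element before clicking it.

    `page` is expected to be a Playwright Page (sync API); the selector is
    whatever identifies the element to interact with (hypothetical here).
    """
    target = page.locator(selector)
    # hover() scrolls the element into view and moves the mouse over it.
    target.hover(timeout=timeout_ms)
    target.click(timeout=timeout_ms)
```

With a real page this would be called as hover_then_click(page, "#some-element") after page.goto(...). Note that challenge widgets are often rendered inside an iframe, in which case the locator would have to be resolved inside that frame first.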