AtuboDad / playwright_stealth

playwright stealth
MIT License

Doesn't work for a headless detector website #30

Open agn-7 opened 5 months ago

agn-7 commented 5 months ago

Here's the code:

import asyncio
from playwright.async_api import async_playwright
from playwright_stealth import stealth_async


async def main():
    async with async_playwright() as p:
        # launch the browser
        browser = await p.chromium.launch()
        # open a new page
        page = await browser.new_page()

        # register the Playwright Stealth plugin
        await stealth_async(page)

        # visit the target page
        await page.goto("https://arh.antoinevastel.com/bots/areyouheadless")

        # extract the message contained on the page
        message_element = page.locator("#res")
        message = await message_element.text_content()

        # print the resulting message
        print(f'The result is: "{message}"')

        # close the browser and release its resources
        await browser.close()

asyncio.run(main())

Expected result: The result is: "You are not Chrome headless". Actual result: The result is: "You are Chrome headless".

webdz9r commented 4 months ago

confirmed

darkzbaron commented 4 months ago

+1

chrisspen commented 2 months ago

According to this, that site is actually quite easy to defeat because it relies on hacky, superficial differences between real browsers and automation tools. Tricking it is as simple as sending your "accept-language" header as "Accept-Language".

This worked for me:

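import os

from django.contrib.staticfiles.testing import StaticLiveServerTestCase

from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync

# Assumption: SHOW is not defined in the original snippet; it is treated here as a
# module-level flag that opens a visible browser window when True.
SHOW = False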

class Tests(StaticLiveServerTestCase):

    @classmethod
    def setUpClass(cls):
        os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
        super().setUpClass()

        cls.playwright = sync_playwright().start()
        cls.browser = cls.playwright.chromium.launch(headless=not SHOW, args=["--disable-blink-features=AutomationControlled"])

    @classmethod
    def tearDownClass(cls):
        super().tearDownClass()
        # stop_browser is never set in this snippet, so default to closing the browser
        if getattr(cls, "stop_browser", True):
            cls.browser.close()
        cls.playwright.stop()

    def setUp(self):
        super().setUp()

        ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3803.0 Safari/537.36'
        context = self.browser.new_context(
            user_agent=ua,
            extra_http_headers={
                "Accept-Language": "en-US,en;q=0.9"
            }
        )
        self.page = context.new_page()
        stealth_sync(self.page)

    def test_undetectable(self):

        response = self.page.goto('https://arh.antoinevastel.com/bots/areyouheadless')
        self.assertEqual(response.status, 200)

        # extract the answer contained on the page
        answer_element = self.page.locator("#res")
        answer = answer_element.text_content()

        # print the resulting answer
        print(f'The result is: "{answer}"')
        self.assertNotIn('You are Chrome headless', answer)
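
For anyone using the async API from the original report, the same idea can be sketched without Django. This is a minimal, untested adaptation rather than a confirmed fix: the user agent, header value, and launch flag are copied from the snippet above, `build_context_options` is a hypothetical helper name, and whether the detector is still fooled may depend on the Chromium version and on the site itself.

```python
# Values copied from the sync example above; treat them as assumptions
# that may need updating if the detector changes.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/76.0.3803.0 Safari/537.36"
)
TARGET_URL = "https://arh.antoinevastel.com/bots/areyouheadless"


def build_context_options():
    """Return keyword arguments for browser.new_context() (hypothetical helper)."""
    return {
        "user_agent": USER_AGENT,
        # Capitalized header name, as suggested in the comment above.
        "extra_http_headers": {"Accept-Language": "en-US,en;q=0.9"},
    }


async def main():
    # Imported here so the helper above stays usable without Playwright installed.
    from playwright.async_api import async_playwright
    from playwright_stealth import stealth_async

    async with async_playwright() as p:
        browser = await p.chromium.launch(
            args=["--disable-blink-features=AutomationControlled"]
        )
        # The header and user agent are set on the context, not on the page.
        context = await browser.new_context(**build_context_options())
        page = await context.new_page()
        await stealth_async(page)

        await page.goto(TARGET_URL)
        message = await page.locator("#res").text_content()

        await browser.close()
        return message
```

To try it, run `import asyncio; print(asyncio.run(main()))` and compare the printed message with the one from the original report.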

Mattwmaster58 commented 2 months ago

@chrisspen looks like the basis for another evasion tactic. Although the current maintainer seems inactive, I've just requested maintainership.