Open agn-7 opened 5 months ago
confirmed
+1
According to this, that site is actually quite easy to defeat, because it relies on hacky, superficial differences between real browsers and automation tools. Tricking it is as simple as sending your "accept-language" header as "Accept-Language".
This worked for me:
import os

from django.contrib.staticfiles.testing import StaticLiveServerTestCase
from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync

# SHOW is not defined in the snippet as posted; assumed here to be a flag
# that opens a visible browser window when set in the environment.
SHOW = bool(os.environ.get("SHOW"))


class Tests(StaticLiveServerTestCase):
    # Assumed default; stop_browser is not defined in the snippet as posted.
    stop_browser = True

    @classmethod
    def setUpClass(cls):
        # Playwright's sync API trips Django's async-safety check without this.
        os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
        super().setUpClass()
        cls.playwright = sync_playwright().start()
        cls.browser = cls.playwright.chromium.launch(
            headless=not SHOW,
            args=["--disable-blink-features=AutomationControlled"],
        )

    @classmethod
    def tearDownClass(cls):
        super().tearDownClass()
        if cls.stop_browser:
            cls.browser.close()
            cls.playwright.stop()

    def setUp(self):
        super().setUp()
        ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3803.0 Safari/537.36'
        context = self.browser.new_context(
            user_agent=ua,
            # Note the capitalised header name.
            extra_http_headers={
                "Accept-Language": "en-US,en;q=0.9"
            }
        )
        self.page = context.new_page()
        stealth_sync(self.page)

    def test_undetectable(self):
        response = self.page.goto('https://arh.antoinevastel.com/bots/areyouheadless')
        self.assertEqual(response.status, 200)
        # extract the answer contained on the page
        answer_element = self.page.locator("#res")
        answer = answer_element.text_content()
        # print the resulting answer
        print(f'The result is: "{answer}"')
        self.assertNotIn('You are Chrome headless', answer)
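For anyone not using Django, here is a rough standalone sketch of the same idea. It reuses the user agent, launch flag, and capitalised "Accept-Language" header from the test above; the helper names `build_context_options` and `check_headless_verdict` are made up for illustration and are not part of Playwright or playwright_stealth. Whether it still passes depends on how the detection page behaves at the time you run it.

```python
from typing import Any, Dict

# User agent string copied from the test above.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/76.0.3803.0 Safari/537.36"
)


def build_context_options(
    user_agent: str = USER_AGENT,
    language: str = "en-US,en;q=0.9",
) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments for browser.new_context()."""
    return {
        "user_agent": user_agent,
        # The header name is deliberately capitalised, as in the test above.
        "extra_http_headers": {"Accept-Language": language},
    }


def check_headless_verdict(headless: bool = True) -> str:
    """Hypothetical helper: load the detection page and return its verdict text."""
    # Imported here so build_context_options() works without Playwright installed.
    from playwright.sync_api import sync_playwright
    from playwright_stealth import stealth_sync

    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=headless,
            args=["--disable-blink-features=AutomationControlled"],
        )
        try:
            context = browser.new_context(**build_context_options())
            page = context.new_page()
            stealth_sync(page)
            page.goto("https://arh.antoinevastel.com/bots/areyouheadless")
            # text_content() can return None, so fall back to an empty string.
            return page.locator("#res").text_content() or ""
        finally:
            browser.close()
```

Calling `check_headless_verdict()` returns the text of the `#res` element, which you can compare against 'You are Chrome headless' the same way the test does.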
@chrisspen That looks like the basis for another evasion tactic. The current maintainer seems inactive, so I've just requested maintainership.
Here's the code:
The expected result is:
The result is: "You are not Chrome headless"
but the actual result is:
The result is: "You are Chrome headless"