paulirish / headless-cat-n-mouse

Is headless Chrome currently detectable? Let's pit the detections and detection evasions against each other.
Apache License 2.0

Overridden HTMLIFrameElement causes issues #28

Open lars-berger opened 5 years ago

lars-berger commented 5 years ago

The overridden HTMLIFrameElement.prototype.contentWindow causes issues for sites and libraries that use this property. Annoyingly, Google uses it in their reCAPTCHA v2 script.

Although it's perhaps outside of the scope of this package, is there a way to preserve the original functionality of iframe.contentWindow while still passing the test?

berstend commented 4 years ago

I'm pretty sure I found a way to mock contentWindow in iframes without breaking certain iframes (like the inline reCAPTCHA popup) after some tinkering. 😄

I've released it as v2.4 of puppeteer-extra-plugin-stealth, please ping me over there in case you should still encounter issues. :)

FWIW, in my tests the only iframes that caused headaches were the ones using srcdoc, due to a Chromium bug.

momala454 commented 4 years ago

Does this issue mean one could detect headless Chrome using an iframe with srcdoc?

momala454 commented 4 years ago

What I mean is: if page.evaluateOnNewDocument fails in such a frame, could we then also use, for example,

iframe.contentWindow.navigator.webdriver

to detect headless?
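For illustration, here is a rough sketch of the kind of check being asked about: it reads navigator.webdriver through the contentWindow of a srcdoc iframe. The helper name and the `doc` parameter are hypothetical (pass `document` in a browser); whether the check actually reveals anything depends on the browser and on which evasions are active.

```javascript
// Hypothetical helper, not part of this repository or of any plugin.
// `doc` is a Document; in a browser, call it with `document`.
function webdriverVisibleInSrcdocFrame(doc) {
  const iframe = doc.createElement('iframe');
  // Use srcdoc, since that is the frame type discussed in this thread.
  iframe.srcdoc = '<!doctype html><title>probe</title>';
  // contentWindow is only populated once the frame is attached to the document.
  doc.body.appendChild(iframe);
  try {
    const win = iframe.contentWindow;
    // navigator.webdriver is true when the browser reports being automated.
    return Boolean(win && win.navigator && win.navigator.webdriver === true);
  } finally {
    // Clean up the probe frame again.
    iframe.remove();
  }
}
```

In a page this would be called as webdriverVisibleInSrcdocFrame(document). A result of false does not prove the browser is not automated; it only means this particular probe saw nothing.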

berstend commented 4 years ago

There's something weird about srcdoc frames in Chromium/Puppeteer: not only is page.evaluateOnNewDocument not executed in them, but the corresponding iframe.contentWindow object in general felt funky (?) to use while I was playing around with it.

My "solution" is to hook into element creation to sniff out iframes, then hook into any setting of their srcdoc property, and then (and only then) overload the contentWindow property with a "smart" proxy to the main page window object. The proxy intercepts and modifies calls that would otherwise reveal that it's the main window object. 😅

I had to be super surgical as the reCAPTCHA iframes are extremely sensitive to any iframe modifications.

The cool thing is that with the latest code stuff like this works correctly:

iframe.contentWindow.self === window.top // must be false
iframe.contentWindow.frameElement === iframe // must be true
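To make the two checks above concrete, here is a minimal sketch of a contentWindow proxy that would satisfy them. This is not the actual puppeteer-extra-plugin-stealth code: the function names are made up for illustration, only the last step (overloading contentWindow) is shown, and a real evasion would have to intercept many more properties.

```javascript
// Hypothetical sketch; names are illustrative, not the plugin's API.
function makeFrameWindowProxy(mainWindow, iframe) {
  const proxy = new Proxy(mainWindow, {
    get(target, key) {
      // Without this, `self`/`window` would leak the main window object.
      if (key === 'self' || key === 'window') return proxy;
      // A real frame's window points back at its <iframe> element.
      if (key === 'frameElement') return iframe;
      // Everything else is read from the main window unchanged.
      return Reflect.get(target, key);
    },
  });
  return proxy;
}

function overloadContentWindow(mainWindow, iframe) {
  // Build the proxy once so repeated reads return the same object.
  const frameWindow = makeFrameWindowProxy(mainWindow, iframe);
  // An own property on the element shadows the prototype getter.
  Object.defineProperty(iframe, 'contentWindow', {
    configurable: true,
    get: () => frameWindow,
  });
  return frameWindow;
}
```

In a page, this would be called as overloadContentWindow(window, iframe) for an iframe whose srcdoc has just been set; after that, both comparisons above hold for that element.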

Definitely among the craziest detection + evasion games so far. :)

momala454 commented 4 years ago

OK, did you see my questions there? https://github.com/berstend/puppeteer-extra/commit/17a42c3302ba1e7b446097b9aa2dd886ea6c8ef6