berstend / puppeteer-extra

💯 Teach puppeteer new tricks through plugins.
https://extra.community
MIT License

Live fingerprinting methods to evade #239

Closed andrew-healey closed 4 years ago

andrew-healey commented 4 years ago

These are a few live methods of detection which currently work (and are in use by bot detection services) to distinguish puppeteer-extra-plugin-stealth running headless from normal Chrome. I think I know a fix for one or two of these, but I think that the rest of them are up to other, more V8-savvy maintainers to evade. I will post more detection methods in the coming days.

  1. document.createElement
    try{
    document.createElement("dummy value")
    } catch({stack}) {
    if(stack.split("\n")[1].includes("Object.apply (<anonymous>"))
    console.log("This is puppeteer");
    }
  2. window dimensions
    if(window.outerHeight-window.innerHeight>160&&window.outerWidth-window.innerHeight>160)
    console.log("This is puppeteer");
  3. New detection method of navigator.webdriver
    if(!!Object.getOwnPropertyDescriptor(navigator.__proto__,"webdriver"))
    console.log("This is puppeteer");

    Fix: I am not sure about this at all, but possibly

    delete navigator.__proto__.webdriver;
  4. New detection method of navigator.languages
    if(!!Object.getOwnPropertyDescriptor(navigator, "languages"))
    console.log("This is puppeteer");
  5. console.debug
    if((console.debug+"").includes("return"))
    console.log("This is puppeteer");
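
As a rough illustration of why check 4 works, here is a minimal sketch that runs in plain Node. `FakeNavigator` is a hypothetical stand-in for the browser's Navigator interface, and the "naive patch" is an assumption about how an override might be applied, not the actual stealth plugin code.

```javascript
// Stand-in for the browser's Navigator interface (hypothetical; Node has no DOM).
// As in Chrome, "languages" is an accessor on the prototype, not on the instance.
class FakeNavigator {
  get languages() {
    return ["en-US", "en"];
  }
}

// Check 4 from the list: does the instance itself carry a "languages" property?
const hasOwnLanguages = (nav) =>
  Object.getOwnPropertyDescriptor(nav, "languages") !== undefined;

const untouched = new FakeNavigator();

// Naive patch (assumed for illustration): the override is defined on the instance.
const naivelyPatched = new FakeNavigator();
Object.defineProperty(naivelyPatched, "languages", {
  get: () => ["en-US", "en"],
});

console.log(hasOwnLanguages(untouched));      // false
console.log(hasOwnLanguages(naivelyPatched)); // true
// The returned values are identical, so only the property's location differs.
console.log(naivelyPatched.languages.join() === untouched.languages.join()); // true
```

The values match in both cases; the only observable difference is where the property lives.
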
Bllacky commented 4 years ago

Not sure if this is correct, but I tried this version of removing navigator.webdriver: const newProto = Object.getPrototypeOf(navigator); delete Object.getPrototypeOf(navigator).webdriver; Object.setPrototypeOf(navigator,newProto);

But I think I might be doing something wrong.

JimmyLaurent commented 4 years ago

Other ones used by Cloudflare:

Array.isArray(navigator.plugins); // true with stealth hacks otherwise false
Array.isArray(navigator.mimeTypes); // true with stealth hacks otherwise false
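
A minimal sketch of why this check can succeed, runnable in Node. The mock objects below are hypothetical and only assume that an evasion builds its fake plugin list on top of a real array; they are not taken from the plugin's source.

```javascript
// Hypothetical mock built on a real array, as a naive evasion might do.
const arrayBackedPlugins = Object.assign(["Chrome PDF Plugin"], {
  namedItem: () => null,
});

// Wrapping the array in a Proxy does not help: Array.isArray looks through proxies.
const proxiedPlugins = new Proxy(arrayBackedPlugins, {});

// Array-like object: indexed keys and a length, but no Array internals.
const arrayLikePlugins = { 0: "Chrome PDF Plugin", length: 1 };

console.log(Array.isArray(arrayBackedPlugins)); // true
console.log(Array.isArray(proxiedPlugins));     // true
console.log(Array.isArray(arrayLikePlugins));   // false
```
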
brunogaspar commented 4 years ago

@Sesamestrong Thanks for the tricks, here's some feedback:

  1. document.createElement

Yields false for me.

  2. window dimensions

This behaves the other way around for me, hehe: in one of my bots this actually yields false, but in the real browser it yields true, so it doesn't seem that accurate.

  3. New detection method of navigator.webdriver

I have this applied on my end as well, but sometimes it doesn't work; I haven't dug much into it.

  4. New detection method of navigator.languages

Yields false for me.

  5. console.debug

Yields false for me.

--

What versions are you running these on?

brunogaspar commented 4 years ago

@JimmyLaurent

Other ones used by Cloudflare:

Array.isArray(navigator.plugins); // true with stealth hacks otherwise false
Array.isArray(navigator.mimeTypes); // true with stealth hacks otherwise false

Hmm, both yield false for me. Are you sure they returned true on your end?

Tried with the stealth plugin enabled and disabled, same results here.

--

What versions are you running these on?

andrew-healey commented 4 years ago

@brunogaspar I am running this on headless puppeteer-extra-plugin-stealth vs. normal Chrome, both on Windows 10. Here is the code I used to test it:

const puppeteer = require('puppeteer-extra');

const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

[true,false].map(async (headless) => {
    const browser = await puppeteer.launch({headless})

    const page = await browser.newPage()
    let fingerprintObj={};
    console.log(headless);
    console.log(await page.evaluate(()=>{
        const detection={};
        try{
              document.createElement("dummy value")
        } catch({stack}) {
              if(stack.split("\n")[1].includes("Object.apply (<anonymous>"))
                detection.createElement=true;
        }
        if(Array.isArray(navigator.plugins))
            detection.navigatorPlugins=true;
        if(Array.isArray(navigator.mimeTypes))
            detection.navigatorMimeTypes=true;
        if(window.outerHeight-window.innerHeight>160&&window.outerWidth-window.innerHeight>160)
            detection.windowInnerHeight=true;
        if(!!Object.getOwnPropertyDescriptor(navigator.__proto__,"webdriver"))
            detection.navigatorProto=true;
        if(!!Object.getOwnPropertyDescriptor(navigator, "languages"))
            detection.navigatorLanguages=true;
        if((console.debug+"").includes("return"))
            detection.consoleDebug=true;
        return detection;
    }));
    await browser.close();
});

Here is the console output:

node index.js
true
{
  createElement: true,
  navigatorPlugins: true,
  navigatorMimeTypes: true,
  navigatorProto: true,
  navigatorLanguages: true,
  consoleDebug: true
}
false
{
  createElement: true,
  navigatorProto: true,
  navigatorLanguages: true,
  consoleDebug: true
}

Here is the result of running the code in DevTools in Chrome: image

I do agree about the window dimensions. The test was in a fingerprinting software I found, and Chrome vs. puppeteer-extra-plugin-stealth at the time showed a discrepancy, but now that I'm testing it, it doesn't work. However, all of the others do seem to work.

JimmyLaurent commented 4 years ago

Code to reproduce:

const puppeteer = require("puppeteer-extra");
const StealthPlugin = require("puppeteer-extra-plugin-stealth");

puppeteer.use(StealthPlugin());

async function runTest(headless) {
    let browser;
    try {
        browser = await puppeteer.launch({ headless });
        const page = await browser.newPage();
        await page.goto("https://bot.sannysoft.com");
        const pluginsIsAnArray = await page.evaluate(() =>
            Array.isArray(navigator.plugins)
        );

        console.log("pluginsIsAnArray:", pluginsIsAnArray);
    } finally {
        if (browser) {
            await browser.close();
        }
    }
}

(async () => {
    await runTest(true);
    await runTest(false);
})();

Chrome version: mac-756035

Package versions:

image

andrew-healey commented 4 years ago

@JimmyLaurent It seems we've posted our code just seconds apart. Also, for what it's worth, this POC uses the following detection method:

function myBotCheck() {
    let err = new Error('test err');
    console.log('err.stack: ', err.stack);
    if (err.stack.toString().includes('puppeteer')) {
        document.getElementById('yesOrNo').innerHTML = 'Yes';
    }
}

function overrideFunction(item) {
    item.obj[item.propName] = (function (orig) {
        return function () {

            myBotCheck();

            let args = arguments;
            let value = orig.apply(this, args);

            return value;
        };

    }(item.obj[item.propName]));
}

overrideFunction({
    propName: 'querySelector',
    obj: document
});
JimmyLaurent commented 4 years ago

@Sesamestrong 😄 Someone posted a hack for this one: https://github.com/berstend/puppeteer-extra/issues/209#issuecomment-642988817

@brunogaspar Can you also give us your setup, please?

momala454 commented 4 years ago

Look at my messages in https://github.com/berstend/puppeteer-extra/issues/218, where I mentioned multiple things that are wrong in puppeteer-extra and must be corrected.

JimmyLaurent commented 4 years ago

@Sesamestrong Check your email inbox

prescience-data commented 4 years ago

I've had some success running all puppeteer commands in their own isolated world, leaving the main context unpolluted:

https://github.com/berstend/puppeteer-extra/issues/224

momala454 commented 4 years ago

Can someone confirm whether you see the same? Is navigator.mimeTypes[0].enabledPlugin crashing only when puppeteer-extra stealth is enabled?

navigator.plugins[0].hasOwnProperty('namedItem') returns true only with stealth.

try { navigator.plugins.namedItem();} catch(e){ console.log(e.stack.toString());} leaks the custom function code too.

navigator.userActivation seems to contain 2 attributes that are false on headless, while they are true on headful.
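
The hasOwnProperty observation can be reproduced outside the browser. This is only a sketch: `FakePlugin` is a hypothetical stand-in for the browser's Plugin interface, and the two mocks are assumed shapes, not the stealth plugin's real objects.

```javascript
// Hypothetical stand-in for the browser's Plugin interface.
class FakePlugin {
  namedItem(name) {
    return null;
  }
}

// Native-like object: the method is inherited from the prototype.
const inherited = new FakePlugin();

// Mock built as a plain object literal: the method becomes an own property.
const literalMock = { namedItem: () => null };

// Mock created from the prototype, so nothing is defined on the instance itself.
const prototypeMock = Object.create(FakePlugin.prototype);

const ownsNamedItem = (obj) =>
  Object.prototype.hasOwnProperty.call(obj, "namedItem");

console.log(ownsNamedItem(inherited));     // false
console.log(ownsNamedItem(literalMock));   // true
console.log(ownsNamedItem(prototypeMock)); // false
```

Defining mocked methods on a prototype rather than on each instance is one way to keep this particular check quiet.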

andrew-healey commented 4 years ago

I am writing a PR right now to solve many of these issues, except for one. I think we should talk more about how to solve this problem:

@JimmyLaurent It seems we've posted our code just seconds apart. Also, for what it's worth, this POC uses the following detection method:

function myBotCheck() {
    let err = new Error('test err');
    console.log('err.stack: ', err.stack);
    if (err.stack.toString().includes('puppeteer')) {
        document.getElementById('yesOrNo').innerHTML = 'Yes';
    }
}

function overrideFunction(item) {
    item.obj[item.propName] = (function (orig) {
        return function () {

            myBotCheck();

            let args = arguments;
            let value = orig.apply(this, args);

            return value;
        };

    }(item.obj[item.propName]));
}

overrideFunction({
    propName: 'querySelector',
    obj: document
});

One person posted a potential solution to this:

Adding this bit of trickery avoids detection:

await page.evaluateOnNewDocument(() => {
  const errors = { Error, EvalError, RangeError, ReferenceError, SyntaxError, TypeError, URIError };
  for (const name in errors) {
    globalThis[name] = (function(NativeError) {
      return function(message) {
        const err = new NativeError(message);
        const stub = {
          message: err.message,
          name: err.name,
          toString: () => err.toString(),
          get stack() {
            const lines = err.stack.split('\n');
            lines.splice(1, 1); // remove anonymous function above
            lines.pop(); // remove puppeteer line
            return lines.join('\n');
          },
        };
        if (this === globalThis) {
          // called as function, not constructor
          stub.__proto__ = NativeError;
          return stub;
        }
        Object.assign(this, stub);
        this.__proto__ = NativeError;
      };
    })(errors[name]);
  }
});

This solution is inadequate, as is automatically removing any line containing the word "puppeteer". One detection method applies to both theoretical evasions:

const isHeadless=eval(`
new Error().stack;
//# sourceURL=__puppeteer_evaluation_script__
`)!==`Error
    at eval (__puppeteer_evaluation_script__:2:1)
    at <anonymous>:1:1`

Of course, this is one of countless similar setups that could be made to confound automatic removal of __puppeteer_evaluation_script__ from stack traces. I am convinced that the solution involves modifying ExecutionContext.prototype._evaluateInternal so that it does not set the sourceURL of evaluated code to be __puppeteer_evaluation_script__. Personally, I think that modifying ExecutionContext.prototype._evaluateInternal somewhere in puppeteer-extra-plugin-stealth is the solution, but I'm not sure where to put it in the filesystem. For now, I'm excluding this one detection method from my PR. Does anybody know where in the code an override for ExecutionContext.prototype._evaluateInternal should be put?
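
To make the point about line scrubbing concrete, here is a small sketch that runs in Node. It assumes V8 uses the sourceURL comment when naming eval frames in stack traces; `naiveScrub` is a hypothetical evasion written only for this illustration.

```javascript
// The marker name quoted in the thread.
const MARKER = "__puppeteer_evaluation_script__";

// A page can label its own eval'd code with the same sourceURL.
const pageStack = eval(`new Error().stack\n//# sourceURL=${MARKER}`);

// Hypothetical evasion: drop every stack line that mentions the marker.
const naiveScrub = (stack) =>
  stack
    .split("\n")
    .filter((line) => !line.includes(MARKER))
    .join("\n");

const scrubbedStack = naiveScrub(pageStack);

// The page knows what its own stack should look like, so a missing
// frame is itself a signal that something rewrote the trace.
const tamperingVisible = scrubbedStack !== pageStack;

console.log(pageStack.includes(MARKER));
console.log(tamperingVisible);
```

Because the page controls the sourceURL of its own code, filtering by name cannot separate automation frames from page frames, which is why avoiding the label at the source looks more promising.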

andrew-healey commented 4 years ago

Can someone confirm whether you see the same? Is navigator.mimeTypes[0].enabledPlugin crashing only when puppeteer-extra stealth is enabled?

navigator.plugins[0].hasOwnProperty('namedItem') returns true only with stealth.

try { navigator.plugins.namedItem();} catch(e){ console.log(e.stack.toString());} leaks the custom function code too.

navigator.userActivation seems to contain 2 attributes that are false on headless, while they are true on headful.

Yes; these things are in need of improvement, as some of the setup of navigator.mimeTypes and navigator.plugins is problematic; for example, it fails a test that @JimmyLaurent describes:

Other ones used by Cloudflare:

Array.isArray(navigator.plugins); // true with stealth hacks otherwise false
Array.isArray(navigator.mimeTypes); // true with stealth hacks otherwise false

About custom error traces and detecting non-native functions generally, I am currently writing a general solution to this (pretty common) problem in puppeteer-extra-plugin-stealth that will be in my PR. I have not seen what you're saying about navigator.userActivation before, so I'm adding that to my PR.

momala454 commented 4 years ago

Another thing: chrome.runtime is undefined when not on https, so one has to be meticulous when handling some of the functionalities.

Edit: on Linux the Bluetooth API only works if the flag --enable-experimental-web-platform-features is set, while on Windows it doesn't need this flag. However, enabling this flag will probably introduce other differences between Linux and Windows, so I think we should write our own Bluetooth API (one that reports that no Bluetooth is available).

I hope I'm not just helping the people detecting headless Chrome ;)

andrew-healey commented 4 years ago

I think that the polyfill for chrome as it stands right now is pretty weak; the functions do not appear native, chrome.webstore still exists, chrome.csi and chrome.loadTimes do not appear native, and the list goes on. I would appreciate a more comprehensive list of issues with chrome in general, as I am fixing as much as I can and am sure that I'm missing some things. One step towards good stealth is a good way to modify builtin functions so that the modified function is indistinguishable from the original. Here's my code for doing so:

const whitelist=new Map();
const oldToString=Function.prototype.toString;
const handleError=(func,lineNumsToRemove)=>{
    try{
        return func();
    } catch (err) {
        const lines=err.stack.split("\n");
        err.stack=lines.filter((line,idx)=>!lineNumsToRemove.includes(idx)).join("\n");
        throw err;
    }
};
const nativeFunction=(self,name,func)=>{
    console.log(self,name);
    const old=self[name];
    const stringified=self[name]+"";
    let isStrict=true;
    try{
        old.arguments;
        isStrict=false;
    } catch {}
    let ret;
    if(isStrict){
        'use strict';
        ret=function(...args){
            handleError(old.bind(this),[1,2]);
            return func(...args);
        };
    }
    else {
        ret=function(...args){
            handleError(old.bind(this),[1,2]);
            return func(...args);
        };
    }
    Object.defineProperties(ret,Object.getOwnPropertyDescriptors(old));
    whitelist.set(ret,stringified);
    return ret;
};
const redefineNativeGetter=(obj,prop,func)=>{
    obj.__defineGetter__(prop,nativeFunction(Object.getOwnPropertyDescriptor(obj,prop),"get",func));
};
'use strict';
function toString(){
    const original = handleError(oldToString.bind(this),[2,3]);
    if(whitelist.has(this)) return whitelist.get(this);
    return original;
};
Function.prototype.toString=toString;
whitelist.set(Function.prototype.toString,"function toString() { [native code] }");

So, for example, I would use this to modify navigator.languages:

redefineNativeGetter(navigator.__proto__,"languages",()=>Object.freeze(["en-US","en"]));

Does anybody know of a way to detect that this function has been used? I want to make sure that it's undetectable.
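
One possible check, offered as a sketch rather than a claim about what detection services actually use: a wrapper written with the `function` keyword carries an own, non-configurable `prototype` property, while built-in getters and methods do not. The example uses the native `Map.prototype.size` getter as a stand-in for a browser getter so that it runs in Node.

```javascript
// A native getter to imitate (stand-in for a browser getter).
const nativeGetter = Object.getOwnPropertyDescriptor(Map.prototype, "size").get;

// Wrapper written with the `function` keyword, like `ret` in nativeFunction above.
const functionWrapper = function (...args) {
  return Reflect.apply(nativeGetter, this, args);
};

// Wrapper written as a method: methods are not constructors and have no prototype.
const methodWrapper = {
  size(...args) {
    return Reflect.apply(nativeGetter, this, args);
  },
}.size;

const hasOwnPrototype = (fn) =>
  Object.prototype.hasOwnProperty.call(fn, "prototype");

console.log(hasOwnPrototype(nativeGetter));    // false
console.log(hasOwnPrototype(functionWrapper)); // true
console.log(hasOwnPrototype(methodWrapper));   // false

// The leftover property cannot simply be deleted afterwards.
console.log(
  Object.getOwnPropertyDescriptor(functionWrapper, "prototype").configurable
); // false
```

Method shorthand, arrow functions, or a Proxy around the original function avoid this specific leftover.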

berstend commented 4 years ago

@Sesamestrong thanks for the research on optimizing the stealth functionality. 👍

Quick note: The various stealth plugins differ in quality (as they've been added over the course of 3 years) and use different ways to accomplish often the same thing (some with drawbacks of e.g. not overwriting toString correctly).

It could be useful to create a set of shared/common functions to monkey patch or extend missing functionality going forward.

I'm currently focussed on updating the project in general and adding contributors to the mix, once that more foundational work is done the focus will shift more towards optimizing stealth. :-)

momala454 commented 4 years ago

This is what I have modified for window.chrome. It's just a small improvement over what is currently in puppeteer-extra, but it fixes TikTok by implementing a dummy chrome.runtime.connect:

if (!window.chrome) {
        /*const installer = {
            install() {}
        }*/
        window.chrome = {
            app: {
                isInstalled: false,
                InstallState: {
                    DISABLED: 'disabled',
                    INSTALLED: 'installed',
                    NOT_INSTALLED: 'not_installed'
                },
                RunningState: {
                    CANNOT_RUN: 'cannot_run',
                    READY_TO_RUN: 'ready_to_run',
                    RUNNING: 'running'
                },
                getDetails: function() {}.bind(function () {}),
                getIsInstalled: function() {}.bind(function () {}),
                installState: function() {}.bind(function () {}),
                runningState: function() {}.bind(function () {}),

            },
            /*webstore: {
                onInstallStageChanged: {},
                onDownloadProgress: {},
                install(url, onSuccess, onFailure) {
                    installer.install(url, onSuccess, onFailure)
                }
            },*/
            csi: function() {}.bind(function () {}),// must be implemented
            loadTimes: function() {}.bind(function () {})// must be implemented
        }
    }

    if (!window.chrome.runtime) {
        window.chrome.runtime = {
            PlatformOs: {
                MAC: 'mac',
                WIN: 'win',
                ANDROID: 'android',
                CROS: 'cros',
                LINUX: 'linux',
                OPENBSD: 'openbsd'
            },
            PlatformArch: {
                ARM: 'arm',
                X86_32: 'x86-32',
                X86_64: 'x86-64',
                MIPS: 'mips',
                MIPS64: 'mips64'
            },
            PlatformNaclArch: {
                ARM: 'arm',
                X86_32: 'x86-32',
                X86_64: 'x86-64',
                MIPS: 'mips',
                MIPS64: 'mips64'
            },
            RequestUpdateCheckStatus: {
                THROTTLED: 'throttled',
                NO_UPDATE: 'no_update',
                UPDATE_AVAILABLE: 'update_available'
            },
            OnInstalledReason: {
                INSTALL: 'install',
                UPDATE: 'update',
                CHROME_UPDATE: 'chrome_update',
                SHARED_MODULE_UPDATE: 'shared_module_update'
            },
            OnRestartRequiredReason: {
                APP_UPDATE: 'app_update',
                OS_UPDATE: 'os_update',
                PERIODIC: 'periodic'
            },
            connect: function () {
                return {
                    disconnect: function () {}.bind(function () {}),
                    onDisconnect: {
                        addListener: function () {}.bind(function () {}),// must be implemented ?
                        dispatch: function () {}.bind(function () {}),// must be implemented
                        hasListener: function () {}.bind(function () {}),// must be implemented
                        hasListeners: function () {}.bind(function () {}), // must be implemented
                        removeListener: function () {}.bind(function () {}),// must be implemented ?
                    },
                    onMessage: {
                        addListener: function () {}.bind(function () {}),// must be implemented ?
                        dispatch: function () {}.bind(function () {}),// must be implemented
                        hasListener: function () {}.bind(function () {}),// must be implemented
                        hasListeners: function () {}.bind(function () {}), // must be implemented
                        removeListener: function () {}.bind(function () {}),// must be implemented ?
                    },
                    postMessage: function () { 
                        if (arguments.length < 1)
                            throw TypeError('Insufficient number of arguments.');
                        throw Error('Attempting to use a disconnected port object');
                    }.bind(function () {}),
                    name: '',
                    sender: undefined
                };
            }.bind(function () {}),
            sendMessage: function () {}.bind(function () {}),// must be implemented
            id: undefined,
        }
    }

The .bind() approach is not a good idea: it returns the correct string for .toString() (function () { [native code] }), but .name returns "bound " (the prefix that bind adds) instead of the original function name.

And I don't know how to detect whether we're on https or not, in order to implement chrome.runtime only on https.
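
The `.name` side effect of bind is easy to reproduce in Node. This sketch only demonstrates the naming behaviour; redefining `name` afterwards is shown as one possible workaround and is not claimed to make the function indistinguishable from a native one.

```javascript
// Anonymous function bound the same way as in the window.chrome mock above.
const bound = function () {}.bind(function () {});

console.log(JSON.stringify(bound.name)); // "bound "
console.log(Function.prototype.toString.call(bound).includes("[native code]")); // true

// A named target keeps its name behind the same prefix.
const boundNamed = function getDetails() {}.bind(null);
console.log(boundNamed.name); // bound getDetails

// "name" is a configurable property, so it can be redefined after binding.
const renamed = function () {}.bind(null);
Object.defineProperty(renamed, "name", { value: "getDetails", configurable: true });
console.log(renamed.name); // getDetails
```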

momala454 commented 4 years ago

I think that the polyfill for chrome as it stands right now is pretty weak; the functions do not appear native, chrome.webstore still exists, chrome.csi and chrome.loadTimes do not appear native, and the list goes on. I would appreciate a more comprehensive list of issues with chrome in general, as I am fixing as much as I can and am sure that I'm missing some things. One step towards good stealth is a good way to modify builtin functions so that the modified function is indistinguishable from the original. Here's my code for doing so:

whitelist=new Map();
oldToString=Function.prototype.toString;
handleError=(func,lineNumsToRemove)=>{
    try{
        return func();
    } catch (err) {
        const lines=err.stack.split("\n");
        err.stack=lines.filter((line,idx)=>!lineNumsToRemove.includes(idx)).join("\n");
        throw err;
    }
};
nativeFunction=(self,name,func)=>{
    console.log(self,name);
    const old=self[name];
    const stringified=self[name]+"";
    const ret=function(...args){
        handleError(old.bind(this),[1,2]);
        return func(...args);
    };
    Object.defineProperty(ret,"name",{value:old.name,writable:false,enumerable:false,configurable:true});
    whitelist.set(ret,stringified);
    return ret;
};
redefineNativeGetter=(obj,prop,func)=>
    obj.__defineGetter__(prop,nativeFunction(Object.getOwnPropertyDescriptor(obj,prop),"get",func));

Function.prototype.toString=function toString(){
    const original = handleError(oldToString.bind(this),[2,3]);
    if(whitelist.has(this)) return whitelist.get(this);
    return original;
};
whitelist.set(Function.prototype.toString,"function toString() { [native code] }");

So, for example, I would use this to modify navigator.languages:

redefineNativeGetter(navigator.__proto__,"languages",()=>["location","language"]);

Does anybody know of a way to detect that this function has been used? I want to make sure that it's undetectable.

The code doesn't work; it throws Uncaught ReferenceError: Cannot access 'oldToString' before initialization.

It works if I replace those 2 lines with:

let whitelist=new Map();
let oldToString=Function.prototype.toString;

however, Object.getOwnPropertyDescriptors(navigator.languages) returns

Array
(
    [0] => Array
        (
            [value] => location
            [writable] => 1
            [enumerable] => 1
            [configurable] => 1
        )

    [1] => Array
        (
            [value] => language
            [writable] => 1
            [enumerable] => 1
            [configurable] => 1
        )

    [length] => Array
        (
            [value] => 2
            [writable] => 1
            [enumerable] =>
            [configurable] =>
        )

)

writable and configurable should be set to false, and for "length" enumerable should be false too. Here is my code that works

Object.defineProperty(navigator.__proto__, 'languages', {
        value: Object.freeze(['en-US', 'en']),
        writable:false,
        enumerable:true,
        configurable:false
    });

which gives

Array
(
    [0] => Array
        (
            [value] => en-US
            [writable] =>
            [enumerable] => 1
            [configurable] =>
        )

    [1] => Array
        (
            [value] => en
            [writable] =>
            [enumerable] => 1
            [configurable] =>
        )

    [length] => Array
        (
            [value] => 2
            [writable] =>
            [enumerable] =>
            [configurable] =>
        )

)
andrew-healey commented 4 years ago

My method works correctly (I have not had trouble with the first two lines, but I have since made other edits and updated the post). In normal Chrome, this is the property descriptor for navigator.__proto__.languages:

image

After running my code, the property descriptor is:

image

momala454 commented 4 years ago

Descriptors, with s

Object.getOwnPropertyDescriptors(navigator.languages)

andrew-healey commented 4 years ago

What do you mean? The function is Object.getOwnPropertyDescriptor, singular. Your proposed method of changing it results in the following: image So it is possible to detect your script just by using the following code:

const isHeadless=!!Object.getOwnPropertyDescriptor(navigator.__proto__,"languages").value;
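
The difference between the two patching styles can be sketched in Node. `makeProto` builds a hypothetical stand-in for Navigator.prototype, and `looksPatched` mirrors the one-line check above; both are illustrative assumptions.

```javascript
// Hypothetical stand-in for Navigator.prototype: "languages" starts as an accessor.
const makeProto = () =>
  Object.defineProperty({}, "languages", {
    get() {
      return ["en-US", "en"];
    },
    enumerable: true,
    configurable: true,
  });

// Same idea as the check above: a data property exposes a "value" field.
const looksPatched = (proto) =>
  !!Object.getOwnPropertyDescriptor(proto, "languages").value;

const nativeLike = makeProto();

// Patch as a data property (value + writable), as in the defineProperty snippet.
const dataPatched = Object.defineProperty(makeProto(), "languages", {
  value: Object.freeze(["en-US", "en"]),
  writable: false,
  enumerable: true,
  configurable: false,
});

// Patch that keeps the accessor shape and only swaps the getter.
const accessorPatched = Object.defineProperty(makeProto(), "languages", {
  get() {
    return Object.freeze(["en-US", "en"]);
  },
});

console.log(looksPatched(nativeLike));      // false
console.log(looksPatched(dataPatched));     // true
console.log(looksPatched(accessorPatched)); // false
```
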
momala454 commented 4 years ago

Object.getOwnPropertyDescriptor and Object.getOwnPropertyDescriptors are two different functions

andrew-healey commented 4 years ago

Yes, they are; the reason that I used Object.getOwnPropertyDescriptor was to show that your method of changing navigator.languages is detectable.

momala454 commented 4 years ago

OK, but your method is detectable using Object.getOwnPropertyDescriptors(navigator.languages)

I think you need to change it to redefineNativeGetter(navigator.__proto__,"languages",()=>Object.freeze(["location","language"]));
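
For reference, this is how freezing changes what Object.getOwnPropertyDescriptors reports for an array. The `summarize` helper is a hypothetical name used only to keep the output short.

```javascript
// Pick out the descriptor fields discussed above.
const summarize = (arr) => {
  const descriptors = Object.getOwnPropertyDescriptors(arr);
  return {
    elementWritable: descriptors[0].writable,
    elementConfigurable: descriptors[0].configurable,
    lengthWritable: descriptors.length.writable,
  };
};

const plain = summarize(["en-US", "en"]);
const frozen = summarize(Object.freeze(["en-US", "en"]));

console.log(plain);  // { elementWritable: true, elementConfigurable: true, lengthWritable: true }
console.log(frozen); // { elementWritable: false, elementConfigurable: false, lengthWritable: false }
```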

andrew-healey commented 4 years ago

Ah, that's what you mean. Sorry; I interpreted your comment as saying that the code snippet alone would fix it. I have fixed the code.

momala454 commented 4 years ago

is it needed to have two "use strict" ?

andrew-healey commented 4 years ago

No; I've removed the first instance of 'use strict';.

momala454 commented 4 years ago

You can also detect headless Chrome on Linux by comparing the different dimensions: Linux has a 15px scrollbar in headless while on Windows it's 17px, and headless doesn't have a URL/title bar, so the sizes returned will differ from headful.

edit: the size of the scrollbar is 15 when not in windows : https://github.com/chromium/chromium/blob/2ca8c5037021c9d2ecc00b787d58a31ed8fc8bcb/ui/gfx/scrollbar_size.cc#L20

Edit: the WebGL vendor evasion can also be detected: WebGLRenderingContext.prototype.getParameter.toString.name returns "bound toString" instead of toString.

andrew-healey commented 4 years ago

Also, it seems that there are no evasions for WebGL2RenderingContext.

momala454 commented 4 years ago

Another important thing: for example, the WebGL evasion script uses this: return Reflect.apply(target, thisArg, args)

So to detect this evasion, you could redefine Reflect.apply, call WebGLRenderingContext.prototype.getParameter() and analyse the stack. Is there any solution other than making a copy of every possible function and using that copy in the evasion scripts?

andrew-healey commented 4 years ago

Well, an evasion could always just define a local variable:

const reflectApply=Reflect.apply.bind(Reflect);
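
A small sketch of that idea, runnable in Node. The "page script" that replaces Reflect.apply is simulated here, and the variable names are made up for the example.

```javascript
// Reference captured early, before any page script has run.
const originalApply = Reflect.apply;
const cachedApply = Reflect.apply;

// Simulated page script: replace Reflect.apply to observe who calls it.
let observedCalls = 0;
Reflect.apply = function (...args) {
  observedCalls += 1;
  return originalApply(...args);
};

// An evasion that looks up Reflect.apply at call time is observed...
const viaGlobal = Reflect.apply(Math.max, null, [1, 2]);
// ...while one that uses the cached reference is not.
const viaCache = cachedApply(Math.max, null, [1, 5]);

// Restore the built-in so the rest of the program is unaffected.
Reflect.apply = originalApply;

console.log(viaGlobal, viaCache); // 2 5
console.log(observedCalls);       // 1
```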

On another note, one thing I found from testing out how to convincingly mock native functions is that in order to be perfectly undetectable, we will have to use the "nuclear option" of overriding all built-in property descriptor functions. I don't want to get into why here, but the nuclear option seems like it requires changing many things and I plan on doing it in a future PR.

momala454 commented 4 years ago

So your function is not properly hiding the redefinition? https://github.com/berstend/puppeteer-extra/issues/239#issuecomment-654338552

berstend commented 4 years ago

On another note, one thing I found from testing out how to convincingly mock native functions is that in order to be perfectly undetectable, we will have to use the "nuclear option" of overriding all built-in property descriptor functions. I don't want to get into why here, but the nuclear option seems like it requires changing many things and I plan on doing it in a future PR.

In order to win the next round of cat & mouse we need to change our approach. We need better ways to mock native APIs and use those consistently throughout all evasions. What we have right now is a grown patchwork that served its purpose in the beginning, but with the popularity of the project it's too easy for the other team to find cracks in it. :-)

So we either develop a set of robust utility functions that are being used by all evasions or as you've mentioned we go "nuclear" and low-level proxy everything.

andrew-healey commented 4 years ago

I am of the opinion that util functions alone will not work. I tested with them for a while and found a few insurmountable problems which can only be solved with the nuclear option.

momala454 commented 4 years ago

What are they? Or do you not want to give a hint to the opposite camp? :)

andrew-healey commented 4 years ago

I do not want to tell the opposite camp. I'm asking @berstend privately, though.

momala454 commented 4 years ago

as i proved that i'm willing to help by giving things that needs to be fixed etc, if you're ok i'm interested to receive privately the information too. Maybe create a private repo to discuss with select people ?

ruimarinho commented 4 years ago

Great analysis by Smitop on methods used by WhiteOps: https://smitop.com/post/whiteops-data/

andrew-healey commented 4 years ago

Here's a new, simpler set of util functions:

  const nativeGetProto=Object.getPrototypeOf(navigator.__lookupGetter__("languages"));
  const proxyToOriginal=new Map;
  /**
   * target - the target to be Proxied
   * handlers - the handlers to apply to the target
   */
  const stealthProxy=(target,handlers)=>{
    const ret=new Proxy(target, handlers);
    proxyToOriginal.set(ret,target);
    return ret;
  };
  /**
   * original - the Function, native function or arrow function to mock
   * func - a Function, native function or arrow function holding the desired logic
   * {
     * hasSideEffects - a Boolean representing if original has side effects. If true, the original is not run to test for errors. If false, the original is run to test for errors.
     * isTrapStyle - a Boolean representing if func will be run with Reflect.apply(func, thisArg, args) or func(target,thisArg,args).
     * modifyProto - a Boolean representing if the prototype of original should also be Proxied.
   }
   */
     const mockFunction = (original, func,{hasSideEffects=false,isTrapStyle=false,modifyProto=true}={}) => {
       const apply=(oldFunc, thisArg, args) => {
         const lineNumsToRemove = [2];
         try {
           if(oldFunc&&!hasSideEffects) Reflect.apply(oldFunc, thisArg, args);
           return isTrapStyle?func(oldFunc,thisArg,args):Reflect.apply(func, thisArg, args);
         } catch (err) {
           const lines = err.stack.split("\n");
           err.stack = lines.filter((line, idx) => !lineNumsToRemove.includes(idx)).join("\n");
           throw err;
         }
       };
       const handlers={
         apply
       };
       // If modifyProto is true, override the prototype as well
       const toProxy=modifyProto?Object.setPrototypeOf(original,stealthProxy(Object.getPrototypeOf(original),handlers)):original;
       // Override the function itself
       return stealthProxy(toProxy,handlers);
     };

  const redefineFunction = (obj, prop, func, options) =>
    obj[prop] = mockFunction(obj[prop], func, options);
  const redefineGetter = (obj, prop, func, options) =>
    obj.__defineGetter__(prop, Object.setPrototypeOf(mockFunction(Object.getOwnPropertyDescriptor(obj,prop).get, func, options),nativeGetProto));

  [Function,Object].forEach(className=>
    redefineFunction(className.prototype,"toString",function(target,thisArg,args){
      return Reflect.apply(target,proxyToOriginal.has(thisArg)?proxyToOriginal.get(thisArg):thisArg,arguments);
    },{
      isTrapStyle:true
    })
  );
redefineGetter(navigator.__proto__,"languages",()=>Object.freeze(["en-US","en"]))

Again, I would appreciate any feedback.

momala454 commented 4 years ago

__lookupGetter__ is deprecated; maybe use Object.getOwnPropertyDescriptor() or Object.getPrototypeOf() instead? About your example, it should be 'en-US', 'en', not EN-us.

andrew-healey commented 4 years ago

@momala454

  1. Fixed
  2. Fixed

Todo: toString is acting up again

momala454 commented 4 years ago

This calls both the original function and the new one, so it does not actually "replace" the function.

redefineFunction(window,"alert",(...a)=>console.log(...a));

will not prevent the alert
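
The behaviour can be reproduced with a reduced version of the apply trap. This sketch models only the `hasSideEffects` option from mockFunction; the `original` and `replacement` functions are placeholders standing in for something like window.alert.

```javascript
// Record which functions actually run.
const calls = [];
const original = (msg) => { calls.push(`original:${msg}`); };
const replacement = (msg) => { calls.push(`replacement:${msg}`); };

// Reduced form of mockFunction (assumption: only hasSideEffects is modelled).
const mockFunction = (target, func, { hasSideEffects = false } = {}) =>
  new Proxy(target, {
    apply(oldFunc, thisArg, args) {
      // By default the original is run first to surface its errors.
      if (!hasSideEffects) Reflect.apply(oldFunc, thisArg, args);
      return Reflect.apply(func, thisArg, args);
    },
  });

// Default options: the original still runs, so an alert would still appear.
mockFunction(original, replacement)("a");

// With hasSideEffects set, only the replacement runs.
mockFunction(original, replacement, { hasSideEffects: true })("b");

console.log(calls); // [ 'original:a', 'replacement:a', 'replacement:b' ]
```

With the utilities as posted, passing `{hasSideEffects:true}` as the options argument to redefineFunction would be the way to skip the original call.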

prescience-data commented 4 years ago

Great analysis by Smitop on methods used by WhiteOps: https://smitop.com/post/whiteops-data/

There is a ton of stuff here that should be addressed... Nice find.

berstend commented 4 years ago

@Sesamestrong my impression is that Proxies are the way to go: they have the stated intention that their presence is undetectable from within the same JS context, and they are able to intercept all ways of interacting with objects.

As an aside, have a look at this MDN doc: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Proxy

And scroll to "A complete traps list example".

JS Proxies support a bunch of different traps, get or apply is just scratching the surface. In case a trap isn't defined the default behaviour is to pass things through to the target (which we usually don't want), therefore it'd make sense (especially for a shared utility function) to define any and all traps for maximum control.
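
As a rough sketch of what "define any and all traps" could look like, the handler below forwards every trap to the matching Reflect method and records which ones fire. The `greet` target and the `seen` log are made up for the example; a real utility would add its own logic per trap.

```javascript
// The 13 proxy traps; each has a Reflect method with the same name.
const trapNames = [
  "apply", "construct", "defineProperty", "deleteProperty", "get",
  "getOwnPropertyDescriptor", "getPrototypeOf", "has", "isExtensible",
  "ownKeys", "preventExtensions", "set", "setPrototypeOf",
];

// Record each trap that fires, then forward to the default behaviour.
const seen = [];
const handler = {};
for (const name of trapNames) {
  handler[name] = (...args) => {
    seen.push(name);
    return Reflect[name](...args);
  };
}

const target = function greet(who) {
  return `hi ${who}`;
};
const proxied = new Proxy(target, handler);

const greeting = proxied("there"); // apply
const readName = proxied.name;     // get
const hasLength = "length" in proxied; // has
Object.keys(proxied);              // ownKeys (plus descriptor lookups)

console.log(greeting, readName, hasLength); // hi there greet true
console.log([...new Set(seen)]);
```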

I once had a nice overview of traps and what respectively triggers them (e.g. toString) but can't seem to find it right now.

edit, link to ECMAscript specification: https://www.ecma-international.org/ecma-262/9.0/#sec-proxy-object-internal-methods-and-internal-slots

edit2, I not only think that Proxies are the way to go to modify puppeteer-revealing stacktraces but basically as the underpinning of virtually all detection evasion techniques

berstend commented 4 years ago

Also, it seems that there are no evasions for WebGL2RenderingContext.

Addressed in #256

berstend commented 4 years ago

90% of the issues mentioned here have been fixed today :-)

Published as puppeteer-extra-plugin-stealth@2.5.0

What's left from this issue is optimizing chrome.runtime (needs more spoofing) and navigator.plugins (needs better spoofing so that e.g. the Array.isArray tests return false).

berstend commented 4 years ago

I'm in the process of rewriting the window.chrome evasions (and navigator.plugins afterwards) to be fully functional and native appearing mocks.

The process can be tracked in #292 - chrome.runtime.sendMessage and chrome.runtime.connect are already finished.

berstend commented 4 years ago

Everything mentioned in here has been fixed in puppeteer-extra-plugin-stealth@2.6.1.