berstend / puppeteer-extra

💯 Teach puppeteer new tricks through plugins.
https://extra.community
MIT License

Mutation observers #264

Closed (prescience-data closed this issue 4 years ago)

prescience-data commented 4 years ago

This 2018 method of extracting canvas noise vectors is interesting: it looks like it would be trivial to generalize the technique to catch any script injected into the main execution context (and potentially reverse its changes).

https://antoinevastel.com/tracking/2018/07/01/eval-canvasdef.html

The concerning parts:

The technique we present hereafter relies on the fact that browser extension content scripts do not execute in the same execution context as the web page, for security purposes. Thus, to execute the script that overrides the canvas functions in the same context as the web page, Canvas Defender injects a script element into the DOM. The script contains the code to override toDataURL and getImageData. Once the script has been executed, it deletes itself.

const observer = new MutationObserver((mutations) => {
    // Prefix that identifies the script Canvas Defender injects.
    const beginScript = "try{(function overrideDefaultMethods(r, g, b, a,";
    mutations.forEach((mutation) => {
        const node = mutation.addedNodes[0];
        if (mutation.addedNodes.length === 1 &&
            node.text !== undefined &&
            node.text.indexOf(beginScript) > -1) {
            // The noise vector is the first run of four comma-separated 1-2 digit numbers.
            const match = node.text.match(/\d{1,2},\d{1,2},\d{1,2},\d{1,2}/);
            if (match !== null) {
                const noise = match[0].split(",");
                console.log(noise);
            }
        }
    });
});

const config = {childList: true, subtree: true};
observer.observe(document.documentElement, config);

The snippet of code above uses the MutationObserver API to detect when the script injected by Canvas Defender is added to the DOM. Once the script is detected, it can extract the noise vector.
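To make the extraction step concrete, here is a minimal sketch that applies the same regular expression to a plain string, outside of any observer. The helper name `extractNoise` and the sample script text are made up for illustration; the real Canvas Defender payload is not shown in this thread.

```javascript
// Hypothetical helper: pulls the four noise values out of an injected script's text.
function extractNoise(scriptText) {
    // Same pattern as the observer snippet: four comma-separated 1-2 digit numbers.
    const match = scriptText.match(/\d{1,2},\d{1,2},\d{1,2},\d{1,2}/);
    return match === null ? null : match[0].split(",");
}

// Made-up script text (assumption): only the prefix comes from the quoted snippet.
const sampleScript =
    "try{(function overrideDefaultMethods(r, g, b, a, id){/* ... */})(12,7,3,25,'x');}catch(e){}";

const noise = extractNoise(sampleScript);
const none = extractNoise("console.log('no noise here');");

console.log(noise); // array of strings, not numbers
console.log(none);  // null when the pattern is absent
```

Note that the values come back as strings, so a caller that wants numeric offsets would still need to convert them.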

The solution we adopt relies on cloning the toDataURL function before it gets overridden by Canvas Defender. To do so we execute the following code:

const getOriginalFunction = Function.prototype.call.bind(
    Function.prototype.bind,
    Function.prototype.call
);
const oldToDataURL = getOriginalFunction(HTMLCanvasElement.prototype.toDataURL);
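The `getOriginalFunction` helper is terse, so a small sketch may help: `getOriginalFunction(fn)` evaluates to `Function.prototype.call.bind(fn)`, i.e. a function that takes the receiver as its first argument and calls the saved `fn` on it. The example below uses a stand-in prototype object (`FakeCanvasProto`, a hypothetical name) instead of `HTMLCanvasElement.prototype` so it can run outside a browser.

```javascript
// Same helper as in the quoted snippet.
const getOriginalFunction = Function.prototype.call.bind(
    Function.prototype.bind,
    Function.prototype.call
);

// Hypothetical stand-in for HTMLCanvasElement.prototype.
const FakeCanvasProto = {
    toDataURL() { return "original:" + this.id; }
};

// Capture the method before anything overrides it.
const savedToDataURL = getOriginalFunction(FakeCanvasProto.toDataURL);

// Later, some other script replaces the method on the prototype.
FakeCanvasProto.toDataURL = function () { return "spoofed:" + this.id; };

const canvas = Object.create(FakeCanvasProto);
canvas.id = "c1";

const spoofed = canvas.toDataURL();      // goes through the override
const original = savedToDataURL(canvas); // receiver passed as first argument

console.log(spoofed, original);
```

The saved reference only helps if it is captured before the override runs, which is why the ordering of the detection script and the injected script matters.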

Does anyone know if this has been handled since then? I.e., is there a way to hide the injections from the MutationObserver API?

prescience-data commented 4 years ago

Out of interest, is anyone obfuscating the scripts they inject into the main context with something like: https://github.com/javascript-obfuscator/javascript-obfuscator

It doesn't address the actual problem in the method, but I imagine it would make signature matching a damn sight harder?

GJuniarto commented 4 years ago

Right now canvas fingerprint spoofing is starting to get detected by browser validation checks. The Multilogin team (I'm not on the team, lol) is currently researching a new technique called natural canvas fingerprint, whose goal is to spoof the JavaScript/browser fingerprint perfectly. The point is not to mimic a fingerprint but to make it identical, so the server side has no way to detect it, at least for a few years, lol, like when antidetect browsers first appeared.

prescience-data commented 4 years ago

The post wasn't about the canvas part of the problem; it was about mutation observers that monitor scripts injected into the page, search for patterns / signatures, and then undo the changes (i.e. undo the deletion of webdriver, etc.).

My first "quick and dirty" solution is to obfuscate the code being injected, but that adds a ton of overhead at runtime or, at the very least, an annoying build step every time you change your code, so clearly that's not optimal.

prescience-data commented 4 years ago

After testing this here: https://github.com/prescience-data/prescience-data.github.io/blob/master/execution-monitor.html#L32 I no longer believe this is an issue.
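The thread does not say why the test came out clean. One plausible explanation, offered here as an assumption, is that Puppeteer's `page.evaluateOnNewDocument` hands the script to the browser directly rather than appending a script element to the DOM, so a `childList` observer would have no added node to inspect. A minimal sketch of registering an override that way follows; `installOverrides` is a hypothetical name and the `webdriver` override is only an example.

```javascript
// Sketch: register an override through Puppeteer's page.evaluateOnNewDocument
// instead of appending a <script> element to the DOM.
// `page` is assumed to be a Puppeteer Page; `installOverrides` is a hypothetical name.
async function installOverrides(page) {
    await page.evaluateOnNewDocument(() => {
        // Example override only: make navigator.webdriver read as undefined.
        Object.defineProperty(Navigator.prototype, "webdriver", {
            get: () => undefined,
            configurable: true,
        });
    });
}
```

It would be called as `await installOverrides(page)` before `page.goto(...)`, so the override is registered before any page script runs.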