Open searchingforcode opened 2 years ago
Thank you for submitting this issue!
Indeed, support for worker injection is in our scope right now. Unfortunately, injecting code in the (service/shared) worker environment is not as simple as doing it in the "plain browser" environment itself. We are closely monitoring the progress on both automation libraries and will add worker support as soon as it is possible.
https://playwright.dev/docs/api/class-worker#worker-evaluate isn't it?
I think that is why many paid multilogin type service providers use their own modified chromium.
Shitty idea :)
@barjin @wireguard-dev It seems like the biggest issue here is the lack of init script support in service workers. worker-evaluate isn't a good fit because it can only be called after a worker has been spawned, which is too late to override the object prototypes before the worker's scripts get a chance to execute their fingerprinting logic. At best we end up in a race condition, trying to inject custom scripts while the web worker is waiting.
In the past I've solved this problem by intercepting the outgoing requests that fetch the JavaScript backing the web worker. The browser / page context needs to fetch the script before spawning a service worker, so these requests should still be routed through browserContext.route(url, handler[, options]). If we fetch the script from the desired remote and prepend the fingerprinting init script to it, the overridden object primitives should be in place before the service worker has a chance to run.
Since fingerprint-injector already renders this init script as a single string payload (specifically here), it seems like all we need here would be a page.route within attachFingerprintToPlaywright that filters for request.resourceType() or request.serviceWorker(). I'll have to check which one is present in these service worker loading requests.
Does this sound like a reasonable route forward?
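A minimal sketch of the interception idea described above, not the library's actual implementation. The helper names `prependInitScript` and `makeWorkerInjectionHandler` are hypothetical, and so is the `isWorkerRequest` predicate: which of request.resourceType() or request.serviceWorker() identifies these requests is still an open question in this thread, so the caller has to supply that check. The handler relies only on Playwright's route.request(), route.continue(), route.fetch() and route.fulfill().

```javascript
// Hypothetical helper: puts the init script in front of the worker's own source.
// The ";" guards against an init script that does not end with a semicolon.
function prependInitScript(initScript, workerSource) {
  return `${initScript};\n${workerSource}`;
}

// Hypothetical factory: returns a handler that can be passed to
// page.route() or browserContext.route().
// `isWorkerRequest` is an assumption: a caller-supplied predicate that decides
// whether a given request is loading a worker script.
function makeWorkerInjectionHandler(initScript, isWorkerRequest) {
  return async (route) => {
    const request = route.request();
    if (!isWorkerRequest(request)) {
      // Not a worker script: let the request through untouched.
      await route.continue();
      return;
    }
    // Fetch the real script, then answer with the patched body while
    // keeping the original status and headers.
    const response = await route.fetch();
    const original = await response.text();
    await route.fulfill({
      response,
      body: prependInitScript(initScript, original),
    });
  };
}

// Possible usage (untested against real worker requests):
//   await context.route('**/*.js',
//     makeWorkerInjectionHandler(initScript, (req) => req.serviceWorker() !== null));
```

Taking the predicate as a parameter keeps the open question (resource type versus service worker check) out of the handler itself, so either filter can be tried without touching the injection logic.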
Thank you @piercefreeman (and sorry @wireguard-dev for taking this long :) )! Exactly as you say, the injection needs to take place before all other JS execution.
In fact, there is a (somewhat stale) branch here (see the last commit) following the exact approach you described. Feel free to merge it and try it out locally. This does perform "worker injection", but unfortunately doesn't capture ServiceWorker or SharedWorker requests (most likely due to Playwright limitations).
Without stable support from PW/PP, I cannot really give any estimates on this as of now. The partial support (for regular Workers only) still needs some testing (mainly performance-wise), but it could make it to npm somewhat soon :)
@barjin I wrote a MITM proxying library to work around chromium issues where it won't intercept some traffic. It should support that ServiceWorker/SharedWorker use case over both http and https. If you're interested in giving it a spin: https://github.com/piercefreeman/grooveproxy
And for services that use a blob: what will you do? These are no longer requests
@wireguard-dev Can you elaborate a bit on what service blobs you're referring to?
The proxy server should capture all packets that are going out the network, though. Just like providing a third party proxy server in chromium or requests - it'll route all packets through the middle layer. So in theory it should be possible to manipulate them and mock them with fingerprint injection logic. If you have more info on the context I'll be able to be more specific.
Cloudflare uses blob: with SharedWorker. The blob is essentially hard-coded in the main JS file that creates the SharedWorker.
@wireguard-dev This case gets us a bit outside the scope of the original thread, which was doing request interception over the network from shared workers.
For this particular case I think the answer is to pre-inject an override to the SharedWorker constructor, via the parent code that launches the shared worker in the first place. If a bytes payload is provided then it will prepend the custom logic to the base64 encoded payload. Either that or some future chromium or playwright support that allows you to inspect the script of the shared worker before processing begins.
Do you have a link to this cloudflare javascript?
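A rough sketch of the constructor override suggested above, under several assumptions. The names `decodeDataUrl`, `injectIntoDataUrl` and `patchWorkerConstructor` are hypothetical. Base64 data: URLs are decoded and rebuilt with the init script in front. For blob: URLs the sketch does not read the blob; it creates a wrapper blob that runs the init script and then calls importScripts() on the original URL, which only applies to classic (non-module) workers and has not been verified against Cloudflare's challenge code.

```javascript
// Hypothetical helper: extracts the script text from a data: URL
// (base64 or percent-encoded payload).
function decodeDataUrl(url) {
  const comma = url.indexOf(',');
  const meta = url.slice(5, comma); // text between "data:" and ","
  const payload = url.slice(comma + 1);
  if (/;base64$/i.test(meta)) {
    // atob yields a binary string; decode it as UTF-8.
    const bytes = Uint8Array.from(atob(payload), (c) => c.charCodeAt(0));
    return new TextDecoder().decode(bytes);
  }
  return decodeURIComponent(payload);
}

// Hypothetical helper: rebuilds the data: URL with the init script first.
function injectIntoDataUrl(initScript, url) {
  const source = `${initScript};\n${decodeDataUrl(url)}`;
  return `data:text/javascript;charset=utf-8,${encodeURIComponent(source)}`;
}

// Hypothetical helper: replaces scope[name] (e.g. window.SharedWorker) with a
// wrapper that rewrites data: and blob: script URLs before constructing.
function patchWorkerConstructor(scope, name, initScript) {
  const Original = scope[name];
  scope[name] = function PatchedWorker(url, options) {
    let target = String(url);
    if (target.startsWith('data:')) {
      target = injectIntoDataUrl(initScript, target);
    } else if (target.startsWith('blob:')) {
      // Assumption: a classic worker, where importScripts() is available.
      const wrapper = new scope.Blob(
        [`${initScript};\nimportScripts(${JSON.stringify(target)});`],
        { type: 'text/javascript' },
      );
      target = scope.URL.createObjectURL(wrapper);
    }
    // Returning an object from a constructor makes `new` yield that object.
    return new Original(target, options);
  };
  scope[name].prototype = Original.prototype;
}
```

The patch itself would have to run in the page before any site script, for example as a page-level init script. One caveat of this design: a SharedWorker is matched by its URL and name, so rewriting the URL into a fresh blob: URL in each page likely prevents pages from sharing the same worker instance.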
Just check websites with the Cloudflare JS challenge enabled, and you will see.
@barjin any updates?
Any update?
Thank you everyone for the great ideas! (and sorry for taking so long, again). Long story short - injecting the browser data into workers using the base64 encoded payloads seems more than doable. I'll do some research and will get back to you all with the results.
As for the 'regular' workers, i.e. the ones initiating network requests - every way of injecting those (that I can come up with) results in a performance downgrade. @piercefreeman's proxy library looks great - make sure to check it out! - but unfortunately, we cannot use it in fingerprint-suite just yet - we have to keep our dependencies slim. Our main downstream library crawlee handles advanced networking on its own - and introducing another proxy manipulating the requests might easily result in some (potentially fatal) Mexican standoff situation :)
Any update?
CreepJS gives a rating of F- (the lowest). Could anyone share how to improve the rating?
const context = await newInjectedContext(browser, {
    // Constraints for the generated fingerprint (optional)
    // Playwright's newContext() options (optional, random example for illustration)
    newContextOptions: {
        geolocation: {
            latitude: 51.50853,
            longitude: -0.12574
        }
    }
})
const page = await context.newPage()
await page.goto("https://abrahamjuliot.github.io/creepjs/")
Hi! Do you have any resources about how to modify Chromium? I'm trying to build software exactly like Multilogin.
Hi, CreepJS can detect the real info using workers. Is there any solution for that?