abrahamjuliot / creepjs

Creepy device and browser fingerprinting
MIT License

New detecting techniques #186

Closed Onefivefournine closed 2 years ago

Onefivefournine commented 2 years ago

I don't have enough data on platform/device differences for these APIs, but I know they are used to deeply fingerprint clients. Please take a look; I hope to see them implemented (if they aren't already):

abrahamjuliot commented 2 years ago

Thank you for recommending these. That's a lot of cool stuff. I will take a look.

Some early notes:

vis2021t commented 2 years ago

Hi buddy, I have been exploring your repo every day, trying to learn more so I can contribute and maybe work with you someday. Apart from that, following up on [Onefivefournine]'s suggestions, I looked around and found a demo for detecting the JS version. Mine is reported as 1.7, but I am curious: what is the use of this enumeration in terms of bot detection?

jsfiddle link: http://jsfiddle.net/Ac6CT/

I am also curious about your current research. Is there a way to keep in contact with you and follow it?

Maybe something like a Discord channel?

Onefivefournine commented 2 years ago

The spec I was talking about: https://mimesniff.spec.whatwg.org/#javascript-mime-type

Currently I see that Chrome 103 returns 1.7, but Firefox 102 returns 1.5. I also tested Chromium 105, and it returns 1.5.
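For reference, here is a minimal sketch of the list behind that spec link. The set below is the spec's list of JavaScript MIME type essences; the two helper names (`isJavaScriptMimeEssence`, `highestVersionedLabel`) are hypothetical, and stripping parameters before matching is a convenience of this sketch. How the linked fiddle actually probes the version is not shown here, so `accepts` is only a stand-in for whatever probe a page uses (for example, injecting a script with a given type and checking whether it ran).

```javascript
// JavaScript MIME type essences as listed in the WHATWG MIME Sniffing spec.
const JAVASCRIPT_MIME_ESSENCES = new Set([
  'application/ecmascript',
  'application/javascript',
  'application/x-ecmascript',
  'application/x-javascript',
  'text/ecmascript',
  'text/javascript',
  'text/javascript1.0',
  'text/javascript1.1',
  'text/javascript1.2',
  'text/javascript1.3',
  'text/javascript1.4',
  'text/javascript1.5',
  'text/jscript',
  'text/livescript',
  'text/x-ecmascript',
  'text/x-javascript',
]);

// Hypothetical helper: case-insensitive match on the essence,
// ignoring any parameters such as "; charset=utf-8".
function isJavaScriptMimeEssence(type) {
  const essence = String(type).split(';')[0].trim().toLowerCase();
  return JAVASCRIPT_MIME_ESSENCES.has(essence);
}

// Hypothetical helper: highest "text/javascript1.x" label that the
// supplied probe function `accepts` reports as working, or null.
function highestVersionedLabel(accepts) {
  let best = null;
  for (let minor = 0; minor <= 9; minor++) {
    if (accepts(`text/javascript1.${minor}`)) best = `1.${minor}`;
  }
  return best;
}
```

Run against the spec list itself, `highestVersionedLabel(isJavaScriptMimeEssence)` gives `'1.5'`, since the list has no versioned label above `text/javascript1.5`. A browser that reports something higher is accepting labels outside this list.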

As for serialPort: like some other techniques, it only works on HTTPS origins.
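A minimal sketch of guarding for that: `navigator.serial` (Web Serial) is only exposed in secure contexts, so a probe can tell "page is not on HTTPS" apart from "browser has no Web Serial". The function name `serialAvailability` and its return labels are made up for illustration, and `env` stands in for the page's global object so the check can be exercised outside a browser.

```javascript
// Hypothetical helper. Pass `window` (or `globalThis`) as `env` in a page.
function serialAvailability(env) {
  // Web Serial is gated on secure contexts (HTTPS, localhost).
  if (!env.isSecureContext) return 'insecure-context';
  // Secure context, but the browser does not expose the API.
  if (!env.navigator || !('serial' in env.navigator)) return 'unsupported';
  return 'available';
}
```

In a page this would be called as `serialAvailability(window)`. Keeping the three outcomes separate matters for fingerprinting, because only the last two say anything about the browser itself.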

vis2021t commented 2 years ago

I understand what you meant. Are you on Telegram or Discord? I would like a friendly person to research things with, out of curiosity.

abrahamjuliot commented 2 years ago

@vis2021t That's cool you are interested in this stuff. You might like this manuscript and paper. A lot of research there.

I am mostly active on GitHub. Feel free to reach out to me either in this repo or anything open source.

vis2021t commented 2 years ago

Thanks, buddy. I ran your GitHub-hosted site through Google's mobile-friendly check, and its bots were easily detected as lying. I noticed that it was specifically the worker-based checks that caught them lying.

vis2021t commented 2 years ago

A few images I took:

vis2021t commented 2 years ago

[Two screenshots taken in Kiwi Browser: Screenshot_20220711-000414, Screenshot_20220711-000440]

Onefivefournine commented 2 years ago

Just FYI, I will add new techniques to this issue that I found while researching some trackers. You can implement them or drop them; it is your repo. I just want to share some info :)

abrahamjuliot commented 2 years ago

new techniques

That is awesome. New techniques are always appreciated. I will check them out.

Google mobile web friendly check

Looks like Linux emulating Android. Definitely a good Googlebot. TLS fingerprinting is a good way to determine if behavior like that is a real Googlebot. Very interesting topic.
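As one concrete illustration of TLS fingerprinting, here is a sketch of the widely used JA3 string format. This is a swapped-in example, not necessarily the method meant above. It assumes the ClientHello has already been parsed server-side into the hypothetical `hello` object shown. JA3 proper is the MD5 hash of this string; hashing is left out to keep the sketch dependency-free.

```javascript
// GREASE values (RFC 8701) look like 0x0A0A, 0x1A1A, ... 0xFAFA.
// JA3 ignores them because clients insert them at random.
function isGrease(value) {
  return (value & 0x0f0f) === 0x0a0a && (value >> 8) === (value & 0xff);
}

// Builds "version,ciphers,extensions,curves,pointFormats" with decimal
// values, "-" inside a field and "," between fields.
// `hello` is a hypothetical, already-parsed ClientHello.
function ja3String(hello) {
  const join = (values) => values.filter((v) => !isGrease(v)).join('-');
  return [
    hello.version,
    join(hello.ciphers),
    join(hello.extensions),
    join(hello.curves),
    join(hello.pointFormats),
  ].join(',');
}

// Illustrative input only, not captured from a real client.
const example = ja3String({
  version: 771,
  ciphers: [0x0a0a, 4865, 4866],
  extensions: [0, 0x1a1a, 10, 11],
  curves: [0x2a2a, 29, 23],
  pointFormats: [0],
});
// example === '771,4865-4866,0-10-11,29-23,0'
```

The idea is that a client claiming to be a mobile browser in its user agent still negotiates TLS with whatever stack it really runs on, so the two can be compared.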

vis2021t commented 2 years ago

Indeed. I will be researching real-world bots and their detection further, but I have noticed one thing in common: the worker scope lie is where these bots get detected, since they pretend to be a real device rather than an emulated one. Do you have resources on TLS fingerprinting? As you mentioned, I assume it could be a great help in this area.

abrahamjuliot commented 2 years ago

worker scope

Worker scope in Chrome detects a lot too. In Chrome, device emulation alters the dedicated worker scope, but not shared or service worker scopes. Service workers have better support and run faster, so we can detect a lot there, but some bots try to fudge values in service workers too... requires JS tampering, so we try to detect that. If workers are blocked, that can be a helper in generating a more unique fingerprint. Blocking often creates a better fingerprint when few browsers are doing the same dance.

As far as I know, in Firefox, the device emulator does not affect the worker scope. But, ideally we want to use the dedicated worker scope since service workers can be disabled by a preference. The current method I use is to try service, then shared, and lastly dedicated. But, if we have to use dedicated, Chrome could emulate the user agent and platform. Regardless of the worker scope type, we try to validate the values and look for tampering.
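A rough sketch of that order (service, then shared, then dedicated) and of comparing worker values against the window. This is not the repo's actual code: the names `firstAvailableScope` and `findMismatches` and the `probes` shape are hypothetical, and the probes are synchronous here for brevity, whereas real worker probes would be asynchronous (message passing).

```javascript
// Preference order described above.
const SCOPE_ORDER = ['service', 'shared', 'dedicated'];

// `probes` maps a scope name to a function that returns that scope's
// values, or throws / returns null when the worker type is blocked
// or unsupported.
function firstAvailableScope(probes) {
  for (const type of SCOPE_ORDER) {
    try {
      const values = probes[type] ? probes[type]() : null;
      if (values) return { type, values };
    } catch (_) {
      // Blocked or unsupported: fall through to the next worker type.
    }
  }
  return null; // every worker type failed, which is itself a signal
}

// Lists the keys whose window value disagrees with the worker value.
function findMismatches(windowValues, workerValues, keys = ['userAgent', 'platform']) {
  return keys.filter((key) => windowValues[key] !== workerValues[key]);
}
```

For example, if the window reports an emulated user agent while the chosen worker scope reports the real one, `findMismatches` returns `['userAgent']`. A `null` from `firstAvailableScope` can be recorded as a trait of its own rather than treated as an error.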

vis2021t commented 2 years ago

Hi guys, sorry for being inactive. Haha, looks like I missed some cool stuff.

Can you confirm one thing for me: are all industry-grade bots based on WebDriver?

For example, Googlebot, Yahoo's bot, etc.

abrahamjuliot commented 2 years ago

I'm not sure on the industry standard, but I imagine Google and Yahoo could be using both WebDriver and command line scripts.

vis2021t commented 2 years ago

Would you mind elaborating on the CLI scripts? I am quite unclear about them. Also, I think you are right about them using WebDriver.

I found many useful bugs that deserve a deeper look, and I will share them as soon as I have written them up well enough to present.

abrahamjuliot commented 2 years ago

cli scripts

Here are a few interesting concepts

Onefivefournine commented 2 years ago

Hey, take a look at these articles. Fingerprinting is a science, haha:

https://incolumitas.com/pages/TCP-IP-Fingerprint/
https://incolumitas.com/pages/TLS-Fingerprint/

abrahamjuliot commented 2 years ago

Very nice. I was just recently looking at the repo too. Didn't see the demo page till now.

Following your updates in the post too. The ttl-ping looks very interesting.

HTML entities and Unicode characters are a treasure. They also render using OS fonts, and their measured dimensions can be unique to OS versions.

image
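A minimal sketch of the measuring idea: render each character, record its box size, and join the results into a signature. The names `glyphSignature` and `domMeasure` are hypothetical, the font size is arbitrary, and the measurer is passed in so the logic can be tried without a DOM. `domMeasure` assumes a browser page with a `document.body`.

```javascript
// Builds a signature like "20b9:10x20|1f600:20x20" from the measured
// size of each code point. `measure(text)` returns { width, height }.
function glyphSignature(codePoints, measure) {
  return codePoints
    .map((cp) => {
      const { width, height } = measure(String.fromCodePoint(cp));
      return `${cp.toString(16)}:${width}x${height}`;
    })
    .join('|');
}

// Browser-only measurer: draws the text off-screen in the default
// (OS-provided) font and reads back its bounding box.
function domMeasure(text) {
  const span = document.createElement('span');
  span.style.cssText =
    'position:absolute;visibility:hidden;font-size:200px;white-space:nowrap';
  span.textContent = text;
  document.body.appendChild(span);
  const { width, height } = span.getBoundingClientRect();
  span.remove();
  return { width, height };
}
```

In a page, `glyphSignature([0x20b9, 0x1f600], domMeasure)` would give a string that depends on which fonts the OS falls back to for those characters. Which code points carry the most entropy is not something this sketch tries to answer.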

Added performance.memory, storage quota, battery info, and network info to the status section. I might re-add the keyboard API. It was a performance hog and entropy seemed low.

It's on my mind to dissect Selenium.

I have some browser engine and version detection concepts, but I'm not sure it's necessary since this is revealed in so many areas. It's pretty much impossible to hide the engine and version without causing a mountain of site breakage. Check out https://arkenfox.github.io/TZP/tests/engine.html.

abrahamjuliot commented 2 years ago

Closing for now, but I will revisit some of these techniques.