Closed Onefivefournine closed 2 years ago
Thank you for recommending these. That's a lot of cool stuff. I will take a look.
Some early notes:
SerialPort.prototype: Some APIs, such as SerialPort, are only available on desktop, but we could use AbortSignal.timeout to capture Chrome 103 on both Android and desktop. Edit: I forgot I'm using Element.role to detect Chrome 103.
BatteryManager.charging: This is beneficial for session-based fingerprinting, similar to storage quota, heap memory, network speed, performance timing, viewport size, and client-triggered events. I might add a "Status" section focused on these.
Hi buddy, I have been exploring your repo every day and trying to learn more so I can contribute and maybe work with you someday. Apart from that, following [Onefivefournine]'s suggestion, I looked around and found a demo for detecting the JS version. Mine is 1.7, as it states, but I am curious: what is the use of this enumeration in terms of bot detection?
jsfiddle link: http://jsfiddle.net/Ac6CT/
I am also curious about your current research. Is there a way to stay in contact with you and follow it? Maybe something like a Discord channel?
The spec I was talking about: https://mimesniff.spec.whatwg.org/#javascript-mime-type
Currently I see Chrome 103 returns 1.7, but Firefox 102 returns 1.5. I also tested on Chromium 105, and it returns 1.5.
As for SerialPort: like some other techniques, it only works on HTTPS origins.
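Here is a minimal sketch of the SerialPort check with the HTTPS restriction in mind. The function name `probeSerialPortForget` and its return labels are made up for illustration and are not from the repo. It takes the global object as a parameter so it can be tried outside a browser, and it assumes that a missing `SerialPort` on an insecure origin should be reported separately rather than read as a version signal.

```javascript
// Hypothetical helper: the name and return labels are illustrative only.
function probeSerialPortForget(win) {
  // Web Serial is only exposed in secure contexts, so on plain HTTP
  // a missing SerialPort says nothing about the browser version.
  if (!win.isSecureContext) return 'insecure-context';

  const SerialPort = win.SerialPort;
  if (typeof SerialPort !== 'function') return 'no-serial-api';

  // forget() is reported in this thread as Chromium 103+ only.
  const hasForget = Object.prototype.hasOwnProperty.call(SerialPort.prototype, 'forget');
  return hasForget ? 'forget-present' : 'forget-absent';
}

// In a browser: probeSerialPortForget(window)
```

Keeping the insecure case separate avoids treating every HTTP visitor as an old or mobile browser.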
I understand what you meant. Are you on Telegram or Discord? I'm looking for a friendly person to research this with, out of curiosity.
@vis2021t That's cool you are interested in this stuff. You might like this manuscript and paper. A lot of research there.
I am mostly active on GitHub. Feel free to reach out to me either in this repo or anything open source.
Thanks, buddy. I ran your GitHub-hosted site through Google's mobile-friendly check, and it was easily detected as lying. I noticed it was specifically the worker checks where it got caught.
A few images I took:
Just FYI, I will add new techniques to this issue as I find them while researching trackers. You can implement them or dump them; it is your repo. I just want to share some info :)
new techniques
That is awesome. New techniques are always appreciated. I will check them out.
Google mobile web friendly check
Looks like Linux emulating Android. Definitely a good Googlebot. TLS fingerprinting is a good way to determine if behavior like that is a real Googlebot. Very interesting topic.
Indeed, I will be researching real-world bots and their detection more. I have noticed one thing in common: the worker scope lie is where these bots get caught, because they pretend to be a real device instead of an emulated one. Do you have any resources on TLS fingerprinting? As you mentioned, I assume it can be a great help in this area.
worker scope
Worker scope in Chrome detects a lot too. In Chrome, device emulation alters the dedicated worker scope, but not shared or service worker scopes. Service workers have better support and run faster, so we can detect a lot there, but some bots try to fudge values in service workers too... requires JS tampering, so we try to detect that. If workers are blocked, that can be a helper in generating a more unique fingerprint. Blocking often creates a better fingerprint when few browsers are doing the same dance.
As far as I know, in Firefox, the device emulator does not affect the worker scope. But, ideally we want to use the dedicated worker scope since service workers can be disabled by a preference. The current method I use is to try service, then shared, and lastly dedicated. But, if we have to use dedicated, Chrome could emulate the user agent and platform. Regardless of the worker scope type, we try to validate the values and look for tampering.
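A rough sketch of the two ideas above: pick a worker type in the order service, shared, dedicated, then compare what the worker reports against the window. The helper names `pickWorkerScope` and `findScopeMismatches` and the default key list are assumptions for illustration, not the repo's actual code. Spawning the worker and collecting its values is left out.

```javascript
// Hypothetical helpers; the names are illustrative and not taken from the repo.

// Pick the first available worker type: service, then shared, then dedicated.
function pickWorkerScope(win) {
  const nav = win.navigator || {};
  if ('serviceWorker' in nav) return 'service';
  if (typeof win.SharedWorker === 'function') return 'shared';
  if (typeof win.Worker === 'function') return 'dedicated';
  return null; // workers blocked or unsupported
}

// Compare values read in the window with the same values reported by a worker.
// Any key that differs hints at emulation or tampering.
function findScopeMismatches(
  windowValues,
  workerValues,
  keys = ['userAgent', 'platform', 'hardwareConcurrency', 'language']
) {
  return keys.filter(
    (key) => key in workerValues && windowValues[key] !== workerValues[key]
  );
}
```

For example, a window that claims an Android platform while the worker reports `Linux x86_64` would come back as `['platform']`. A `null` scope is itself a data point, since blocked workers can make a fingerprint more unique.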
Hi guys, sorry for being inactive. Haha, looks like I missed some cool stuff.
Can you confirm one thing for me: are all industry bots based on WebDriver?
For example, Googlebot, the Yahoo bot, etc.
I'm not sure on the industry standard, but I imagine Google and Yahoo could be using both WebDriver and command line scripts.
Would you mind elaborating on the CLI scripts? I'm quite unclear on that part. Also, I think you are right about them using WebDriver.
I found many useful bugs that deserve a deeper look, and I will share them as soon as they are in good enough shape to present.
cli scripts
Here are a few interesting concepts
https://github.com/gocolly/colly/issues/4#issuecomment-334728237
Hey, take a look at these articles, fingerprinting is a science, haha https://incolumitas.com/pages/TCP-IP-Fingerprint/ https://incolumitas.com/pages/TLS-Fingerprint/
Very nice. I was just recently looking at the repo too. Didn't see the demo page till now.
Following your updates in the post too. The ttl-ping looks very interesting.
HTML entities/unicodes are a treasure. They also render using OS fonts and if measured are unique to OS versions.
Added performance.memory, storage quota, battery info, and network info to the status section. I might re-add the keyboard API. It was a performance hog and entropy seemed low.
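As a rough illustration of what such a status collector could look like (this is a sketch, not the code in the repo), the function below reads the four sources and returns `null` for anything the browser does not expose. The name `collectStatus` and the shape of the returned object are assumptions.

```javascript
// Hypothetical sketch; field names are illustrative.
// Each value is null when the underlying API is missing or fails.
async function collectStatus(win) {
  const nav = win.navigator || {};
  const memory = win.performance && win.performance.memory; // non-standard, Chromium only
  const connection = nav.connection;

  let quota = null;
  if (nav.storage && typeof nav.storage.estimate === 'function') {
    try {
      const estimate = await nav.storage.estimate();
      quota = typeof estimate.quota === 'number' ? estimate.quota : null;
    } catch (err) {
      quota = null; // estimate() can reject, e.g. when storage is blocked
    }
  }

  let battery = null;
  if (typeof nav.getBattery === 'function') {
    try {
      const manager = await nav.getBattery();
      battery = { charging: manager.charging, level: manager.level };
    } catch (err) {
      battery = null;
    }
  }

  return {
    jsHeapSizeLimit: memory ? memory.jsHeapSizeLimit : null,
    quota,
    battery,
    network: connection
      ? {
          effectiveType: connection.effectiveType,
          downlink: connection.downlink,
          rtt: connection.rtt,
        }
      : null,
  };
}

// In a browser: collectStatus(window).then(console.log)
```

These values change during a session, so they suit session-level comparison better than a long-lived identifier.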
It's on my mind to dissect Selenium.
I have some browser engine and version detection concepts, but I'm not sure it's necessary, since this is revealed in so many areas. It's pretty much impossible to hide the engine and version without causing a mountain of site breakage. Check out https://arkenfox.github.io/TZP/tests/engine.html.
SerialPort, matchMedia, allowedFeatures, etc: These are good detections. Currently, we can detect Chrome 103, 104, etc. and all versions with CSS and window features. It's okay if the detection is not exact, since we should allow some mild discrepancy: users can test features or disable them.
canShare: can be used, but there are some false positives. For example, headless on Windows and headful on Linux have it disabled, and there is a ticket to support it on Mac (I'm not sure if it will get resolved).
selenium: will revisit at some point. Looking for uncommon detections.
quota, battery and performance.memory: added to status section. webkitTemporaryStorage is deprecated as of Chrome 106, so I used the Storage API as a default.
ping: very interesting. Similar to TLS fingerprinting, this gets more into server-side analysis. I'm not sure if I will put focus on that for this project. More interested in front end fingerprinting.
ip blacklist: this should not be necessary, since the goal is not to create a fingerprinting library for fraud prevention. Rather, the goal is to promote research and education on browser fingerprinting. All network IPs are welcome, but I do have an API rate-limiting algorithm to auto timeout networks that generate too many requests in a given hour.
Closing for now, but I will revisit some of these techniques.
I don't have enough data on platform/device differences for these APIs, but I know they are used to deeply fingerprint the client. Please take a look; I hope to see them implemented (if they are not already):
window.SerialPort.prototype.hasOwnProperty('forget') - forget method only in Chromium 103+
window.matchMedia('(blah blah)').media === 'not all' - false for media-queries level 4 (used in Chromium 104+)
window.matchMedia('(color-gamut: p3)').matches and window.matchMedia('(color-gamut: rec2020)').matches - support of modern color gamuts
window.cdc_%randomHash%_Array (or _Promise, or _Symbol) - property that can be found in unpatched, Selenium-controlled chromedriver (see https://github.com/ultrafunkamsterdam/undetected-chromedriver)
await navigator.getBattery() - Battery API
document.featurePolicy.allowedFeatures()
versions 1.6 and 1.7 were deprecated by spec, but can still be found
window.console.memory.jsHeapSizeLimit - may differ between users
await navigator.keyboard.getLayoutMap() - use .entries() or .keys() for hashing
navigator.webkitTemporaryStorage.queryUsageAndQuota((usage,quota)=>{console.log(usage,quota)})
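To make a few of the items above concrete, here is a small sketch that bundles the synchronous probes into one call. The function name `probeListedFeatures`, the result keys, and the exact `cdc_` pattern are assumptions for illustration. The global object is passed in so the function can be tried with a stand-in object.

```javascript
// Hypothetical helper; names are illustrative. `win` is the global object to inspect.
function probeListedFeatures(win) {
  const hasMatchMedia = typeof win.matchMedia === 'function';
  const query = (q) => win.matchMedia(q);

  // Own properties such as cdc_<hash>_Array left behind by unpatched chromedriver.
  // The hash is assumed to be alphanumeric.
  const cdcPattern = /^cdc_[a-z0-9]+_(Array|Promise|Symbol)$/i;
  const cdcKeys = Object.getOwnPropertyNames(win).filter((key) => cdcPattern.test(key));

  return {
    // How an unknown media query serializes; reported above as changing in Chromium 104+.
    unknownQueryIsNotAll: hasMatchMedia ? query('(blah blah)').media === 'not all' : null,
    gamutP3: hasMatchMedia ? query('(color-gamut: p3)').matches : null,
    gamutRec2020: hasMatchMedia ? query('(color-gamut: rec2020)').matches : null,
    cdcKeys,
  };
}

// In a browser: probeListedFeatures(window)
```

The asynchronous items (battery, keyboard layout map, storage quota) are left out here because they need `await` and permission handling.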