Open vis2021t opened 2 years ago
I mainly look at the Chromium source, but not as much as I should. It depends on what type of task I am facing. Recently, I was looking for documentation on why or when the WebGL renderer string stopped reporting the graphics driver version information. I noticed a likely fake result that continued to report the version, but I'm confident that Chrome no longer includes it.
Chromium research
Hmm, understood.
I'll be looking for more core points for detection for now, as it's kinda fun for me.
Hey buddy, I think we should increase the options for platform detection in headless.
I'm talking about Arial.
I used 3 bots to check on CreepJS and cloned the resulting HTML.
All said 100 on Linux and 100 on Windows. I'm sure they were Linux headless, as the shared worker said so in another test on /fpworker. I mean, the most I can conclude is whether it's Android or PC,
but not whether it's Windows or Linux, because both score 100 for Arial.
There might be a Web API we can use to distinguish Linux from Windows. As far as I know, on Chrome, Windows typically uses Arial and Segoe UI, but this pair is not exclusive to Windows. There are a few key features that set Chrome OS, macOS, and Windows/Linux apart from each other. The hints are essentially feature detection under the hood. However, these can be easily spoofed.
We can expect these to be faked by clever scripts, and can use this as a trap to catch them. If a script attempts to emulate Android features on Desktop, it will create a better fingerprint by causing a unique window hash with an unusual re-ordering of properties. In that case, we may lose out on a useful platform hint, but we will have identified suspicious activity.
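A rough sketch of the "window hash" idea, assuming the hash is taken over the own property names of the global object in enumeration order. The FNV-1a hash and the two stand-in objects are illustrative choices only, not what CreepJS actually does; in a browser you would pass `window` instead.

```typescript
// FNV-1a (32-bit) over a string. Picked only because it is tiny.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, then wrap to 32 bits.
    hash =
      (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Hash the own property names of a global-like object, order included.
function propertyOrderHash(target: object): string {
  return fnv1a(Object.getOwnPropertyNames(target).join(","));
}

// Hypothetical stand-ins for a real window object.
const desktopLike = { chrome: 1, onmousemove: null, onkeydown: null };
const spoofedLike = { chrome: 1, onkeydown: null, ontouchstart: null, onmousemove: null };

console.log(propertyOrderHash(desktopLike), propertyOrderHash(spoofedLike));
```

A script that injects Android-only properties on desktop changes the name list (and usually the order), so the hash lands in a bucket that few real browsers share.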
These features are subject to change, too, so we can't rely on them too heavily. In some cases, it's all right if we are not aware of the real platform. Ultimately, we just need a few unique identifiers that can tell apart unusual web traffic from normal traffic. There are many subtle fingerprints that get overlooked. CSS match media, for example, can identify devices with no mouse or touch input (keyboard-only controls).
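For the match media point, a minimal sketch might look like this. It relies on the standard `any-pointer` and `any-hover` media features; the `MediaMatcher` type and `inputProfile` name are made up for the example, and the matcher is injected so the logic can run outside a browser.

```typescript
// Minimal shape of window.matchMedia that this sketch needs.
type MediaMatcher = (query: string) => { matches: boolean };

interface InputProfile {
  anyPointer: "fine" | "coarse" | "none";
  anyHover: boolean;
  keyboardOnly: boolean;
}

function inputProfile(matchMedia: MediaMatcher): InputProfile {
  const anyFine = matchMedia("(any-pointer: fine)").matches;
  const anyCoarse = matchMedia("(any-pointer: coarse)").matches;
  const anyHover = matchMedia("(any-hover: hover)").matches;
  const anyPointer: "fine" | "coarse" | "none" = anyFine
    ? "fine"
    : anyCoarse
    ? "coarse"
    : "none";
  // No pointing device and nothing that can hover: likely keyboard-only.
  return { anyPointer, anyHover, keyboardOnly: anyPointer === "none" && !anyHover };
}

// In a browser: inputProfile((query) => window.matchMedia(query))
```

A device that reports no pointer at all is unusual for normal traffic, so it is a cheap extra signal, though like the other hints it can be spoofed.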
Hi buddy, I was busy for a while. I'll be coming back to research from tomorrow.
Found this https://nullpt.rs/author/veritas. Interesting articles.
Really interesting, I agree.
Hmm, I was using a famous plugin, "Dark Reader".
It adds an attribute to the html element :-
And yeah, sorry, I was busy with some work. I'm free now.
It's all good.
Dark Reader is great. That's a good detection, too. It can be a human indicator if it is on. Something like this, maybe.
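A minimal sketch of that check, assuming Dark Reader marks the root element with `data-darkreader-mode` and `data-darkreader-scheme` attributes (the attribute names are my assumption here and should be confirmed against a live page). `AttributeReader` and `darkReaderState` are hypothetical names; in a browser you would pass `document.documentElement`.

```typescript
// Anything with getAttribute works, e.g. document.documentElement.
interface AttributeReader {
  getAttribute(name: string): string | null;
}

interface DarkReaderState {
  active: boolean;
  mode: string | null;
  scheme: string | null;
}

function darkReaderState(root: AttributeReader): DarkReaderState {
  // Assumed attribute names; adjust if the extension uses different ones.
  const mode = root.getAttribute("data-darkreader-mode");
  const scheme = root.getAttribute("data-darkreader-scheme");
  return { active: mode !== null || scheme !== null, mode, scheme };
}

// In a browser: darkReaderState(document.documentElement)
```

Since bots rarely run with browser extensions installed, `active: true` leans human, but it should only ever be a weak positive hint.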
True
I use Dark Reader all the time. I was working on a website, so I saw it while debugging haha. I'll look for more interesting plugins that may leak things into the document, etc.
Hi, I was looking around Gmail and saw that they are able to detect a secure or a suspicious browser, somewhat like we do in CreepJS. But I am curious about their mechanism. I saw that after we enter the Gmail address, there is a detection script that checks whether the browser is OK or not (including bot detection). It's always good to take inspiration haha
Wanna explore together?
Sure, I imagine they use UA client hints to detect unseen devices and then warn the backup email about an unknown device logging in to the account. The difficulty is de-obfuscating their code. This repo has a lot we can also look at.
Sure, finally you're back haha, kinda missed this.
Anyway, I think Gmail uses something more complex.
Even puppeteer stealth can't get through the login, even in normal (non-headless) mode with the same user agent etc. and no "headless" written in it.
I think that's why I want us to see what's interesting there.
When you were inactive, I was learning about dev tools detection from this repo :- https://github.com/AEPKILL/devtools-detector and I tested it. It works smoothly with detections,
but for now I'm really more interested in the Gmail detection
because of the above reason.
That's why I got interested. Maybe there is something more we could learn? Who knows.
Damn, that repo. I can sense some awesome things right there.
Is this obfuscator absolutely foolproof? No, while it's impossible to recover the exact original source code, someone with the time, knowledge and patience can reverse-engineer it.
Since the JavaScript runs on the browser, the browser's JavaScript engine must be able to read and interpret it, so there's no way to prevent that. And any tool that promises that is not being honest.
-- mentioned in https://obfuscator.io/#FAQ
One of the best obfuscators I have seen so far.
lol this is exactly what we needed
devtools-detector
Nice. I ran into that recently. That's a good detection.
Good points. The Googlebot code looks like a challenge. I can see it collects the error stack here.
Agreed. I am working on a small project right now which requires me to use ejs and express and a CDN build of maybe Vue.js, React Native or any front-end framework.
I literally learned all 3 (Vue, Angular and React) within 5 days. You can imagine it's been a mind-blowing week for me. Vue.js and React meet my requirements. I will be done with the work the day after tomorrow
and will probably start looking into the Googlebot one the day after tomorrow.
haaah ~ sigh in tiredness ~
Hi, I'm done with my project.
Let's research 💝
I'm gonna look at the Google BotGuard. Any information you discovered, maybe?
I found something. I even opened an issue as research, and I noticed the owner is kinda active too.
This is the latest code of the Google BotGuard reverse-engineering attempt:-
https://github.com/icetroll/botguard-RE
we can learn from here
Nice. That is a lot of code. I think it has to do with behavioral fingerprints. I see a few event listeners connected to DOM elements.
I've been researching ways to detect Selenium and found some interesting leaks. Fascinating article here. Those values seem to be manipulated by different bots, but the object prototype contains unique keys that are important to the internal code. I haven't tested it, but I think it's possible to override those functions with eval code and use them to get internal values.
Naughty Eval
Very well, I see now. Can it also be referred to as an info disclosure, if it works as we expect it to? I am looking into the Googlebot code pattern detection (it is interesting but really nested), and also looking at the previous Googlebot code challenge.
The prototype functions might only reveal Selenium code and possibly different versions of the code.
That too would be really interesting for CreepJS. Maybe even a sure bot detection haha.
Right now I am giving names to the BotGuard code to understand how it works.
I have understood quite a lot about Google BotGuard. I will give you a proper summary here. It is interesting, ngl.
Any update on your research?
Nothing yet. But, a lot on my mind. I think the storage bytes are an incredible high entropy fingerprint in Chrome. It depends on the machine and what it's used for, but if there are no changes in storage, the fingerprints can categorize a machine in 1 trillion possible fingerprints (to put it lightly). In private tabs, chromium reduces entropy (unstable per session and low bytes available).
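A small sketch of how the storage bytes could be turned into a signal, assuming the values come from `navigator.storage.estimate()`. The `quotaSignal` name and the 1 GiB "low quota" cutoff are hypothetical and only there to illustrate the private-tab case; the real cutoff would need to be tuned against observed data.

```typescript
// Shape of the object resolved by navigator.storage.estimate().
interface StorageEstimateLike {
  quota?: number;
  usage?: number;
}

interface QuotaSignal {
  quotaBytes: number;
  usageBytes: number;
  lowQuota: boolean;
}

// Hypothetical cutoff for "low bytes available"; not a documented Chromium value.
const LOW_QUOTA_BYTES = 1024 * 1024 * 1024;

function quotaSignal(
  estimate: StorageEstimateLike,
  lowQuotaBytes: number = LOW_QUOTA_BYTES
): QuotaSignal {
  const quotaBytes = typeof estimate.quota === "number" ? estimate.quota : 0;
  const usageBytes = typeof estimate.usage === "number" ? estimate.usage : 0;
  // The raw quota is the high-entropy part; lowQuota hints at a private session.
  return { quotaBytes, usageBytes, lowQuota: quotaBytes < lowQuotaBytes };
}

// In a browser:
// navigator.storage.estimate().then((estimate) => console.log(quotaSignal(estimate)));
```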
Unrelated, I have this idea I might experiment with at some point. It's essentially a soft/superfast fingerprinting (less than 10ms and mostly low entropy), then it progressively slows down and expands into high entropy if anomalous hashes are detected. The idea is to make bad fingerprints move more slowly and good fingerprints move more quickly.
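Here is one way the progressive idea could be sketched: run cheap collectors first and only move on to the expensive stages while the signals gathered so far look anomalous. Everything here (`Stage`, `progressiveFingerprint`, the example collectors and the anomaly rule) is hypothetical and only illustrates the control flow.

```typescript
type Collector = () => string;

interface Stage {
  name: string;
  collectors: Collector[];
}

interface ProgressiveResult {
  signals: string[];
  stagesRun: string[];
}

function progressiveFingerprint(
  stages: Stage[],
  isAnomalous: (signals: string[]) => boolean
): ProgressiveResult {
  const signals: string[] = [];
  const stagesRun: string[] = [];
  for (const stage of stages) {
    stagesRun.push(stage.name);
    for (const collect of stage.collectors) {
      signals.push(collect());
    }
    // Good fingerprints exit early; suspicious ones fall through to slower stages.
    if (!isAnomalous(signals)) {
      break;
    }
  }
  return { signals, stagesRun };
}

// Hypothetical stages: a fast low-entropy pass, then a slow high-entropy pass.
const stages: Stage[] = [
  { name: "fast", collectors: [() => "platform:Linux", () => "ua:HeadlessChrome"] },
  { name: "slow", collectors: [() => "canvas:deadbeef"] },
];
const result = progressiveFingerprint(stages, (signals) =>
  signals.some((signal) => signal.indexOf("Headless") !== -1)
);
console.log(result.stagesRun);
```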
I looked over how Gmail currently works and found that they monitor and use the Performance API very well, which I hadn't thought of. I am exploring more, but I saw the new v3 is
I am exploring other anti-bot and behavior-monitoring systems to expand CreepJS. Right now I am looking at this:-
Looking at their website, it's interesting how they use APIs and clever JavaScript (what is more interesting is that a comment in their code says it was written in 2016, which is quite fascinating if true). So far I am noting which APIs they use, and I will summarize things here as I go.
I am resuming my research summary from today. Let's see if I can put up some interesting points.
about chromedriver detection, check https://github.com/HMaker/HMaker.github.io/tree/master/selenium-detector
most of the tests can be easily bypassed by patching chromedriver src though.
chromedriver detection
Very nice detection there. Can these functions be patched or removed? The function names can be modified, but wouldn't the prototype still leak the names?
You can also change the prototype completely, and you could make chromedriver store that on window instead of document.
chromedriver is just a CDP wrapper, but it sits at a higher level of the Chromium architecture, so it uses the page's global JS state to store automation-related vars.
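To make the leak concrete, here is a rough sketch that walks an object and its prototype chain looking for chromedriver-style keys. The `cdc_` pattern reflects key names that are commonly reported for stock chromedriver builds (an assumption on my part, and exactly the thing a patched driver would rename), and `findAutomationKeys` is a made-up helper name.

```typescript
// Commonly reported stock chromedriver key shape, e.g. "$cdc_..._" or "cdc_..._Array".
// Assumption: a patched chromedriver will not match this pattern.
const CDC_PATTERN = /^\$?cdc_[A-Za-z0-9]+_/;

function findAutomationKeys(target: object): string[] {
  const hits: string[] = [];
  let current: object | null = target;
  // Walk the object and every prototype, since the names may live on either.
  while (current !== null) {
    for (const key of Object.getOwnPropertyNames(current)) {
      if (CDC_PATTERN.test(key) && hits.indexOf(key) === -1) {
        hits.push(key);
      }
    }
    current = Object.getPrototypeOf(current);
  }
  return hits;
}

// In a browser: findAutomationKeys(document).concat(findAutomationKeys(window))
```

As noted above, this only catches unpatched drivers, so an empty result says nothing about whether automation is present.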
Gmail stuff summary
They use proxy detection (mostly based on the Performance API), and workers are their focus, just like we have here. They also have a few basic feature detections, plus error detection, buckets, etc. The rest is just code they made lengthy.
Chromedriver Detector
Detected!
about chromedriver detection
Get a hug, dude lol. It's a good repo, great effort, really loved it.
I was thinking of challenging myself against the CreepJS techniques hehe
I looked over the TLS fingerprinting you talked about, but I read some Akamai research stating that bots are able to tamper with it to get on the good side :- https://www.akamai.com/blog/security/bots-tampering-with-tls-to-avoid-detection
I came across a 2-step TLS fingerprinting approach, but I lost that PDF 🥲🥲 dammit
I will try to find it, but do you know about it?