cloudflare / mitmengine

A MITM (monster-in-the-middle) detection tool. Used to build MALCOLM:
https://malcolm.cloudflare.com
BSD 3-Clause "New" or "Revised" License

Consider implementing JA3 #14

Open atkinsj opened 5 years ago

atkinsj commented 5 years ago

Hey folks,

It sounds like you're collecting all the same information as JA3. If you're looking for fingerprint databases, implementing JA3 may give you a leg-up in both the public and private communities; for example, this list of macOS and Linux applications correlated to their TLS Client Hellos.

lukevalenta commented 5 years ago

Absolutely! I did look at JA3 when coming up with the signature/fingerprint formats, but wanted to go with something that allowed for fuzzy matching more easily and could incorporate all of the logic used in the original research project (e.g., allowing a fingerprint to match a signature even if some cipher suites are out of order or omitted). Is there a good solution for this with JA3?
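To make the fuzzy-matching idea concrete, here is a minimal sketch of a signature that tolerates reordered or omitted cipher suites. The `CipherSignature` class and its `required`/`optional` fields are hypothetical names for illustration only; this is not MITMEngine's actual signature format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CipherSignature:
    """Hypothetical fuzzy signature over cipher suite IDs (not MITMEngine's format)."""

    required: frozenset              # suites that must appear in the Client Hello
    optional: frozenset = frozenset()  # suites that may be present or omitted

    def matches(self, observed):
        """Return True if the observed cipher list fits, ignoring order."""
        seen = set(observed)
        has_all_required = self.required <= seen
        has_nothing_unexpected = seen <= (self.required | self.optional)
        return has_all_required and has_nothing_unexpected


sig = CipherSignature(
    required=frozenset({0x1301, 0x1302}),
    optional=frozenset({0xC02B}),
)
print(sig.matches([0x1302, 0x1301]))          # reordered suites
print(sig.matches([0x1301, 0x1302, 0xC02B]))  # optional suite present
print(sig.matches([0x1301, 0x1302, 0x0005]))  # unexpected suite
```

Treating the cipher list as a set is what makes ordering irrelevant here; a signature that cares about order for some suites would need a richer structure.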

When I have some more time on my hands, improving the fingerprint and signature formats is a priority, and suggestions are welcome.

Regardless, being able to import JA3 fingerprints would be nice and shouldn't be too hard to implement. I'll leave this issue open until I or someone else has the time to work on it.

atkinsj commented 5 years ago

Yeah, there's no fuzzy matching with JA3, which is not the greatest in hindsight. MD5 was chosen so it could be fed into common IOC and vendor feeds, since they all already understand it. What's the use case for client hellos where the ciphers are the same but in a different order? Doesn't this fundamentally represent two different clients? This was our thinking with JA3; I'm curious to hear if you know of software that randomises the order of the same cipher set.
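For reference, a rough sketch of how a JA3 digest is built: the decimal values of five Client Hello fields (TLS version, cipher suites, extensions, elliptic curves, EC point formats) are joined with `-` inside each field and `,` between fields, and the resulting string is MD5-hashed. The function names and the sample values below are illustrative assumptions, not taken from any particular JA3 implementation.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the comma-separated JA3 string from decimal field values."""
    lists = (ciphers, extensions, curves, point_formats)
    fields = [str(version)] + ["-".join(str(v) for v in vals) for vals in lists]
    return ",".join(fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Illustrative values only.
a = ja3_hash(769, [47, 53, 5], [0, 10, 11], [23, 24, 25], [0])
b = ja3_hash(769, [53, 47, 5], [0, 10, 11], [23, 24, 25], [0])
print(a != b)  # same cipher set, different order, different digest
```

Because the digest covers the exact string, any reordering of the same cipher set yields an unrelated hash, which is the exact-match behaviour described above.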

If I get time I'll take a look at writing an import function as well. I think there's a lot of potential benefit because we've done a heap of work joining JA3 with process event logs (OSQuery on macOS/Linux, Sysmon on Windows) which means we can go JA3->Process on a host. This is a distinctly different use case than what you're trying to achieve -- looking for MITM proxies -- but could help with software categorisation with on-host proxies (e.g., TrickBot) for you.

lukevalenta commented 5 years ago

I think you're right that a particular client will always generate the same TLS fingerprint (as long as the fingerprint ignores things like GREASE ciphers), but a particular user agent can be produced by a lot of different clients. For example, Chrome changes the order of preferred cipher suites depending on the presence of AES-NI, so you might get several different TLS fingerprints for a particular Chrome user agent depending on the client's available hardware.
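As an aside on ignoring GREASE: the reserved GREASE values (RFC 8701) are the sixteen 16-bit values 0x0A0A, 0x1A1A, ..., 0xFAFA, so they can be dropped before a fingerprint is computed. A minimal sketch, with hypothetical helper names:

```python
def is_grease(value):
    """True for the 16 reserved GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    both_low_nibbles_are_a = (value & 0x0F0F) == 0x0A0A
    bytes_are_equal = (value >> 8) == (value & 0xFF)
    return both_low_nibbles_are_a and bytes_are_equal


def strip_grease(values):
    """Drop GREASE entries while keeping the remaining order intact."""
    return [v for v in values if not is_grease(v)]


# A Client Hello cipher list with a GREASE value in front.
print(strip_grease([0x2A2A, 0x1301, 0x1302]))
```

The same filter applies to extension and supported-group lists, since GREASE values can appear in those fields as well.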

It looks like the two main approaches are 1) build a small number of signatures (fuzzy fingerprints) for each user agent by generalizing observed fingerprints or manually inspecting software (the MITMEngine approach), or 2) collect many fingerprints for each user agent and perform exact matching against incoming requests (the JA3 approach).

However, since the approaches are complementary, we could first match an incoming request fingerprint against known JA3 fingerprints, and then check whether it matches any known signatures.
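A rough sketch of that two-stage lookup, assuming a dictionary of imported JA3 digests and a list of fuzzy signature predicates. The `classify` function, the data shapes, and the sample digest and labels are all hypothetical; nothing here reflects MITMEngine's actual API.

```python
def classify(ja3_digest, ciphers, known_ja3, signatures):
    """Try an exact JA3 lookup first, then fall back to fuzzy signatures.

    known_ja3:  dict mapping JA3 digest to a label (hypothetical imported database)
    signatures: list of (label, predicate) pairs; predicate takes the cipher list
    """
    label = known_ja3.get(ja3_digest)
    if label is not None:
        return label, "exact"
    for label, predicate in signatures:
        if predicate(ciphers):
            return label, "fuzzy"
    return None, "unknown"


# Made-up digest and labels, for illustration only.
known = {"0123456789abcdef0123456789abcdef": "example-client"}
sigs = [("example-browser", lambda c: {0x1301, 0x1302} <= set(c))]

print(classify("0123456789abcdef0123456789abcdef", [], known, sigs))
print(classify("ffffffffffffffffffffffffffffffff", [0x1302, 0x1301], known, sigs))
print(classify("ffffffffffffffffffffffffffffffff", [0x0005], known, sigs))
```

Checking the exact database first keeps the common case to a single dictionary lookup, and the fuzzy pass only runs for fingerprints that have not been seen before.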

It would be really nice to import JA3 fingerprints. Keep me posted if you find the time to work on this, and I'll do the same!