FoxIO-LLC / ja4

JA4+ is a suite of network fingerprinting standards
https://foxio.io

Missing JA4 fingerprints in output #136

Open elpy1 opened 4 months ago

elpy1 commented 4 months ago

Hi :wave:. While working on a personal project that implements JA4, I noticed some discrepancies in the JA4 (TCP) fingerprint output for some of the TLS PCAP files in your repo.

For example, I get the following TLS fingerprints from tls-handshake.pcapng:

$ python pcap.py --file ~/git/ext/ja4/pcap/tls-handshake.pcapng | sort | uniq -c | sort -nr
     54 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

With ja4.py I get:

$ python ja4.py --ja4 ~/git/ext/ja4/pcap/tls-handshake.pcapng | grep -E -o 't\w{9}_\w{12}_\w{12}' | sort | uniq -c | sort -nr
     49 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

With tshark (TShark (Wireshark) 4.2.6, Git commit fca52ffc018f) I get:

$ tshark -r  ~/git/ext/ja4/pcap/tls-handshake.pcapng -Y 'tls.handshake.type == 1' -Tfields -e 'tls.handshake.ja4' | grep '^t' | sort | uniq -c | sort -nr
     54 t13d1516h2_8daaf6152771_e5627efa2ab1
      5 t13d1515h2_8daaf6152771_f37e75b10bcc
      3 t13d1516h1_8daaf6152771_e5627efa2ab1
      1 t13d1517h1_8daaf6152771_6cdcb247c39b
      1 t13d151400_8daaf6152771_de4a06bb82e3

Looking into this a bit further, I realised that the caching functionality in common.py is keyed on streams. So if a stream contains more than one fingerprint, does the earlier one get overwritten in the cache? Example stream: [image]
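To illustrate the suspected behaviour, here is a minimal sketch. It is not the actual code in common.py: it assumes a plain dict keyed by stream number, and the `cache_update` helper, the stream and frame numbers, and the record layout are all hypothetical.

```python
# Hypothetical stand-in for a stream-keyed cache; NOT the real common.py code.
cache = {}

def cache_update(stream, frame, fingerprint):
    # Keyed on the stream alone, so a later ClientHello in the same
    # stream replaces the record stored for an earlier one.
    cache[stream] = {"stream": stream, "frame": frame, "JA4": fingerprint}

fp = "t13d1516h2_8daaf6152771_e5627efa2ab1"

# Two ClientHellos seen in the same stream (stream/frame numbers are made up).
cache_update(7, 12, fp)
cache_update(7, 48, fp)

print(len(cache))         # 1: only one record survives for stream 7
print(cache[7]["frame"])  # 48: the first ClientHello's record was overwritten
```

If this is what happens, each extra ClientHello in a stream would be dropped from the output, which would be consistent with ja4.py reporting fewer fingerprints than tshark for the same capture.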

I was able to resolve this locally by hacking together a change that uses a (stream, frame number) tuple as the cache key, but this probably isn't suitable because it produces multiple outputs per stream instead of multiple fingerprints inside a single stream output.
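A rough sketch of that workaround and its side effect follows. It assumes a dict-based cache; the `cache_update` helper, the record layout, and the stream and frame numbers are hypothetical and are not taken from common.py.

```python
# Hypothetical cache keyed on (stream, frame); NOT the real common.py code.
cache = {}

def cache_update(stream, frame, fingerprint):
    # A (stream, frame) key keeps every ClientHello, even within one stream.
    cache[(stream, frame)] = {"stream": stream, "frame": frame, "JA4": fingerprint}

fp = "t13d1516h2_8daaf6152771_e5627efa2ab1"

# Two ClientHellos seen in the same stream (stream/frame numbers are made up).
cache_update(7, 12, fp)
cache_update(7, 48, fp)

# Side effect: stream 7 now yields two separate output records.
for key in sorted(cache):
    print(cache[key])
```

No fingerprint is lost, but anything that expects exactly one record per stream now sees stream 7 twice.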

john-althouse commented 3 months ago

Thanks for bringing this up! I think we should add any additional JA4s seen in a stream to the output as JA4.2, etc., like we do with JA4X. Would that work?
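One way this proposal could look is sketched below. It assumes the per-stream record is a dict; the `add_ja4` helper is hypothetical and the exact key names (`JA4`, `JA4.2`, `JA4.3`, ...) are a guess based on the comment above, not existing ja4.py behaviour.

```python
def add_ja4(entry, fingerprint):
    """Store a fingerprint in a per-stream record without overwriting.

    Hypothetical helper: the first fingerprint goes under "JA4",
    later ones under "JA4.2", "JA4.3", and so on.
    """
    if "JA4" not in entry:
        entry["JA4"] = fingerprint
        return
    n = 2
    while f"JA4.{n}" in entry:
        n += 1
    entry[f"JA4.{n}"] = fingerprint

fp = "t13d1516h2_8daaf6152771_e5627efa2ab1"

entry = {"stream": 7}  # made-up stream number
add_ja4(entry, fp)     # first ClientHello in the stream
add_ja4(entry, fp)     # second ClientHello in the same stream

print(list(entry))     # ['stream', 'JA4', 'JA4.2']
```

This keeps one record per stream, and a grep for the fingerprint pattern would still match every occurrence.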

elpy1 commented 3 months ago

Considering the core functionality currently involves extracting fingerprints from each stream, that makes sense to me.

I'm simply grepping for the JA4 pattern, so it doesn't matter where it is in the output for my use-case. Thanks.