VirusTotal / yara

The pattern matching swiss knife
https://virustotal.github.io/yara/
BSD 3-Clause "New" or "Revised" License

Index of Coincidence #1907

Open BitsOfBinary opened 1 year ago

BitsOfBinary commented 1 year ago

The index of coincidence "provides a measure of how likely it is to draw two matching letters by randomly selecting two letters from a given text" (reference: https://en.wikipedia.org/wiki/Index_of_coincidence).

Since this index is known to be around 0.067 for English text, it may be a useful indicator of plaintext strings. An example usage can be seen here: https://gchq.github.io/CyberChef/#recipe=Index_of_Coincidence()&input=VGhpcyBhIHRlc3QgZm9yIHRoZSBpbmRpY2F0b3Igb2YgY29pbmNpZGVuY2U
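For readers unfamiliar with the measure, here is a minimal Python sketch of the usual unnormalized definition, IC = Σ nᵢ(nᵢ − 1) / (N(N − 1)), where nᵢ is the count of each letter and N is the total number of letters. This is not the code from this PR: the function name `index_of_coincidence` is hypothetical, and it assumes that only ASCII letters are counted, case-insensitively, which may differ from how the math module function treats raw bytes.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Unnormalized index of coincidence over the ASCII letters in text.

    Assumption: non-letters are ignored and case is folded. A byte-oriented
    implementation would count all 256 byte values instead.
    """
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    n = len(letters)
    if n < 2:
        # Two letters cannot be drawn, so the measure is undefined; return 0.
        return 0.0
    counts = Counter(letters)
    # Probability that two letters drawn without replacement are equal.
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


if __name__ == "__main__":
    print(index_of_coincidence("aaaa"))  # every pair matches
    print(index_of_coincidence("abcd"))  # no pair matches
```

Very short inputs give noisy values, so a threshold near 0.067 is likely only meaningful for strings of a reasonable length.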

I've added this function to the math module in two forms: one that operates on data at a given offset and one that operates on a string value. I've also added documentation and tests for it.

BitsOfBinary commented 1 year ago

Updated for compatibility with YARA v4.3.2; tests passing as expected.