airbus-seclab / cpu_rec

Recognize cpu instructions in an arbitrary binary file
Apache License 2.0

'Key error' exception when using as binwalk module #1

Closed malikcjm closed 6 years ago

malikcjm commented 6 years ago

I tried to use cpu_rec from the master branch (commit f1934bc888067c716f39b8856e48ebe99a73aa78, 'Display the value of Shannon entropy when used as a binwalk module: this helps to recognise false detection') as a binwalk module. Every time I tried, the following exception was thrown:

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------

CPUStatisticalDiscovery Exception: 96
----------------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/binwalk/core/module.py", line 597, in main
    retval = self.run()
  File "/home/zbroka/.config/binwalk/modules/cpu_rec.py", line 741, in run
    self.scan_file(fp)
  File "/home/zbroka/.config/binwalk/modules/cpu_rec.py", line 760, in scan_file
    self.shannon(raw[pos:pos+cnt])))
  File "/home/zbroka/.config/binwalk/modules/cpu_rec.py", line 774, in shannon
    seen[byte] += 1
KeyError: 96
----------------------------------------------------------------------------------------------------

Based on how the seen dictionary is used, I made this change to make it work:

diff --git a/cpu_rec.py b/cpu_rec.py
index 5491417..c9ea2eb 100755
--- a/cpu_rec.py
+++ b/cpu_rec.py
@@ -771,7 +771,7 @@ try:
             length = len(data)
             seen = dict(((chr(x), 0) for x in range(0, 256)))
             for byte in data:
-                seen[byte] += 1
+                seen[chr(byte)] += 1
             for x in range(0, 256):
                 p_x = float(seen[chr(x)]) / length
                 if p_x > 0:

LRGH commented 6 years ago

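The shannon function is copied from binwalk and should work with both python2 and python3. My mistake is that in binwalk the argument data is always of type str, while here it is of type str in python2 and bytes in python3. Iterating over bytes in python3 yields integers rather than one-character strings, so the lookup in seen (whose keys are built with chr) fails with KeyError: 96. With your patch, it no longer works with python2.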
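
For illustration only, here is a minimal sketch of an entropy function that accepts str in python2 and bytes in python3. It is not the fix that was committed to cpu_rec. It assumes that wrapping the input in bytearray is acceptable, and it returns entropy in bits per byte, which may differ from the normalization used in binwalk or cpu_rec.

```python
import math


def shannon(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0).

    Hypothetical portable variant, not the code from cpu_rec or binwalk.
    """
    # bytearray() accepts str on python2 and bytes on python3, and
    # iterating over it yields integers in range(256) on both versions.
    buf = bytearray(data)
    if not buf:
        return 0.0
    # Index counts by integer byte value, so no chr() conversion is needed.
    counts = [0] * 256
    for byte in buf:
        counts[byte] += 1
    length = float(len(buf))
    entropy = 0.0
    for count in counts:
        if count:
            p_x = count / length
            entropy -= p_x * math.log(p_x, 2)
    return entropy
```

Because the counters are indexed by integer, the same code path runs on both interpreters. Note that passing a text (unicode) string in python3 would raise a TypeError in bytearray, so callers are assumed to pass raw file contents.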

Thank you for reporting this bug in cpu_rec.