viper-framework / viper-modules

calculate_pehash throws exception for PE+ file format #5

Open kohnakagawa opened 4 years ago

kohnakagawa commented 4 years ago

In the current implementation of calculate_pehash, the step commented "pad to 16 bits" (https://github.com/viper-framework/viper-modules/blob/d21d555ccc86e3d3e2bbc823902e394357d1ab83/pehash/pehasher.py#L25-L28) does not actually pad the value to 16 bits. If exe.FILE_HEADER.Characteristics is 0x22 (e.g., a PE+ EXE), the resulting BitArray is only 8 bits long, so img_chars[8:16] is empty and the XOR raises "ValueError: Bitstrings must have the same length for ^ operator.", as shown in the following traceback:

                #pad to 16 bits
                img_chars = bitstring.BitArray(bytes=img_chars.tobytes())
-->             img_chars_xor = img_chars[0:8] ^ img_chars[8:16]

/usr/local/lib/python3.5/dist-packages/bitstring-3.1.5-py3.5.egg/bitstring.py in __xor__(self, bs)
   1128         bs = Bits(bs)
   1129         if self.len != bs.len:
-> 1130             raise ValueError("Bitstrings must have the same length "
   1131                              "for ^ operator.")
   1132         s = self._copy()

ValueError: Bitstrings must have the same length for ^ operator.
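
The length mismatch can be reproduced without pefile or bitstring. The following is a minimal sketch that models bit strings as plain Python strings of '0' and '1'. The helper names hex_to_bits and pad_to_byte_boundary are hypothetical and only imitate the two bitstring calls used in pehasher.py:

```python
def hex_to_bits(value: int) -> str:
    """Imitate bitstring.BitArray(hex(value)): 4 bits per hex digit, no padding."""
    digits = format(value, "x")
    return "".join(format(int(d, 16), "04b") for d in digits)


def pad_to_byte_boundary(bits: str) -> str:
    """Imitate BitArray(bytes=b.tobytes()): zero-pad at the END up to a whole byte."""
    return bits + "0" * (-len(bits) % 8)


# 0x22 has two hex digits, so it is already byte aligned at 8 bits.
bits = pad_to_byte_boundary(hex_to_bits(0x22))
upper, lower = bits[0:8], bits[8:16]

# The second slice is empty, which is why the XOR operands differ in length.
print(len(bits), len(upper), len(lower))  # prints: 8 8 0
```

Because tobytes() only aligns to a byte boundary, a value that fits in one byte stays at 8 bits and never reaches 16.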

I think these lines should be fixed as follows.

        # image characteristics
        img_chars = bitstring.BitArray(hex(exe.FILE_HEADER.Characteristics))
        # pad to 16 bits
        # img_chars = bitstring.BitArray(bytes=img_chars.tobytes())  # <- this line does not pad to 16 bits
        # .bin is a str, so wrap the zero-filled string back into a BitArray before the XOR
        img_chars = bitstring.BitArray(bin=img_chars.bin.zfill(16))  # <- correct 16-bit padding
        img_chars_xor = img_chars[0:8] ^ img_chars[8:16]
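
The padding logic can also be checked on plain integers, without bitstring. This is a minimal sketch, assuming the goal is to XOR the upper and lower bytes of the 16-bit Characteristics field. The name fold_characteristics is hypothetical and not part of pehasher.py. Note that .bin returns a plain str, which does not support ^, so a zero-filled string has to be converted back to a BitArray (or to integers, as here) before the XOR.

```python
def fold_characteristics(characteristics: int) -> int:
    """XOR the upper and lower bytes of a value left-padded to 16 bits."""
    # Left-pad with zeros to exactly 16 binary digits.
    bits = format(characteristics & 0xFFFF, "016b")
    upper, lower = bits[0:8], bits[8:16]
    # Both halves are always 8 digits long, so the XOR is well defined.
    return int(upper, 2) ^ int(lower, 2)


# 0x22 fits in one byte: the upper byte is 0x00, so the result is 0x22.
print(hex(fold_characteristics(0x22)))    # prints: 0x22
# 0x0102: upper byte 0x01 XOR lower byte 0x02.
print(hex(fold_characteristics(0x0102)))  # prints: 0x3
```

With bitstring itself, bitstring.BitArray(uint=value, length=16) is another way to build a fixed 16-bit array directly from the integer.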

Is this the intended behavior of the calculate_pehash function?