danielplohmann / smda

SMDA is a minimalist recursive disassembler library that is optimized for accurate Control Flow Graph (CFG) recovery from memory dumps.
BSD 2-Clause "Simplified" License

Windows: struct.error: argument out of range #20

Closed by mr-tz 3 years ago

mr-tz commented 3 years ago

I'm not sure I located the right error source here, but when running SMDA on Windows we get two exceptions in capa; see https://github.com/fireeye/capa/pull/470/checks?check_run_id=2277082296#step:7:28

================================== FAILURES ===================================
_ test_smda_features[al-khaser x64-function=0x14004B4F0-api(__vcrt_GetModuleHandle)-True] _

ins = <smda.common.SmdaInstruction.SmdaInstruction object at 0x0000025A8DC23550>

    @staticmethod
    def escapeBinaryPtrRef(ins):
        escaped_sequence = ins.bytes
        addr_match = re.search(r"\[(rip (\+|\-) )?(?P<dword_offset>0x[a-fA-F0-9]+)\]", ins.operands)
        if addr_match:
            offset = int(addr_match.group("dword_offset"), 16)
            if "rip -" in ins.operands:
                offset = 0x100000000 - offset
            #TODO we need to check if this is actually a 64bit absolute offset (e.g. used by movabs)
            try:
>               packed_hex = str(codecs.encode(struct.pack("I", offset), 'hex').decode('ascii'))
E               struct.error: argument out of range

c:\hostedtoolcache\windows\python\3.9.2\x64\lib\site-packages\smda\intel\IntelInstructionEscaper.py:334: error

During handling of the above exception, another exception occurred:

self = <smda.Disassembler.Disassembler object at 0x0000025A97BF9310>
file_path = 'D:\\a\\capa\\capa\\tests\\data\\al-khaser_x64.exe_', pdb_path = ''

    def disassembleFile(self, file_path, pdb_path=""):
        loader = FileLoader(file_path, map_file=True)
        file_content = loader.getData()
        binary_info = BinaryInfo(file_content)
        binary_info.raw_data = loader.getRawData()
        binary_info.file_path = file_path
        binary_info.base_addr = loader.getBaseAddress()
        binary_info.bitness = loader.getBitness()
        binary_info.code_areas = loader.getCodeAreas()
        start = datetime.datetime.utcnow()
        try:
            self.disassembler.addPdbFile(binary_info, pdb_path)
>           smda_report = self._disassemble(binary_info, timeout=self.config.TIMEOUT)

c:\hostedtoolcache\windows\python\3.9.2\x64\lib\site-packages\smda\Disassembler.py:52: 
...
ERROR    smda.Disassembler:Disassembler.py:56 An error occurred while disassembling file.
_ test_smda_features[a1982...-function=0x4014D0-characteristic(cross section flow)-True] _

ins = <smda.common.SmdaInstruction.SmdaInstruction object at 0x0000025A9AA2A9A0>

    @staticmethod
    def escapeBinaryPtrRef(ins):
        escaped_sequence = ins.bytes
        addr_match = re.search(r"\[(rip (\+|\-) )?(?P<dword_offset>0x[a-fA-F0-9]+)\]", ins.operands)
        if addr_match:
            offset = int(addr_match.group("dword_offset"), 16)
            if "rip -" in ins.operands:
                offset = 0x100000000 - offset
            #TODO we need to check if this is actually a 64bit absolute offset (e.g. used by movabs)
            try:
>               packed_hex = str(codecs.encode(struct.pack("I", offset), 'hex').decode('ascii'))
E               struct.error: argument out of range

c:\hostedtoolcache\windows\python\3.9.2\x64\lib\site-packages\smda\intel\IntelInstructionEscaper.py:334: error

During handling of the above exception, another exception occurred:

self = <smda.Disassembler.Disassembler object at 0x0000025A9B4FFAC0>
file_path = 'D:\\a\\capa\\capa\\tests\\data\\a198216798ca38f280dc413f8c57f2c2.exe_'
pdb_path = ''

    def disassembleFile(self, file_path, pdb_path=""):
        loader = FileLoader(file_path, map_file=True)
        file_content = loader.getData()
        binary_info = BinaryInfo(file_content)
        binary_info.raw_data = loader.getRawData()
        binary_info.file_path = file_path
        binary_info.base_addr = loader.getBaseAddress()
        binary_info.bitness = loader.getBitness()
        binary_info.code_areas = loader.getCodeAreas()
        start = datetime.datetime.utcnow()
        try:
            self.disassembler.addPdbFile(binary_info, pdb_path)
>           smda_report = self._disassemble(binary_info, timeout=self.config.TIMEOUT)

c:\hostedtoolcache\windows\python\3.9.2\x64\lib\site-packages\smda\Disassembler.py:52: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <smda.Disassembler.Disassembler object at 0x0000025A9B4FFAC0>
binary_info = <smda.common.BinaryInfo.BinaryInfo object at 0x0000025A9B4FFE20>
timeout = 300

    def _disassemble(self, binary_info, timeout=0):
        self._start_time = datetime.datetime.utcnow()
        self._timeout = timeout
        self.disassembly = self.disassembler.analyzeBuffer(binary_info, self._callbackAnalysisTimeout)
>       return SmdaReport(self.disassembly, config=self.config)

c:\hostedtoolcache\windows\python\3.9.2\x64\lib\site-packages\smda\Disassembler.py:101: 
...
danielplohmann commented 3 years ago

The second one is definitely addressed, but I did not manage to reproduce the first one. :-/ I have a rough idea of the circumstances under which it might occur, but I could not trigger that error even with a fresh Python installation on Windows. :( It also does not seem to appear on Linux at all, which is pretty weird.

mr-tz commented 3 years ago

Yes, really weird indeed! Thanks for the quick fix, will let you know if we find out more about this.
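
For reference, the `struct.error` in the traceback can be reproduced in isolation: `struct.pack("I", value)` only accepts values from 0 to 0xFFFFFFFF, so it raises for a 64-bit absolute offset (the case the TODO comment mentions) and for any negative result of `0x100000000 - offset`. The sketch below is not the fix that went into SMDA; `fails_as_uint32` and `pack_offset_hex` are hypothetical helpers for illustration, and the explicit little-endian `"<I"`/`"<Q"` formats are an assumption (the original code uses the native `"I"` format).

```python
import codecs
import struct

UINT32_MAX = 0xFFFFFFFF
UINT64_MAX = 0xFFFFFFFFFFFFFFFF


def fails_as_uint32(value):
    """Return True if struct.pack("I", value) raises struct.error."""
    try:
        struct.pack("I", value)
    except struct.error:
        return True
    return False


def pack_offset_hex(offset):
    """Hypothetical helper: hex string of the offset, packed little-endian.

    Uses 4 bytes when the offset fits into 32 bits and 8 bytes otherwise,
    instead of assuming every offset is a dword.
    """
    if 0 <= offset <= UINT32_MAX:
        fmt = "<I"
    elif 0 <= offset <= UINT64_MAX:
        fmt = "<Q"
    else:
        raise ValueError("offset does not fit into 64 bits: %r" % (offset,))
    return codecs.encode(struct.pack(fmt, offset), "hex").decode("ascii")


if __name__ == "__main__":
    # A 64-bit absolute address does not fit the "I" format.
    print(fails_as_uint32(0x140001000))
    # The "rip -" adjustment yields 0x100000000 for an offset of 0,
    # which is one past the largest unsigned 32-bit value.
    print(fails_as_uint32(0x100000000 - 0))
    print(pack_offset_hex(0x1000))
    print(pack_offset_hex(0x140001000))
```

Whether one of these value ranges is what actually triggered the failure on the Windows runner is not established in this thread; the sketch only shows which inputs make the quoted `struct.pack` call fail.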