joxeankoret / diaphora

Diaphora, the most advanced Free and Open Source program diffing tool.
http://diaphora.re
GNU Affero General Public License v3.0
3.62k stars 372 forks

diaphora crashes with IDA 7.6? #223

Closed Lakr233 closed 3 years ago

Lakr233 commented 3 years ago
Error: 'utf-8' codec can't decode bytes in position 1102-1103: invalid continuation byte
Traceback (most recent call last):
  File "C:/Users/qaq/Documents/diaphora-2.0.6\diaphora_ida.py", line 2371, in _diff_or_export
    bd.export()
  File "C:/Users/qaq/Documents/diaphora-2.0.6\diaphora_ida.py", line 846, in export
    self.do_export(crashed_before)
  File "C:/Users/qaq/Documents/diaphora-2.0.6\diaphora_ida.py", line 801, in do_export
    props = self.read_function(func)
  File "C:/Users/qaq/Documents/diaphora-2.0.6\diaphora_ida.py", line 1630, in read_function
    disasm = GetDisasm(x)
  File "C:\Users\qaq\Documents\IDA Pro 7.6\python\3\idc.py", line 1540, in GetDisasm
    return generate_disasm_line(ea, 0)
  File "C:\Users\qaq\Documents\IDA Pro 7.6\python\3\idc.py", line 1514, in generate_disasm_line
    text = ida_lines.generate_disasm_line(ea, flags)
  File "C:\Users\qaq\Documents\IDA Pro 7.6\python\3\ida_lines.py", line 664, in generate_disasm_line
    return _ida_lines.generate_disasm_line(*args)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 1102-1103: invalid continuation byte

I'm getting this error frequently with Chinese apps. Any idea how to fix it, or is there anything I can help with?
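
For readers unfamiliar with this message, here is a minimal sketch of how "invalid continuation byte" can arise. It assumes, purely for illustration, that the offending bytes are text in a non-UTF-8 encoding such as GBK; the thread does not confirm what the bytes in the binary actually are. The helper name `try_decode` is hypothetical.

```python
# Assumption: the bytes are GBK-encoded text being read as UTF-8.
# The thread does not establish the real encoding of the failing bytes.
raw = "中文".encode("gbk")  # b'\xd6\xd0\xce\xc4'


def try_decode(data: bytes) -> str:
    """Decode as UTF-8, reporting the failure reason instead of raising."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # 0xD6 announces a 2-byte sequence, but 0xD0 is not a valid
        # continuation byte (those lie in the range 0x80-0xBF).
        return f"failed: {exc.reason}"


result = try_decode(raw)

# A lenient decode substitutes U+FFFD for undecodable bytes instead of raising.
lenient = raw.decode("utf-8", errors="replace")
```

Strict decoding fails on the first malformed sequence, while `errors="replace"` trades fidelity for robustness.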

joxeankoret commented 3 years ago

That's odd. Can you share an IDA database or binary with which I can reproduce it? I have never seen that error.

Lakr233 commented 3 years ago

I can provide you with a binary. I'm currently using IDA 7.6 and the newest Python 3.

https://mega.nz/file/VPwSWL4S#0ox7j1z3-NA-aAxXRRqfMs5jWK62IFLG_BgSeVS_FUM

joxeankoret commented 3 years ago

Thank you! Looking into it...

joxeankoret commented 3 years ago

Err... the binary has "only" 700K functions. I "might" need quite some time to fix this thing...

joxeankoret commented 3 years ago

I'm trying to find a workaround, but it looks to me like you will need to talk to the Hex-Rays folks, as simply doing this crashes:

GetDisasm(0x100E66890)

Lakr233 commented 3 years ago

Got it! Take your time, bro.

By the way, I'm also running into an analysis loop with IDA's "intelligence" scan 🤪, which seems to be a bug with control-flow-flattened functions.

joxeankoret commented 3 years ago

I have added a workaround for this bug that works with the given binary. I believe it's an IDA Python bug but, until it is fixed, people can use this version, which works around it.

joxeankoret commented 3 years ago

The workaround is in this commit: 55e7822322e72c2caa3b9cbf4f382f0542c6757e
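
As a rough illustration of what a workaround for this kind of failure might look like, the sketch below wraps a disassembly call so that a `UnicodeDecodeError` at one address does not abort the whole export. It is not taken from the commit above, whose contents are not shown in this thread. The names `safe_disasm` and `fake_get_disasm` are hypothetical, and the stand-in function only simulates the failure so the example runs outside IDA; inside IDA one would pass `idc.GetDisasm` instead.

```python
from typing import Callable

BAD_EA = 0x100E66890  # the address reported as failing in this thread


def safe_disasm(ea: int, get_disasm: Callable[[int], str], fallback: str = "") -> str:
    """Return the disassembly text for ea, or fallback if decoding fails."""
    try:
        return get_disasm(ea)
    except UnicodeDecodeError:
        # Skip the undecodable line rather than aborting the export.
        return fallback


def fake_get_disasm(ea: int) -> str:
    """Hypothetical stand-in for idc.GetDisasm that fails at BAD_EA."""
    if ea == BAD_EA:
        # Decoding malformed UTF-8 raises UnicodeDecodeError, as in the traceback.
        return b"\xe4\xb8\x28".decode("utf-8")
    return "mov x0, x1"


good = safe_disasm(0x100000000, fake_get_disasm)
bad = safe_disasm(BAD_EA, fake_get_disasm, fallback="<undecodable>")
```

Returning a fallback string keeps the export running at the cost of losing the disassembly text for the affected instruction.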