cea-sec / miasm

Reverse engineering framework in Python
https://miasm.re/
GNU General Public License v2.0
3.46k stars 473 forks source link

x86_64 Cannot disasm mov with segment #1375

Closed KKomarov closed 3 years ago

KKomarov commented 3 years ago

Failed to disasm mov qword ptr es:[rax], rdx (26 48 48 89 10 in hex) Tested on recent master https://github.com/cea-sec/miasm/commit/42528eff158475d1eca930b06754003242d5f1c1

from miasm.analysis.machine import Machine

machine = Machine('x86_64')
machine.mn.dis(bytearray.fromhex('26 48 48 89 10'), 64)

Traceback (most recent call last):
  File "<input>", line 3, in <module>
  File "..\miasm-repo\miasm\core\cpu.py", line 1276, in dis
    raise Disasm_Exception('cannot disasm at %X' % offset_o)
miasm.core.utils.Disasm_Exception: cannot disasm at 0
serpilliere commented 3 years ago

Hi @KKomarov If I read correctly your assebmyl, it's:

26 48 48 89 10 

which is composed of:

Maybe I am wrong here, but I think you can have only one (at maximum) REX prefix, so this instruction seems not to be legit as it's composed of two REX.

Have I missed something @KKomarov ?

serpilliere commented 3 years ago

Hum, I just get this on https://wiki.osdev.org/X86-64_Instruction_Encoding#REX_prefix:

The use of multiple REX prefixes is undefined, although processors seem to use only the last REX prefix. 

The current Miasm implementation only take one prefix. Maybe we could to some modifications in order to interpret the last REX in case of multiple REX

KKomarov commented 3 years ago

@serpilliere I just take this from real binary. It seems like a part of anti-analysing feature. I tested Ida and Zydis they treat this fine

KKomarov commented 3 years ago

@serpilliere Is it an easy fix? If you can tell what should be done I can make a PR

serpilliere commented 3 years ago

Hi @KKomarov yep: the code here: https://github.com/cea-sec/miasm/blob/master/miasm/arch/x86/arch.py#L754 takes the last byte read, (which is not a prefix, so may be a rex) and if this is a rex it modify the instruciton value. Maybe it's around there. We may modify this to do a second loop and parse while it's a REX.

If you want to try the modification, go for it @KKomarov. If not, I may look t this by the end of the week

KKomarov commented 3 years ago

Fixed