gdabah / distorm

Powerful Disassembler Library For x86/AMD64

MOVSX -> UNDEFINED #86

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 9 years ago
In what mode did you try to disassemble (16/32/64)?
64
What is the input buffer (binary stream) you used to reproduce the problem?
4b6314b8

What is the expected output (or what instruction)?
MOVSX

Which tool did you use to see the expected output?
$ python
Python 2.7.9 (default, Feb  9 2015, 19:46:45) 
[GCC 4.8.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import distorm3
>>> distorm3.Decompose(0x1123, '4b6314b8'.decode('hex'), 2)[0].mnemonic
'UNDEFINED'
>>> 

What do you see instead?
UNDEFINED

What version of diStorm are you using? On what platform (Python/EXE/other)?
Latest source

Please provide any additional information below.

Original issue reported on code.google.com by felipe.a...@gmail.com on 27 Feb 2015 at 11:28

skrasser commented 8 years ago

I am seeing the same with distorm3 version 3.3.0 from PyPI. Below is a quick repro (note that the full instruction object i still serializes correctly as a string; only the i.mnemonic attribute is affected):

Python 2.7.11 (default, Dec  5 2015, 23:52:42)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import distorm3
>>> code = b'\x48\x63\xc8'
>>> for i in distorm3.DecomposeGenerator(0x1000, code, distorm3.Decode64Bits):
...     print "%s/%s --> %s" % (i.mnemonic, hex(i.opcode), i)
...
UNDEFINED/0x2715 --> MOVSXD RCX, EAX

Looking at the Mnemonics dict in the Python bindings, 0x2715 is indeed missing (MOVSXD is listed under key 0x271d though).
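To illustrate the symptom, here is a minimal sketch of how a lookup against an out-of-date ID-to-name table can produce UNDEFINED. It is a stand-in, not distorm3's actual code: the table name STALE_MNEMONICS and the helper lookup_mnemonic are hypothetical, and the fallback-to-"UNDEFINED" behavior is an assumption inferred from the output above. Only the two IDs come from this thread.

```python
# Illustrative stand-in for an opcode-ID-to-mnemonic table.
# Hypothetical names; this is not distorm3's actual implementation.
STALE_MNEMONICS = {
    0x271d: "MOVSXD",  # key under which MOVSXD is listed in the Python bindings
}


def lookup_mnemonic(opcode_id, table):
    """Return the mnemonic for opcode_id, or 'UNDEFINED' if the key is absent."""
    return table.get(opcode_id, "UNDEFINED")


decoded_id = 0x2715  # opcode ID the decoder reported in the repro above

print(lookup_mnemonic(decoded_id, STALE_MNEMONICS))  # UNDEFINED
print(lookup_mnemonic(0x271d, STALE_MNEMONICS))      # MOVSXD
```

If the bindings work roughly like this, the instruction's string form would be unaffected because it is produced elsewhere, which matches what the repro shows.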

Looking at this Java example, 0x2715 is listed as MOVSXD there while 0x271d is PAUSE (which has yet another key in the Python version).

From mnemonics.h, it looks like the Java code is correct:

Are these typos in the Python version or is this potentially based on an older version of the enum in the C code?

gdabah commented 8 years ago

I sync'ed the tables again. Latest revision should resolve it. Please confirm.

skrasser commented 8 years ago

Not seeing a commit -- did you push? Also, can you release a new version to PyPI by any chance? Thank you!

gdabah commented 8 years ago

Please retry. I will release a new version to PyPI later this weekend.

skrasser commented 8 years ago

Still seeing problems, unfortunately. The fix appears to break the instruction's string output (but the mnemonic output now works):

>>> import distorm3
>>> code = b'\x48\x63\xc8'
>>> for i in distorm3.DecomposeGenerator(0x1000, code, distorm3.Decode64Bits):
...     print "%s/%s --> %s" % (i.mnemonic, hex(i.opcode), i)
...
MOVSXD/0x271b -->  RCX, EAX

Expected output:

MOVSXD/0x271b --> MOVSXD RCX, EAX

Here's what I see in the code:

It looks like mnemonics.h needs to be updated, too.
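One way to catch this kind of mismatch early is a small consistency check between the C enum and the Python table. The sketch below is only an illustration under stated assumptions: HEADER_TEXT is a made-up excerpt in the style of a C enum (the real mnemonics.h layout may differ), PYTHON_TABLE is a hypothetical one-entry table, and parse_header_ids and find_drift are hypothetical helpers. The numeric IDs are the ones quoted earlier in this thread (0x2715 is 10005, 0x271d is 10013).

```python
import re

# Hypothetical excerpt in the style of a C enum; the real header may differ.
HEADER_TEXT = """
typedef enum {
    I_MOVSXD = 10005, I_PAUSE = 10013
} _InstructionType;
"""

# Hypothetical Python-side table using the stale key reported earlier.
PYTHON_TABLE = {0x271d: "MOVSXD"}

# Matches entries such as "I_MOVSXD = 10005" or "I_MOVSXD = 0x2715".
ENUM_ENTRY = re.compile(r"\bI_(\w+)\s*=\s*(0[xX][0-9a-fA-F]+|\d+)")


def parse_header_ids(text):
    """Map mnemonic name -> numeric ID for every I_<NAME> = <value> entry."""
    return {name: int(value, 0) for name, value in ENUM_ENTRY.findall(text)}


def find_drift(header_ids, python_table):
    """List (name, header_id, python_id) for names whose IDs disagree."""
    python_ids = {name: key for key, name in python_table.items()}
    drift = []
    for name, header_id in sorted(header_ids.items()):
        python_id = python_ids.get(name)
        if python_id is not None and python_id != header_id:
            drift.append((name, header_id, python_id))
    return drift


for name, header_id, python_id in find_drift(parse_header_ids(HEADER_TEXT), PYTHON_TABLE):
    print("%s: header=%#x python=%#x" % (name, header_id, python_id))
# prints: MOVSXD: header=0x2715 python=0x271d
```

Names that appear on only one side are skipped here to keep the sketch short; a real check would probably want to report those as well.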

skrasser commented 8 years ago

Update: now works as expected after commit ac277fb -- thank you!

>>> import distorm3
>>> code = b'\x48\x63\xc8'
>>> for i in distorm3.DecomposeGenerator(0x1000, code, distorm3.Decode64Bits):
...     print "%s/%s --> %s" % (i.mnemonic, hex(i.opcode), i)
...
MOVSXD/0x272b --> MOVSXD RCX, EAX

gdabah commented 8 years ago

Yey :)

skrasser commented 8 years ago

Thanks again -- also a PyPI release of the latest fixes would be much appreciated :)