Open Inf0Junki3 opened 4 years ago
keystone is working fine. The tests in the OP are simply incorrect in how they construct bytestrings:
test2_nok = "".join(map(lambda x: chr(x), ks.asm("sub eax, 0x80808080")[0])).encode()
Here, ks.asm correctly returns [0x2d, 0x80, 0x80, 0x80, 0x80]. Then, using chr and "".join(), the above snippet interprets this as a list of Unicode code points. That is, it constructs the string '\u002d\u0080\u0080\u0080\u0080' and then encodes it into UTF-8. All Unicode code points above 0x7f are encoded as multibyte sequences in UTF-8; this is where the 0xc2 bytes come from:

    >>> '\u002d\u0080\u0080\u0080\u0080'.encode('utf-8')
    b'-\xc2\x80\xc2\x80\xc2\x80\xc2\x80'
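To see the expansion value by value, here is a minimal sketch in plain Python (no keystone needed). The sample values are picked only for illustration; 0x7f is the last one that stays a single byte.

```python
# Encode individual code points as UTF-8 and inspect the resulting bytes.
# The sample values are illustrative, not taken from keystone output.
samples = [0x2d, 0x7f, 0x80, 0x81, 0xeb]

# Map each value to the UTF-8 encoding of the character chr(value).
utf8_forms = {value: chr(value).encode("utf-8") for value in samples}

for value, encoded in utf8_forms.items():
    # Values up to 0x7f stay one byte; 0x80 and above become two bytes.
    print(f"0x{value:02x} -> {encoded.hex()} ({len(encoded)} byte(s))")
```

Every value from 0x80 to 0xff picks up a 0xc2 or 0xc3 lead byte this way, so any instruction encoding containing such a byte is affected, not just immediates.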
The correct way to convert a list of integers into a bytes object is to use the bytes constructor:

    >>> bytes(ks.asm("sub eax, 0x80808080")[0])
    b'-\x80\x80\x80\x80'
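The two conversions can also be compared side by side without keystone installed. This is only a sketch: the list is hard-coded from the values quoted above for ks.asm("sub eax, 0x80808080")[0], and the helper names to_bytes_wrong and to_bytes_right are hypothetical, not part of keystone.

```python
# Byte values quoted above for ks.asm("sub eax, 0x80808080")[0];
# hard-coded so the sketch runs without keystone installed.
encoding = [0x2d, 0x80, 0x80, 0x80, 0x80]


def to_bytes_wrong(values):
    """Hypothetical helper: the chr/join/encode approach from the test."""
    # str.encode() defaults to UTF-8, which expands values above 0x7f.
    return "".join(chr(v) for v in values).encode()


def to_bytes_right(values):
    """Hypothetical helper: direct construction with bytes()."""
    return bytes(values)


wrong = to_bytes_wrong(encoding)
right = to_bytes_right(encoding)

print(len(wrong), wrong)  # 9 bytes, with 0xc2 before every 0x80
print(len(right), right)  # 5 bytes, the actual machine code
```

If a str has already been built with chr, encoding it as 'latin-1' instead of UTF-8 also maps each code point from 0 to 255 onto a single byte, but calling bytes() on the list directly is simpler.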
Hi,
I'm experiencing a strange issue when assembling SUB operations. If I try to subtract any byte value of 0x80 or over, keystone inserts an extra 0xc2 in the bytecode. I've set up a quick test here:
I've also noticed that if I try to sub ebx, 0x7f7f7f7f, the bytecode also has a 0xc2 in it -- this seems to be because the bytecode would legitimately be 0x81eb7f7f7f7f, I believe. I've tested what I've found here with a few other assemblers.

The version of keystone engine I have installed is:
Hope this helps! And kudos for an awesome library.
ADDENDUM:
Looking into this a bit more, it seems that the source of the issue is with the python bindings. This is the output from the kstool: