capstone-engine / capstone

Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.
http://www.capstone-engine.org
7.52k stars 1.55k forks source link

[python, x86] Mismatch between operand type and type constant #2274

Open cristianassaiante opened 7 months ago

cristianassaiante commented 7 months ago

Hi,

I just found out that starting from version 5.0.1, the value of CS_OP_MEM (accessed from the python library) has changed, from 3 to 128, potentially causing issues in automated tools that make use of such constant value to check the type of the operands of an instruction.

Below a python run that shows the difference, using as example the x86 instruction mov eax, dword ptr [rsp + 0x14]:

Capstone version 5.0.0

>>> import capstone as cs
>>> md = cs.Cs(cs.CS_ARCH_X86, cs.CS_MODE_64)
>>> md.detail = True
>>> md.disasm("\x8b\x44\x24\x14", 0)
>>> ins = md.disasm(b"\x8b\x44\x24\x14", 0).__next__()
>>> ins
<CsInsn 0x0 [8b442414]: mov eax, dword ptr [rsp + 0x14]>
>>> ins.operands[1].type
3
>>> cs.CS_OP_MEM
3

Capstone version 5.0.1

>>> import capstone as cs
>>> cs.debug()
'python-standard-arm-arm64-bpf-evm-m680x-m68k-mips-mos65xx-ppc-riscv-sparc-sysz-tms320c64x-xcore-x86-c5.0-b5.0'
>>> md.disasm("\x8b\x44\x24\x14", 0)
>>> md.detail = True
>>> ins = md.disasm(b"\x8b\x44\x24\x14", 0).__next__()
>>> ins
<CsInsn 0x0 [8b442414]: mov eax, dword ptr [rsp + 0x14]>
>>> ins.operands[1].type
3
>>> cs.CS_OP_MEM
128

So my question is: is this an issue or is there another constant available to check the operand type?

Thanks in advance for the support.

Rot127 commented 7 months ago

Thanks for pointing this out. Turns out this was accidentally pulled in, because we wanted to have the TriCore architecture being part of v5. TriCore used the new auto-sync updater (will be released with v6) for creation. And architectures using it require to distinguish registers/immediate values used for memory and non memory registers/immediate values.

TriCore though probably doesn't require the value change? @imbillow

Here is the PR reverting it: https://github.com/capstone-engine/capstone/pull/2275

I propose to change it back for v5.0.2? It's annoying to have this inconsistency between those versions. But IMHO it is still better, because people can get a clear pre-AutoSync version and a post-AutoSync version of Capstone. cc @kabeor @XVilka (candidate for https://github.com/capstone-engine/capstone/issues/2081)

XVilka commented 7 months ago

See also https://github.com/capstone-engine/capstone/issues/2144