capstone-engine / capstone

Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.
http://www.capstone-engine.org

[TMS320C64X] whitespace in mnemonic #1418

Closed tmfink closed 5 years ago

tmfink commented 5 years ago

This bug is easy to see with a modification to test_tms320c64x.c:

diff --git a/tests/test_tms320c64x.c b/tests/test_tms320c64x.c
index 86a6d76d..6f725c26 100644
--- a/tests/test_tms320c64x.c
+++ b/tests/test_tms320c64x.c
@@ -166,6 +166,7 @@ static void test()

                        for (j = 0; j < count; j++) {
                                printf("0x%"PRIx64":\t%s\t%s\n", insn[j].address, insn[j].mnemonic, insn[j].op_str);
+                               printf("'%s'\n", insn[j].mnemonic);
                                print_insn_detail(&insn[j]);
                        }
                        printf("0x%"PRIx64":\n", insn[j-1].address + insn[j-1].size);

Output:
****************
Platform: TMS320C64x
Code:0x01 0xac 0x88 0x40 0x81 0xac 0x88 0x43 0x00 0x00 0x00 0x00 0x02 0x90 0x32 0x96 0x02 0x80 0x46 0x9e 0x05 0x3c 0x83 0xe6 0x0b 0x0c 0x8b 0x24 
Disasm:
0x1000:       add.D1    a11, a4, a3
'      add.D1'
        op_count: 3
                operands[0].type: REG = a11
                operands[1].type: REG = a4
                operands[2].type: REG = a3
        Functional unit: D1
        Parallel: false
...

This was discovered while working on the Rust language bindings https://github.com/capstone-rust/capstone-rs/pull/60

aquynh commented 5 years ago

Looks like an extra tab. Can you send a pull request?

@fotisl please ack.

tmfink commented 5 years ago

Also, it looks like insn.mnemonic and cs_insn_name(handle, insn.id) give different output:

diff --git a/tests/test_tms320c64x.c b/tests/test_tms320c64x.c
index 86a6d76d..c0a7eeff 100644
--- a/tests/test_tms320c64x.c
+++ b/tests/test_tms320c64x.c
@@ -166,6 +166,8 @@ static void test()

                        for (j = 0; j < count; j++) {
                                printf("0x%"PRIx64":\t%s\t%s\n", insn[j].address, insn[j].mnemonic, insn[j].op_str);
+                               printf("\tmnemonic='%s'\n", insn[j].mnemonic);
+                               printf("\tcs_insn_name='%s'\n", cs_insn_name(handle, insn[j].id));
                                print_insn_detail(&insn[j]);
                        }
                        printf("0x%"PRIx64":\n", insn[j-1].address + insn[j-1].size);

$ ./tests/test_tms320c64x.static
****************
Platform: TMS320C64x
Code:0x01 0xac 0x88 0x40 0x81 0xac 0x88 0x43 0x00 0x00 0x00 0x00 0x02 0x90 0x32 0x96 0x02 0x80 0x46 0x9e 0x05 0x3c 0x83 0xe6 0x0b 0x0c 0x8b 0x24 
Disasm:
0x1000:       add.D1    a11, a4, a3
        mnemonic='      add.D1'
        cs_insn_name='add'
        op_count: 3
                operands[0].type: REG = a11
                operands[1].type: REG = a4
                operands[2].type: REG = a3
        Functional unit: D1
        Parallel: false

0x1004: [ a1] add.D2    b11, b4, b3     ||
        mnemonic='[ a1] add.D2'
        cs_insn_name='add'
        op_count: 3
                operands[0].type: REG = b11
                operands[1].type: REG = b4
                operands[2].type: REG = b3
        Functional unit: D2
        Condition: [ a1]
        Parallel: true

0x1008:       NOP       
        mnemonic='      NOP'
        cs_insn_name='nop'
        Functional unit: No Functional Unit
        Parallel: false

0x100c:       ldbu.D1T2 *++a4[1], b5
        mnemonic='      ldbu.D1T2'
        cs_insn_name='ldbu'
        op_count: 2
                operands[0].type: MEM
                        operands[0].mem.base: REG = a4
                        operands[0].mem.disptype: Constant
                        operands[0].mem.disp: 1
                        operands[0].mem.unit: 2
                        operands[0].mem.direction: Forward
                        operands[0].mem.modify: Pre
                        operands[0].mem.scaled: 1
                operands[1].type: REG = b5
        Functional unit: D2
        Parallel: false

0x1010:       ldbu.D2T2 *+b15[0x46], b5
        mnemonic='      ldbu.D2T2'
        cs_insn_name='ldbu'
        op_count: 2
                operands[0].type: MEM
                        operands[0].mem.base: REG = b15
                        operands[0].mem.disptype: Constant
                        operands[0].mem.disp: 70
                        operands[0].mem.unit: 2
                        operands[0].mem.direction: Forward
                        operands[0].mem.modify: No
                        operands[0].mem.scaled: 0
                operands[1].type: REG = b5
        Functional unit: D2
        Parallel: false

0x1014:       lddw.D1T2 *+a15[4], b11:b10
        mnemonic='      lddw.D1T2'
        cs_insn_name='lddw'
        op_count: 2
                operands[0].type: MEM
                        operands[0].mem.base: REG = a15
                        operands[0].mem.disptype: Constant
                        operands[0].mem.disp: 4
                        operands[0].mem.unit: 2
                        operands[0].mem.direction: Forward
                        operands[0].mem.modify: No
                        operands[0].mem.scaled: 1
                operands[1].type: REGPAIR = b11:b10
        Functional unit: D2
        Parallel: false

0x1018:       ldndw.D1T1        *+a3(a4), a23:a22
        mnemonic='      ldndw.D1T1'
        cs_insn_name='ldndw'
        op_count: 2
                operands[0].type: MEM
                        operands[0].mem.base: REG = a3
                        operands[0].mem.disptype: Register
                        operands[0].mem.disp: a4
                        operands[0].mem.unit: 1
                        operands[0].mem.direction: Forward
                        operands[0].mem.modify: No
                        operands[0].mem.scaled: 0
                operands[1].type: REGPAIR = a23:a22
        Functional unit: D1
        Parallel: false

0x101c:

aquynh commented 5 years ago

@fotisl please ack.

fotisl commented 5 years ago

Note that for the first instruction, the instruction name is indeed add, but the mnemonic is add.D1, because the mnemonic is the instruction name plus the functional unit on which the instruction executes. This is not a bug; it is how this architecture works.

As for the spaces, they are there to accommodate conditional instructions. The original documentation uses this indentation so that the assembly stays aligned when lines like "[ a1] add.D2" and "      add.D2" appear together. Whether to remove the whitespace depends on whether capstone wants plain output or output formatted per the processor manual.
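
For callers (such as language bindings) that want the plain pieces regardless of how this is resolved, here is a minimal sketch of splitting the mnemonic text on the client side. It assumes the layout visible in the output earlier in this thread: optional padding, an optional bracketed condition such as "[ a1]", then the instruction name, optionally followed by "." and the functional unit. The struct c64x_mnemonic and the functions copy_span and split_c64x_mnemonic are hypothetical names for illustration and are not part of the Capstone API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper type, not part of Capstone. */
struct c64x_mnemonic {
    char cond[16]; /* e.g. "[ a1]"; empty when the instruction is unconditional */
    char name[16]; /* e.g. "add" */
    char unit[16]; /* e.g. "D1"; empty when no unit suffix is present */
};

/* Copy at most dst_size - 1 bytes from src and always NUL-terminate. */
static void copy_span(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (len >= dst_size)
        len = dst_size - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/*
 * Split a mnemonic string shaped like the ones printed in this thread,
 * for example "      add.D1" or "[ a1] add.D2".
 * Assumption: the layout is inferred from that output only.
 */
static void split_c64x_mnemonic(const char *mnemonic, struct c64x_mnemonic *out)
{
    const char *p = mnemonic;
    const char *dot;

    assert(mnemonic != NULL && out != NULL);
    out->cond[0] = out->name[0] = out->unit[0] = '\0';

    /* Skip the alignment padding used for unconditional instructions. */
    while (*p == ' ' || *p == '\t')
        p++;

    /* An optional condition is enclosed in square brackets. */
    if (*p == '[') {
        const char *end = strchr(p, ']');
        if (end != NULL) {
            copy_span(out->cond, sizeof out->cond, p, (size_t)(end - p + 1));
            p = end + 1;
            while (*p == ' ' || *p == '\t')
                p++;
        }
    }

    /* The text before the first '.' is the name; the rest is the unit. */
    dot = strchr(p, '.');
    if (dot != NULL) {
        copy_span(out->name, sizeof out->name, p, (size_t)(dot - p));
        copy_span(out->unit, sizeof out->unit, dot + 1, strlen(dot + 1));
    } else {
        copy_span(out->name, sizeof out->name, p, strlen(p));
    }
}
```

With this sketch, "[ a1] add.D2" splits into the condition "[ a1]", the name "add" and the unit "D2", while "      NOP" yields only the name "NOP". Where the instruction handle is available, cs_insn_name(handle, insn.id) remains the more direct way to get the bare name, as the output above shows.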