capstone-engine / capstone

Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.
http://www.capstone-engine.org

PowerPC: branches and conditionals do not agree on implicit operands/registers #2044

Open Kaltxi opened 1 year ago

Kaltxi commented 1 year ago

Hello! It seems that in PPC the branch instructions report incorrect implicit register accesses, while some conditional instructions report none at all. I expect branches and conditionals to agree: conditionals write to the condition register or CTR (when applicable), while branches read the corresponding register.

Here are some examples:

beq

Currently:

Reads: [CTR, RM]
Writes: [CTR]
Operands: [IMM]

Expected:

Reads: [CR0]
Writes: []
Operands: [IMM]

Not sure what the RM register is or whether it is warranted here.

blt (explicit)

Currently:

Reads: [CTR, RM]
Writes: [CTR]
Operands: [CR7, IMM]

Expected:

Reads: []
Writes: []
Operands: [CR7, IMM]

Here CR7 is correctly determined as it is an explicit operand, but the incorrect implicit registers are still present.

bdnz

Currently and expected:

Reads: [CTR]
Writes: [CTR]
Operands: [IMM]

Here the implicit CTR read and write are correct, but there is a different problem: bdnz is not identified as a conditional branch and is not given a bc value.

cmpwi

Currently:

Reads: []
Writes: []
Operands: [REG, IMM]

Expected:

Reads: []
Writes: [CR0]
Operands: [REG, IMM]
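To illustrate why a CR0 write is expected here, the following minimal sketch splits the cmpwi encoding reported later in this thread (0x2c 0x03 0x00 0x00) into its D-form fields by hand. It does not use Capstone; the helper name `cmpi_fields` and the dictionary keys are hypothetical and chosen only for this example.

```python
def cmpi_fields(raw: bytes) -> dict:
    """Split a big-endian D-form cmpi word into its fields (illustrative helper)."""
    word = int.from_bytes(raw, "big")
    if word >> 26 != 11:  # primary opcode 11 = cmpi (cmpwi is cmpi with L = 0)
        raise ValueError("not a cmpi encoding")
    simm = word & 0xFFFF
    if simm & 0x8000:  # sign-extend the 16-bit immediate
        simm -= 0x10000
    return {
        "crf": (word >> 23) & 0x7,   # destination CR field
        "l": (word >> 21) & 0x1,     # 0 = 32-bit compare
        "ra": (word >> 16) & 0x1F,   # source GPR
        "simm": simm,
    }


fields = cmpi_fields(bytes([0x2C, 0x03, 0x00, 0x00]))  # cmpwi r3, 0
# The compare result lands in the CR field named by "crf", which is implicit
# in the short "cmpwi r3, 0" form.
implicit_writes = [f"CR{fields['crf']}"]
print(fields, implicit_writes)
```

The destination field is encoded in the instruction word even when the assembly form omits it, which is why it seems reasonable to expect it among the written registers.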

rlwinm.

Currently and expected:

Reads: []
Writes: [CR0]
Operands: [REG, REG, IMM, IMM, IMM]

Here CR0 is correctly determined for an instruction with the dot suffix, in addition to the update_cr0 flag being set.
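For reference, the dot suffix corresponds to the Rc bit, the least significant bit of the M-form word. The sketch below decodes the rlwinm. bytes reported later in this thread by hand, without Capstone; `rlwinm_fields` is a hypothetical helper name. Note that the lowest bit is only an Rc bit in instruction forms that define one, so the opcode is checked first.

```python
def rlwinm_fields(raw: bytes) -> dict:
    """Split a big-endian M-form rlwinm word into its fields (illustrative helper)."""
    word = int.from_bytes(raw, "big")
    if word >> 26 != 21:  # primary opcode 21 = rlwinm
        raise ValueError("not an rlwinm encoding")
    return {
        "rs": (word >> 21) & 0x1F,  # source register
        "ra": (word >> 16) & 0x1F,  # destination register
        "sh": (word >> 11) & 0x1F,  # rotate amount
        "mb": (word >> 6) & 0x1F,   # mask begin
        "me": (word >> 1) & 0x1F,   # mask end
        "rc": word & 0x1,           # record bit: 1 means CR0 is updated
    }


fields = rlwinm_fields(bytes([0x54, 0x23, 0xF0, 0x03]))  # rlwinm. r3, r1, 0x1e, 0, 1
implicit_writes = ["CR0"] if fields["rc"] else []
print(fields, implicit_writes)
```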

fcmpu (explicit)

Currently and expected:

Reads: []
Writes: []
Operands: [CR7, REG, REG]

Here CR7 is correctly determined as it is an explicit operand.

P.S. There is also a related issue: operands do not support access flags as they do in ARM, for example; only implicit accesses are determined. But I suppose that feature is simply unsupported for PPC at the moment.

Rot127 commented 1 year ago

Thanks for the detailed report. All of these problems (incorrect implicit registers, incorrect operands, missing operand access information, and incorrect instruction groups) are likely solved by https://github.com/capstone-engine/capstone/pull/2013.

https://github.com/capstone-engine/capstone/pull/2013 will also expose the branch condition information in much more detail (access to the bi and bo fields is added).

Could you please provide instruction bytes for each example? I'd like to add them as test cases.

Kaltxi commented 1 year ago

Ok, thanks for the info, will be looking forward to it! Here are some bytes (some differ from the examples above, sorry, I can't find those again, but the output for operands/registers is the same):

beq 0xfff0127c: 0x41 0x82 0x00 0x3c
blt cr7, 0xff71e3f4: 0x41 0x9c 0x00 0x08
bdnz 0xffe00028: 0x42 0x00 0xff 0x5c
cmpwi r3, 0: 0x2c 0x03 0x00 0x00
rlwinm. r3, r1, 0x1e, 0, 1: 0x54 0x23 0xf0 0x03
fcmpu cr7, f1, f4: 0xff 0x81 0x20 0x00
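As a cross-check for anyone turning these bytes into test cases, here is a minimal sketch that extracts the B-form fields of the three branch encodings by hand. It is independent of Capstone; `bc_fields` and `CR_BITS` are hypothetical names used only for illustration.

```python
CR_BITS = ("lt", "gt", "eq", "so")  # bit order inside one 4-bit CR field


def bc_fields(raw: bytes) -> dict:
    """Split a big-endian B-form bc word into its fields (illustrative helper)."""
    word = int.from_bytes(raw, "big")
    if word >> 26 != 16:  # primary opcode 16 = bc
        raise ValueError("not a bc encoding")
    bi = (word >> 16) & 0x1F
    disp = word & 0xFFFC  # BD field with the two low bits (AA, LK) cleared
    if disp & 0x8000:     # sign-extend the 16-bit displacement
        disp -= 0x10000
    return {
        "bo": (word >> 21) & 0x1F,
        "bi": bi,
        "cr_field": bi // 4,         # which CR field BI points into
        "cr_bit": CR_BITS[bi % 4],   # which bit of that field
        "disp": disp,
    }


for raw in (
    bytes([0x41, 0x82, 0x00, 0x3C]),  # beq
    bytes([0x41, 0x9C, 0x00, 0x08]),  # blt cr7
    bytes([0x42, 0x00, 0xFF, 0x5C]),  # bdnz
):
    print(raw.hex(), bc_fields(raw))
```

The `cr_field` and `cr_bit` values are only meaningful when BO actually requests a CR test, which is not the case for bdnz.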

Also, not sure if this case was mentioned, but there are instructions with a double branch condition, such as bdnzt 4*crX+bit, <address>. The branch is taken only when both CTR is non-zero after the decrement and the condition register bit is true: PPC_BC_NZ && PPC_BC_bit.
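The double condition follows from the BO field, which independently enables the CTR test and the CR-bit test. The sketch below derives both predicates and the registers a branch would read or write from BO and BI. It is an illustration of the encoding, not Capstone's implementation; `branch_predicates` and its result keys are hypothetical. Unlike the expectations earlier in this issue, it lists the CR field under reads even when it is an explicit operand.

```python
def branch_predicates(bo: int, bi: int) -> dict:
    """Derive CTR/CR predicates and register accesses from BO and BI (sketch)."""
    tests_cr = not (bo & 0x10)  # BO bit 0 clear: the CR bit selected by BI is tested
    uses_ctr = not (bo & 0x04)  # BO bit 2 clear: CTR is decremented and tested
    result = {"reads": [], "writes": [], "ctr_pred": None, "cr_pred": None}
    if uses_ctr:
        result["reads"].append("CTR")
        result["writes"].append("CTR")
        result["ctr_pred"] = "z" if bo & 0x02 else "nz"
    if tests_cr:
        result["reads"].append(f"CR{bi // 4}")
        result["cr_pred"] = "true" if bo & 0x08 else "false"
    return result


print(branch_predicates(12, 2))   # beq: CR0 test only
print(branch_predicates(16, 0))   # bdnz: CTR test only
print(branch_predicates(8, 26))   # bdnzt 4*cr6+eq: both tests must pass
```

When both predicates are present, the branch is taken only if both hold, which matches the PPC_BC_NZ && PPC_BC_bit reading above.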

Rot127 commented 1 year ago

These are the results for the branch instructions you gave me (registers are still broken, please ignore this part):

 0  41 82 00 3c     bt  eq, 0x3c
    ID: 1681 (xxspltw)
    op_count: 2
        operands[0].type: REG = 2
        operands[0].access: READ
        operands[1].type: IMM = 0x3c
        operands[1].access: READ
    Branch:
        crX: cr0
        bi: 2
        bo: 12
        bh: 0
        pred CR-bit: eq
        hint: 0
    Groups: jump branch_relative 

 4  41 9c 00 08     bt  4*cr7+lt, 0xc
    ID: 1681 (xxspltw)
    op_count: 2
        operands[0].type: REG = 28
        operands[0].access: READ
        operands[1].type: IMM = 0xc
        operands[1].access: READ
    Branch:
        crX: cr7
        bi: 0
        bo: 12
        bh: 0
        pred CR-bit: lt
        hint: 0
    Groups: jump branch_relative 

 8  42 00 ff 5c  bdnz   
    ID: 78 (bctrl)
    Branch:
        crX: cr0
        bi: 0
        bo: 16
        bh: 0
        pred CTR: nz
        hint: 0
    Groups: jump branch_relative 

And the additional case with two predicates:

 0  41 1a 01 00     bdnzt   4*cr6+eq, 0x100
    ID: 1681 (xxspltw)
    op_count: 2
        operands[0].type: REG = 26
        operands[0].access: READ
        operands[1].type: IMM = 0x100
        operands[1].access: READ
    Branch:
        crX: cr6
        bi: 2
        bo: 8
        bh: 0
        pred CR-bit: eq
        pred CTR: nz
        hint: 0
    Groups: jump branch_relative 

Could you double check that the branch conditions are fine?

Kaltxi commented 1 year ago

This looks pretty nice, and the conditions look correct. The only remarks are that bdnz has cr0 mentioned although it doesn't touch the CR registers, and that there are no implicit operands. That said, the info in Branch is enough to get the full picture: the CR bit is absent, and the CTR predicate shows that CTR is being read.

Rot127 commented 1 year ago

bdnz has cr0 mentioned

Right. There is an if clause missing. Thanks.
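One way such a guard could look, sketched in Python rather than in Capstone's C and not taken from the PR: the CR field is reported only when BO asks for a CR-bit test. The function name `reported_cr_field` is hypothetical.

```python
def reported_cr_field(bo: int, bi: int):
    """Return the CR field index a branch tests, or None if BO skips the CR test."""
    if bo & 0x10:  # BO bit 0 set: the condition register is not consulted
        return None
    return bi // 4


print(reported_cr_field(16, 0))   # bdnz: no CR field to report
print(reported_cr_field(12, 28))  # blt cr7: field 7
```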

I'll implement the implicit registers later. The merge of the PR is planned for around mid to late July. Just so you know.

Kaltxi commented 1 year ago

Awesome, thank you for your work!