`0x00000000` decodes to `mv s0, sp`

palmer-dabbelt commented 6 years ago

I'm executing some invalid code and I see

IN: 
0x20400000:  0000              mv              s0,sp

QEMU does the right thing here (decoding it to a jump to the trap vector), but the disassembly is wrong.

michaeljclark commented 6 years ago

On 12/06/2018, at 8:58 PM, Palmer Dabbelt notifications@github.com wrote:

I'm executing some invalid code and I see

IN: 0x20400000: 0000 mv s0,sp QEMU does the right thing here (decoding it to a jump to the trap vector), but the disassembly is wrong.

That’s an interesting bug. unimp is 0x0000 and c.addi4spn is bits[15:13] == 0b000 && bits[0:1] == 0b00, so gets selected based on the mask bits we derived from riscv-opcodes (c.addi4spn 1..0=0 15..13=0). Given c.addispn4 expands to an add with a compressed register of 0 which is rd (x8=s0) and implicit sp as rs1 and zero for rs2, the instruction is lifted into a MV pseudo using the move pattern match. Very intresting. All zeros. I must have missed a non zero constraint.

The problem is the opcode decoder uses the op field mask and decodes to a valid compressed instruction, we only use positive pattern matches to transform. It’s a bug in the design of the disassembler.

I can patch this easily by changing the code but I will have to think about how to fix the disassembler generator, as this was machine generated code. I have negative constraint matching in the compression patch but not in the decompression path.

Again, if the op fields match, we get an opcode. The easy fix is to check for 0x0000 and select unimp but this may mask other subtle bugs related to the various compression negative constraints on operands fields after the opcode has been decoded.

Good catch!

michaeljclark commented 6 years ago

On 12/06/2018, at 8:58 PM, Palmer Dabbelt notifications@github.com wrote:

I'm executing some invalid code and I see

IN: 0x20400000: 0000 mv s0,sp QEMU does the right thing here (decoding it to a jump to the trap vector), but the disassembly is wrong.

Makes me wonder about the class of bugs. The Base ISA path doesn’t have any constraints on fields so assuming the opcode metadata is correct, then we will fall though to illegal instruction for anything that doesn’t match.

This would be the case if RVC was disabled (the riscv-qemu disassembler was derived from the rv8 decoder which can be parametrised by extension).

However we used the same design for the compressed decoder i.e. select compressed micro-op from the opcode metadata derived mask which is 15..13=000 and 1..0=0

In fact we could search the constraint space for reserved encodings using the constraint metadata, which we use for compression, and which I only recently fixed when I realised I’d missed some negative constraints in the spec.

So c.addispn, s0, sp, 0 must be illegal i.e. imm != 0. So we need to add a pass that runs the compression constraints after decode for negative matches on operand fields. We have the metadata to fix this generally... just we are only checking these constraints on compression not expansion.

riscvarchive / riscv-qemu

`0x00000000` decodes to `mv s0, sp` #147