[instr logging] Log the faulting instruction before fault side-effects

arichardson commented 2 years ago

Previously, we would not log any exception output if we get stuck in an infinite loop due to an ifetch fault after an exception because the log_instr_commit only happens when we execute an instruction.

To work around this, we can call qemu_log_instr_commit() before handling the exception/interrupt and inject a fake instruction to the log buffer (with size==0 and address -1). A nice side-effect of this change is that the faulting exception is now decoded according to the mode that the CPU was in before installing the trap vector. Previously a trap from capmode to a non-capmode handler would disassemble a capability-relative load as the integer one since the exception handler PCC was already installed when the logging performs the disassembly.

Instead of no output, we now print an infinite stream of the following if we get a trap while trying to jump to the exception handler:

[0:0] 0x0000000080000078:  0002c023          csc             cnull,0(ct0)
[0:0] logging exception side effects
-> Switch to Machine mode
-> Exception https://github.com/CTSRD-CHERI/qemu/issues/6 vector 0x0000000000000000 fault-addr 0x0000000080065aa8
    Write mstatus = 0000000a00001800
    Write mcause = 0000000000000006
    Write MEPCC|v:1 s:0 p:00078fff f:1 b:0000000000000000 l:ffffffffffffffff
             |o:0000000080000078 t:000000000003ffff
    Write mbadaddr = 0000000080065aa8
    Write mtval2 = 0000000000000000
    Write PCC|v:1 s:0 p:00078fff f:0 b:0000000000000000 l:ffffffffffffffff
             |o:0000000000000000 t:000000000003ffff
[0:0] logging exception side effects
-> Switch to Machine mode
-> Exception https://github.com/CTSRD-CHERI/qemu/pull/1 vector 0x0000000000000000 fault-addr 0x0000000000000000
    Write mstatus = 0000000a00001800
    Write mcause = 0000000000000001
    Write MEPCC|v:1 s:0 p:00078fff f:0 b:0000000000000000 l:ffffffffffffffff
             |o:0000000000000000 t:000000000003ffff
    Write mbadaddr = 0000000000000000
    Write mtval2 = 0000000000000000
    Write PCC|v:1 s:0 p:00078fff f:0 b:0000000000000000 l:ffffffffffffffff
             |o:0000000000000000 t:000000000003ffff
[0:0] logging exception side effects
-> Switch to Machine mode
-> Exception https://github.com/CTSRD-CHERI/qemu/pull/1 vector 0x0000000000000000 fault-addr 0x0000000000000000
    Write mstatus = 0000000a00001800
    Write mcause = 0000000000000001
    Write MEPCC|v:1 s:0 p:00078fff f:0 b:0000000000000000 l:ffffffffffffffff
             |o:0000000000000000 t:000000000003ffff
    Write mbadaddr = 0000000000000000
    Write mtval2 = 0000000000000000
    Write PCC|v:1 s:0 p:00078fff f:0 b:0000000000000000 l:ffffffffffffffff
             |o:0000000000000000 t:000000000003ffff

jrtc27 commented 2 years ago

From memory this is all rather hairy and it’s easy to end up with duplicate (or a second garbage) instruction if not careful. How extensively have you tested this in edge cases?

arichardson commented 2 years ago

From memory this is all rather hairy and it’s easy to end up with duplicate (or a second garbage) instruction if not careful. How extensively have you tested this in edge cases?

It's been a while (just found those commits in a worktree), but I think it's probably better to have duplicate output in these weird infinite loop corner cases rather than no output at all?

arichardson commented 2 years ago

From memory this is all rather hairy and it’s easy to end up with duplicate (or a second garbage) instruction if not careful. How extensively have you tested this in edge cases?

It's been a while (just found those commits in a worktree), but I think it's probably better to have duplicate output in these weird infinite loop corner cases rather than no output at all?

The case I tested was mostly trap where the trap handler cannot be invoked due to an invalid mtcc and the fault on ifetch case.

jrtc27 commented 2 years ago

If they're old commits are you sure they're still needed after my recent change? I would expect https://github.com/CTSRD-CHERI/qemu/commit/d6653f7e00763097f59a6149cca11aeae0e0a4e4 to handle that correctly...

jrtc27 commented 2 years ago

(Note that only dev has that, not qemu-cheri)

arichardson commented 2 years ago

Those do indeed look related, but I think it won't fix the ifetch MMU translate fault case? I'm also pretty sure that adding the fake instruction was rather helpful to avoid duplicates.

Will re-test and see if this is still necessary.

jrtc27 commented 2 years ago

The MMU translate comes after the PCC checks, which themselves come after the commit I added (well hoisted)

arichardson commented 2 years ago

The MMU translate comes after the PCC checks, which themselves come after the commit I added (well hoisted)

I meant the case where the exception handler target is an invalid address (but within cap bounds), so you get an error from the translator. But this has been more than 6 months so I could be misremembering.

jrtc27 commented 2 years ago

The MMU translate comes after the PCC checks, which themselves come after the commit I added (well hoisted)

I meant the case where the exception handler target is an invalid address (but within cap bounds), so you get an error from the translator. But this has been more than 6 months so I could be misremembering.

That should be fine, it still gets a new TB and starts translating, just the insn_start generates code to fault, IIRC

qwattash commented 2 years ago

I'm not entirely sure about the splitting the side effects into a separate "fake" instruction. Can't we just run the exception handler and then commit() afterwards? I also have some experimental tracing backends that assume that there are no fake instructions. I can fix them up if it is a strong requirement though, as there are currently no downstream consumers that inspect exceptions anyway. Perhaps another option could be to add the exception side-effects into the event list I have associated with trace entries in the experimental tracing branch.

CTSRD-CHERI / qemu

[instr logging] Log the faulting instruction before fault side-effects #204