riscv-software-src / riscv-isa-sim

Spike, a RISC-V ISA Simulator

HTIF gets desynchronized on RV32 targets #1293

Open xobs opened 1 year ago

xobs commented 1 year ago

HTIF is a 64-bit communication channel exposed through the tohost and fromhost memory locations. This channel is used during early boot, for syscalls, and for shutting down the simulator.

Operations are performed by writing 64-bit values to the tohost address. For example, writing 0x0101_0000_0000_0041 will send the character A (0x41) to the serial port.
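
To make the example value easier to read, here is a minimal sketch of how such a word can be packed and unpacked. It assumes the field layout implied by the example (device in bits 63..56, command in bits 55..48, payload in bits 47..0). The names HtifCommand, htif_encode and htif_decode are hypothetical and are not taken from the simulator source.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout, inferred from the example value in the text:
// bits 63..56 = device, bits 55..48 = command, bits 47..0 = payload.
struct HtifCommand {
  uint8_t device;
  uint8_t cmd;
  uint64_t payload;  // only the low 48 bits are used
};

constexpr uint64_t kPayloadMask = (uint64_t{1} << 48) - 1;

// Pack the three fields into one 64-bit tohost word.
inline uint64_t htif_encode(const HtifCommand& c) {
  assert((c.payload & ~kPayloadMask) == 0 && "payload must fit in 48 bits");
  return (uint64_t{c.device} << 56) | (uint64_t{c.cmd} << 48) | c.payload;
}

// Split a 64-bit tohost word back into its fields.
inline HtifCommand htif_decode(uint64_t word) {
  return HtifCommand{static_cast<uint8_t>(word >> 56),
                     static_cast<uint8_t>((word >> 48) & 0xff),
                     word & kPayloadMask};
}
```

With this layout, htif_encode({1, 1, 0x41}) yields 0x0101_0000_0000_0041, the value used in the example above.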

The simulator operates by executing 5000 instructions at a time and then checking whether the value stored at the tohost address is nonzero. If it is, it captures the value, zeroes out tohost, and processes the command:

https://github.com/riscv-software-src/riscv-isa-sim/blob/8983efd14694e57121d6855283432f7a5b775b50/fesvr/htif.cc#L265-L266

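On RV64 platforms, the target writes tohost with a single 64-bit store, so the host always sees a complete value. On RV32 platforms, however, the write takes two sw instructions and is therefore non-atomic: the INTERLEAVE period can expire after the lower sw instruction but before the upper sw instruction, and the host then consumes a half-written command.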
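
The race can be reproduced in a toy model. This is only a sketch: Mailbox, poll and torn_write_demo are hypothetical names, and the model assumes the target stores the lower word first, as described above. It does not use any simulator code.

```cpp
#include <cstdint>

// Toy model of the 64-bit tohost mailbox shared by target and host.
struct Mailbox {
  uint64_t tohost = 0;

  // RV32 target side: a 64-bit store is split into two 32-bit stores.
  void store_low(uint32_t v) {
    tohost = (tohost & 0xffffffff00000000ULL) | v;
  }
  void store_high(uint32_t v) {
    tohost = (tohost & 0x00000000ffffffffULL) | (uint64_t{v} << 32);
  }

  // Host side: read the mailbox and clear it if it is nonzero.
  uint64_t poll() {
    uint64_t seen = tohost;
    if (seen != 0) tohost = 0;
    return seen;
  }
};

struct TornResult {
  uint64_t first;   // what the host sees between the two stores
  uint64_t second;  // what the host sees after the second store
};

// The host polls between the two halves of a single console write.
inline TornResult torn_write_demo() {
  Mailbox m;
  const uint64_t command = 0x0101000000000041ULL;
  m.store_low(static_cast<uint32_t>(command));         // lower sw retires
  uint64_t first = m.poll();                           // interleave expires here
  m.store_high(static_cast<uint32_t>(command >> 32));  // upper sw retires
  uint64_t second = m.poll();
  return {first, second};
}
```

In this model the host first consumes 0x41 and then 0x0101_0000_0000_0000, so neither value it processes is the command the target meant to send.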

A workaround I've implemented is to consume the value only if any of the upper 32 bits are set or the whole value is 1 (indicating exit):

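if ((tohost = from_target(mem.read_uint64(tohost_addr))) != 0) {
  // Accept the value only if it is the exit code or the upper word has landed.
  if ((tohost == 1) || ((tohost & 0xffffffff00000000) != 0)) {
    mem.write_uint64(tohost_addr, target_endian<uint64_t>::zero);
  } else {
    // Possibly a half-written command: leave it in memory and retry later.
    tohost = 0;
  }
}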
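
The effect of this filter can be checked in isolation with a small sketch. Mailbox, poll_filtered and workaround_demo are hypothetical names used only for illustration, and the model again assumes the lower word is stored before the upper word.

```cpp
#include <cstdint>

// Hypothetical stand-in for the simulator's view of target memory.
struct Mailbox {
  uint64_t tohost = 0;
};

// Host poll with the workaround's filter: consume the value only if it is
// the exit code 1 or has any of the upper 32 bits set. Otherwise leave it
// in place and report "nothing to do" so a later poll can retry.
inline uint64_t poll_filtered(Mailbox& m) {
  uint64_t seen = m.tohost;
  if (seen == 0) return 0;
  if (seen == 1 || (seen & 0xffffffff00000000ULL) != 0) {
    m.tohost = 0;
    return seen;
  }
  return 0;
}

// Replay the torn write: poll once between the two stores, once after.
inline bool workaround_demo() {
  Mailbox m;
  m.tohost = 0x41;  // only the lower sw has retired
  bool deferred = poll_filtered(m) == 0 && m.tohost == 0x41;
  m.tohost |= 0x0101000000000000ULL;  // the upper sw retires
  bool delivered =
      poll_filtered(m) == 0x0101000000000041ULL && m.tohost == 0;
  return deferred && delivered;
}
```

As far as I can tell from the condition itself, a value other than 1 whose upper 32 bits are all zero is never consumed by this filter, so it only helps for commands that set at least one upper bit.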

xobs commented 1 year ago

An example of a crash that occurs without this patch:

16-bit Opcode at PC 800006a2 d62a -> 16-bit opcode at PC 800006a4: d82e
16-bit Opcode at PC 800006a4 d82e -> 16-bit opcode at PC 800006a6: da2a
16-bit Opcode at PC 800006a6 da2a -> 16-bit opcode at PC 800006a8: dc2e
16-bit Opcode at PC 800006a8 dc2e -> 32-bit opcode at PC 800006aa: 00b62023
32-bit Opcode at PC 800006aa 00b62023 -> 16-bit opcode at PC 800006ae: de2a
Access exception occurred while host was accessing memory on behalf of target (tohost = 0x46):
Memory address 0x40 is invalid
$