OpenXiangShan / NEMU

Other
236 stars 90 forks source link

difftest-as-dut: allow ref model to advance one more step on mismatch #407

Closed shinezyy closed 3 months ago

shinezyy commented 3 months ago
shinezyy commented 3 months ago

Current implementation will corrupt the site of CPU_state during second checking

shinezyy commented 3 months ago

Current implementation will corrupt the site of CPU_state during second checking

resolved

shinezyy commented 3 months ago

@cebarobot @poemonsense

Can we do regression on difftest for bugs detection? Because current CI only ensure `` no difference ''.

If I mistakenly modify NEMU to never report difference in difftest, it will also pass current CI.

cebarobot commented 3 months ago

I think that is useful, but how to do that?

poemonsense commented 3 months ago

I know few about NEMU as DUT.

Generally, we need a failure testcase to test the functionality?

shinezyy commented 3 months ago

I know few about NEMU as DUT.

Generally, we need a failure testcase to test the functionality?

Yes, for example, use a buggy Spike as ref, and check output.

I am thinking about this because I am afraid that this patch (or similar patches) silently breaks NEMU as DUT (never report difference) but passes CI

poemonsense commented 3 months ago

I know few about NEMU as DUT. Generally, we need a failure testcase to test the functionality?

Yes, for example, use a buggy Spike as ref, and check output.

I am thinking about this because I am afraid that this patch (or similar patches) silently breaks NEMU as DUT (never report difference) but passes CI

This feature looks important. The difftest repo also lacks this CI feature.

1) checking the failure can be done by checking the return code of a command. GitHub CI is internally a bash script. I think it can be done by like https://stackoverflow.com/questions/26675681/how-to-check-the-exit-status-using-an-if-statement I never tried this but it probably works.

false || exit_code=$?
if [[ ${exit_code} -ne 0 ]]; then echo ${exit_code}; fi

2) To inject a bug into Spike/NEMU, maybe we can add a probablistic return code of nonzero for the isa_difftest_checkregs function?

poemonsense commented 3 months ago

I know few about NEMU as DUT. Generally, we need a failure testcase to test the functionality?

Yes, for example, use a buggy Spike as ref, and check output. I am thinking about this because I am afraid that this patch (or similar patches) silently breaks NEMU as DUT (never report difference) but passes CI

This feature looks important. The difftest repo also lacks this CI feature.

  1. checking the failure can be done by checking the return code of a command. GitHub CI is internally a bash script. I think it can be done by like https://stackoverflow.com/questions/26675681/how-to-check-the-exit-status-using-an-if-statement I never tried this but it probably works.
false || exit_code=$?
if [[ ${exit_code} -ne 0 ]]; then echo ${exit_code}; fi
  1. To inject a bug into Spike/NEMU, maybe we can add a probablistic return code of nonzero for the isa_difftest_checkregs function?

I can have a try for these in the difftest repo as well. We may need to add the similar feature to other repos. It's really important

shinezyy commented 3 months ago

Because my CI guard for NEMU as DUT depends on misa.rvb, I will add it in RVB patch

NewPaulWalker commented 3 months ago
  • When trapping on exception, occasionally, NEMU has executed csrrw tp,mscratch,tp, but Spike has not. After advancing Spike for one more step, Spike's architectural state will match with NEMU again.

This is actually because when certain exceptions occur, NEMU does not let Spike continue executing the instruction that caused the exception. As a result, Spike falls one instruction behind NEMU, leading to a mismatch. After letting Spike execute one more instruction, they match again. I think all exceptions (excluding interrupts) are caused by the instructions themselves, so I decided that whenever an exception occurs, NEMU should let Spike execute one instruction, which is the instruction that cause the exception. https://github.com/OpenXiangShan/NEMU/blob/master/src/isa/riscv64/system/intr.c#L71

poemonsense commented 3 months ago

I think all exceptions (excluding interrupts) are caused by the instructions themselves, so I decided that whenever an exception occurs, NEMU should let Spike execute one instruction, which is the instruction that cause the exception.

This is how difftest should work. Upon exception/interrupt, should tick one step.