lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0
2.57k stars 770 forks source link

[otbn, dv] OTBN regression failure triage with CDC instrumentation #19045

Open GregAC opened 1 year ago

GregAC commented 1 year ago

I've done a triage pass of failures seen in the OTBN regression when CDC instrumentation is enabled. To do this I compared the failures with the instrumentation on against the failures from the latest nightly (which has instrumentation off) and looked at those signatures that were only seen in tests with the instrumentation on. I also looked at tests that had significantly higher failure rates with instrumentation on but were failing with signatures seen in those tests with instrumentation off.

A full nightly was run against commit d941d7e8b7d25a30b41c714c4c448c87a2853551 modified to enable CDC instrumentation for OTBN by modifying:

https://github.com/lowRISC/opentitan/blob/3db2d45d97475e28e63ecedbe824a0eee108f523/hw/ip/otbn/dv/uvm/otbn_sim_cfg.hjson#L74

All bar one failure are verification environment issues that aren't a concern for M2.5.2. One failure could a be serious CDC issue (https://github.com/lowRISC/opentitan/issues/19044).

Other than #19044 the failures here can be addressed in V3.

Full details of failures triaged and conclusions are below:

New signatures from CDC enablement:

UVM_FATAL (otbn_model_if.sv:77) [otbn_model_if] Check failed (u_model.otbn_model_step_crc(handle, item, crc_state) == *) Failed to update CRC has 2 failures:

Test otbn_escalate has 2 failures.
32.otbn_escalate.4215586503
Line 113, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/32.otbn_escalate/latest/run.log

  UVM_FATAL @   2373129 ps: (otbn_model_if.sv:77) [otbn_model_if] Check failed (u_model.otbn_model_step_crc(handle, item, crc_state) == 0) Failed to update CRC
  UVM_INFO @   2373129 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
36.otbn_escalate.2699923888
Line 113, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/36.otbn_escalate/latest/run.log

  UVM_FATAL @   2060914 ps: (otbn_model_if.sv:77) [otbn_model_if] Check failed (u_model.otbn_model_step_crc(handle, item, crc_state) == 0) Failed to update CRC
  UVM_INFO @   2060914 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
UVM_FATAL (otbn_base_vseq.sv:582) [otbn_sw_errs_fatal_chk_vseq] Check failed (!(cfg.loop_vif.loop_stack_push || cfg.loop_vif.loop_stack_pop) && !cfg.trace_vif.locking_o) has 1 failures:

Test otbn_sw_errs_fatal_chk has 1 failures.
0.otbn_sw_errs_fatal_chk.1801271138
Line 123, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/0.otbn_sw_errs_fatal_chk/latest/run.log

  UVM_FATAL @  88113688 ps: (otbn_base_vseq.sv:582) [uvm_test_top.env.virtual_sequencer.otbn_sw_errs_fatal_chk_vseq] Check failed (!(cfg.loop_vif.loop_stack_push || cfg.loop_vif.loop_stack_pop) &&                       !cfg.trace_vif.locking_o)
  UVM_INFO @  88113688 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
UVM_FATAL (otbn_base_vseq.sv:582) [otbn_sec_wipe_err_vseq] Check failed (!(cfg.loop_vif.loop_stack_push || cfg.loop_vif.loop_stack_pop) && !cfg.trace_vif.locking_o) has 1 failures:

Test otbn_sec_wipe_err has 1 failures.
1.otbn_sec_wipe_err.1382000636
Line 93, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/1.otbn_sec_wipe_err/latest/run.log

  UVM_FATAL @  25973747 ps: (otbn_base_vseq.sv:582) [uvm_test_top.env.virtual_sequencer.otbn_sec_wipe_err_vseq] Check failed (!(cfg.loop_vif.loop_stack_push || cfg.loop_vif.loop_stack_pop) &&                       !cfg.trace_vif.locking_o)
  UVM_INFO @  25973747 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
UVM_FATAL (otbn_base_vseq.sv:369) [otbn_urnd_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution has 1 failures:

Test otbn_urnd_err has 1 failures.
1.otbn_urnd_err.2713703608
Line 100, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/1.otbn_urnd_err/latest/run.log

  UVM_FATAL @  17274033 ps: (otbn_base_vseq.sv:369) [uvm_test_top.env.virtual_sequencer.otbn_urnd_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution
  UVM_INFO @  17274033 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
UVM_FATAL (otbn_stack_addr_integ_chk_vseq.sv:72) [otbn_stack_addr_integ_chk_vseq] timeout occurred! has 1 failures:

Test otbn_stack_addr_integ_chk has 1 failures.
1.otbn_stack_addr_integ_chk.1064076725
Line 93, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/1.otbn_stack_addr_integ_chk/latest/run.log

  UVM_FATAL @ 10008294145 ps: (otbn_stack_addr_integ_chk_vseq.sv:72) [uvm_test_top.env.virtual_sequencer.otbn_stack_addr_integ_chk_vseq] timeout occurred!
  UVM_INFO @ 10008294145 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
UVM_FATAL (otbn_base_vseq.sv:369) [otbn_alu_bignum_mod_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution has 1 failures:

Test otbn_alu_bignum_mod_err has 1 failures.
2.otbn_alu_bignum_mod_err.2866351614
Line 100, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/2.otbn_alu_bignum_mod_err/latest/run.log

  UVM_FATAL @   9988012 ps: (otbn_base_vseq.sv:369) [uvm_test_top.env.virtual_sequencer.otbn_alu_bignum_mod_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution
  UVM_INFO @   9988012 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
UVM_FATAL (otbn_stack_addr_integ_chk_vseq.sv:58) [otbn_stack_addr_integ_chk_vseq] timeout occurred! has 1 failures:

Test otbn_stack_addr_integ_chk has 1 failures.
3.otbn_stack_addr_integ_chk.182425094
Line 93, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/3.otbn_stack_addr_integ_chk/latest/run.log

  UVM_FATAL @ 10014806449 ps: (otbn_stack_addr_integ_chk_vseq.sv:58) [uvm_test_top.env.virtual_sequencer.otbn_stack_addr_integ_chk_vseq] timeout occurred!
  UVM_INFO @ 10014806449 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
UVM_FATAL (otbn_base_vseq.sv:348) [otbn_imem_err_vseq] Check failed (!cfg.under_reset) has 1 failures:

Test otbn_stress_all_with_rand_reset has 1 failures.
5.otbn_stress_all_with_rand_reset.1133880645
Line 135, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/5.otbn_stress_all_with_rand_reset/latest/run.log

  UVM_FATAL @  23098619 ps: (otbn_base_vseq.sv:348) [uvm_test_top.env.virtual_sequencer.otbn_imem_err_vseq] Check failed (!cfg.under_reset)
  UVM_INFO @  23098619 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
UVM_FATAL (otbn_base_vseq.sv:369) [otbn_rf_bignum_intg_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution has 1 failures:

Test otbn_rf_bignum_intg_err has 1 failures.
7.otbn_rf_bignum_intg_err.577101146
Line 93, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/7.otbn_rf_bignum_intg_err/latest/run.log

  UVM_FATAL @   8710818 ps: (otbn_base_vseq.sv:369) [uvm_test_top.env.virtual_sequencer.otbn_rf_bignum_intg_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution
  UVM_INFO @   8710818 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
xmsim: *E,ASRTST (/home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/default/src/lowrisc_dv_otbn_sva_*/otbn_idle_checker.sv,87): Assertion IdleIfStart_A has failed has 1 failures:

Test otbn_escalate has 1 failures.
52.otbn_escalate.2705017903
Line 94, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/52.otbn_escalate/latest/run.log

  xmsim: *E,ASRTST (/home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/default/src/lowrisc_dv_otbn_sva_0.1/otbn_idle_checker.sv,87): (time 13176420 PS) Assertion tb.dut.idle_checker.IdleIfStart_A has failed
  UVM_ERROR @  13176420 ps: (otbn_idle_checker.sv:87) [ASSERT FAILED] IdleIfStart_A
  UVM_INFO @  13176420 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---

Signatures seen in nightly but more frequent with CDC on:

UVM_FATAL (otbn_base_vseq.sv:369) [otbn_dmem_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution has 5 failures:

Test otbn_dmem_err has 4 failures.
0.otbn_dmem_err.2760602419
Line 103, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/0.otbn_dmem_err/latest/run.log

  UVM_FATAL @  12994571 ps: (otbn_base_vseq.sv:369) [uvm_test_top.env.virtual_sequencer.otbn_dmem_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution
  UVM_INFO @  12994571 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
2.otbn_dmem_err.2590234280
Line 93, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/2.otbn_dmem_err/latest/run.log

  UVM_FATAL @   7469835 ps: (otbn_base_vseq.sv:369) [uvm_test_top.env.virtual_sequencer.otbn_dmem_err_vseq] Check failed (cfg.model_agent_cfg.vif.status == otbn_pkg::StatusIdle) Timed out waiting for OTBN to be idle before execution
  UVM_INFO @   7469835 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
... and 2 more failures.
xmsim: *E,ASRTST (/home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/default/src/lowrisc_dv_otbn_sim_*/tb.sv,270): Assertion MatchingStatus_A has failed has 20 failures:

Test otbn_escalate has 18 failures.
1.otbn_escalate.4171015369
Line 113, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/1.otbn_escalate/latest/run.log

  xmsim: *E,ASRTST (/home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/default/src/lowrisc_dv_otbn_sim_0.1/tb.sv,270): (time 5045702 PS) Assertion tb.MatchingStatus_A has failed
  xmsim: *E,ASRTST (/home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/default/src/lowrisc_dv_otbn_model_agent_0.1/otbn_model_if.sv,137): (time 5045702 PS) Assertion tb.model_if.NoModelErrs has failed
  UVM_ERROR @   5045702 ps: (tb.sv:270) [ASSERT FAILED] MatchingStatus_A
  UVM_INFO @   5045702 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
3.otbn_escalate.1417624640
Line 93, in log /home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/3.otbn_escalate/latest/run.log

  xmsim: *E,ASRTST (/home/greg/work/opentitan_area/opentitan/scratch/master/otbn-sim-xcelium/default/src/lowrisc_dv_otbn_sim_0.1/tb.sv,270): (time 2205650 PS) Assertion tb.MatchingStatus_A has failed
  UVM_ERROR @   2205650 ps: (tb.sv:270) [ASSERT FAILED] MatchingStatus_A
  UVM_INFO @   2205650 ps: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
... and 16 more failures.
rswarbrick commented 4 months ago

This is definitely needed before we can claim V3 closure but probably isn't the highest priority for the M5 milestone. Bumping to M7 accordingly.