lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0

[test-triage] chip_sw_rv_core_ibex_lockstep_glitch #18062

Open engdoreis opened 1 year ago

engdoreis commented 1 year ago

Hierarchy of regression failure

Block level

Failure Description

Offending 'pend_req[d2h.d_source].pend'
  UVM_ERROR @ 1979.151724 us: (tlul_assert.sv:256) [ASSERT FAILED] respMustHaveReq_M
  UVM_INFO @ 1979.151724 us: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---

Steps to Reproduce

Tests with similar or related failures

johngt commented 1 year ago

@vogelpi - you previously worked on lockstep_glitch. Depending on your workload, we can look at assigning this to someone else, but I think you are best placed to look at this for now.

johngt commented 1 year ago

Based on yesterday's meeting, I've changed the assignee and lowered the priority.

johngt commented 1 year ago

Known flaky test @moidx @msfschaffner @GregAC - suggest keeping this but marking as a V3 item

nbdd0121 commented 4 months ago

This is failing again in the latest DV run.

DV report: https://reports.opentitan.org/hw/top_earlgrey/dv/latest/report.html

0.chip_sw_rv_core_ibex_lockstep_glitch.67815085953951491321756871088796420206524631258624933942405568744189784932439
    Line 785, in log /container/opentitan-public/scratch/os_regression/chip_earlgrey_asic-sim-vcs/0.chip_sw_rv_core_ibex_lockstep_glitch/latest/run.log

          Offending '(d2h.d_opcode === ((pend_req[d2h.d_source].opcode == Get) ? AccessAckData : AccessAck))'
      UVM_ERROR @ 3053.637184 us: (tlul_assert.sv:258) [ASSERT FAILED] respOpcode_M
      UVM_INFO @ 3053.637184 us: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
      --- UVM Report catcher Summary ---

It looks like it is worth investigating.
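
For readers less familiar with the two TL-UL assertions quoted in this thread, here is a minimal Python sketch of what they appear to check, based only on the "Offending" expressions in the logs above: `respMustHaveReq_M` fires when a response arrives for a source ID with no pending request, and `respOpcode_M` fires when the response opcode does not match the pending request (`AccessAckData` for `Get`, `AccessAck` otherwise). The class and method names are hypothetical, and the real checks live in `tlul_assert.sv` and may differ in detail.

```python
from enum import Enum


class AOpcode(Enum):
    """Host-to-device request opcodes (TL-UL subset)."""
    PutFullData = 0
    PutPartialData = 1
    Get = 4


class DOpcode(Enum):
    """Device-to-host response opcodes (TL-UL subset)."""
    AccessAck = 0
    AccessAckData = 1


class TlulResponseChecker:
    """Hypothetical software model of the two assertions seen in the logs."""

    def __init__(self):
        # Maps source ID -> opcode of the outstanding request (pend_req in the log).
        self.pending = {}

    def request(self, source, opcode):
        self.pending[source] = opcode

    def response(self, source, d_opcode):
        """Return the names of the assertions this response would violate."""
        if source not in self.pending:
            # Response without a matching outstanding request.
            return ["respMustHaveReq_M"]
        req_opcode = self.pending.pop(source)
        expected = (DOpcode.AccessAckData if req_opcode is AOpcode.Get
                    else DOpcode.AccessAck)
        if d_opcode is not expected:
            return ["respOpcode_M"]
        return []
```

Under this model, a glitch that corrupts the source ID or opcode of a response could produce either failure, which would be consistent with the two different assertion names reported above.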

vogelpi commented 2 months ago

This test mostly passes, but a failure pops up here and there. We shouldn't worry about this: modeling the behavior of the lockstep core under fault injection (FI) is a hard problem, especially if the faults are introduced into the ICACHE. It may be that errors never get detected because, for example, the core just doesn't load the glitched line. We shouldn't invest any effort into debugging this right now.

andreaskurth commented 2 months ago

Just discussed in triage: we agree with the assessment above and are moving this to M7.

jwnrt commented 1 month ago

This is still failing, but the failing check is different:

  UVM_FATAL @ 2422.875812 us: (chip_sw_rv_core_ibex_lockstep_glitch_vseq.sv:714) [uvm_test_top.env.virtual_sequencer.chip_sw_rv_core_ibex_lockstep_glitch_vseq] Check failed alert_major_internal == exp_alert_major_internal (0 [0x0] vs 1 [0x1]) Major alert did not match expectation.
  UVM_INFO @ 2422.875812 us: (uvm_report_catcher.svh:705) [UVM/REPORT/CATCHER]
  --- UVM Report catcher Summary ---
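
Since this thread has now collected three distinct failure signatures for the same test, a small log-scanning helper may make it easier to compare runs. The sketch below is only an illustration: it assumes the `UVM_ERROR`/`UVM_FATAL` line layout seen in the excerpts above, and the function name `failure_signatures` is hypothetical rather than part of any OpenTitan tooling.

```python
import re

# Matches lines such as:
#   UVM_ERROR @ 1979.151724 us: (tlul_assert.sv:256) [ASSERT FAILED] respMustHaveReq_M
UVM_LINE = re.compile(
    r"(UVM_ERROR|UVM_FATAL) @ ([\d.]+) us: \(([^:()]+):(\d+)\) (.*)"
)


def failure_signatures(log_text):
    """Return (severity, 'file:line') for each UVM error or fatal line."""
    signatures = []
    for line in log_text.splitlines():
        match = UVM_LINE.search(line)
        if match:
            severity, _time_us, src_file, src_line, _message = match.groups()
            signatures.append((severity, f"{src_file}:{src_line}"))
    return signatures
```

Grouping runs by the returned `(severity, file:line)` pairs would show how often each of the three signatures occurs, without having to read every `run.log` by hand.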