lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0

[test-triage] chip_sw_lc_walkthrough_dev chip_sw_lc_walkthrough_prod chip_sw_lc_walkthrough_rma #14389

Closed johngt closed 2 years ago

johngt commented 2 years ago

Hierarchy of regression failure

Chip Level

Failure Description

Job chip_earlgrey_asic-sim-vcs_run_default killed due to: Exit reason: User job exceeded runlimit: User job timed out

These are new tests introduced on August 16, so they are probably known issues; just tagging for completeness.

Test chip_sw_lc_walkthrough_dev has 1 failures.

0.chip_sw_lc_walkthrough_dev.492655461
Log /container/opentitan-public/scratch/os_regression/chip_earlgrey_asic-sim-vcs/0.chip_sw_lc_walkthrough_dev/out/run.log

  Job ID: smart:075ee8df-83e2-4bc8-9f5c-cd0016357bab
Test chip_sw_lc_walkthrough_prod has 1 failures.

0.chip_sw_lc_walkthrough_prod.3471619446
Log /container/opentitan-public/scratch/os_regression/chip_earlgrey_asic-sim-vcs/0.chip_sw_lc_walkthrough_prod/out/run.log

  Job ID: smart:28fed6e9-4457-46c2-8466-439e4b4943d2
Test chip_sw_lc_walkthrough_rma has 1 failures.

0.chip_sw_lc_walkthrough_rma.3744925011
Log /container/opentitan-public/scratch/os_regression/chip_earlgrey_asic-sim-vcs/0.chip_sw_lc_walkthrough_rma/out/run.log

  Job ID: smart:a4a402a8-8937-4ffb-b6a3-77382c76c4f4

Steps to Reproduce

Occurs on the Aug 16 and Aug 17 runs.

August 17 2022 07:06:16 UTC

cab5ffb90 Build seed: 3773991536 VCS
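
The failing run names listed above appear to follow an `<index>.<test name>.<seed>` layout. That layout is an assumption read off the entries in this issue, not a documented format. A minimal sketch, with the hypothetical helper `parse_run_id`, that splits such a name so the same test and seed can be passed to a re-run:

```python
from typing import NamedTuple


class RunId(NamedTuple):
    index: int  # run index within the regression (assumed meaning)
    test: str   # test name, e.g. chip_sw_lc_walkthrough_dev
    seed: int   # per-run seed (assumed meaning)


def parse_run_id(run_id: str) -> RunId:
    """Split '<index>.<test>.<seed>' into its parts.

    The layout is an assumption based on the entries in this issue.
    """
    # Split from both ends so the test name may itself contain dots.
    index, rest = run_id.split(".", 1)
    test, seed = rest.rsplit(".", 1)
    return RunId(int(index), test, int(seed))


if __name__ == "__main__":
    for name in (
        "0.chip_sw_lc_walkthrough_dev.492655461",
        "0.chip_sw_lc_walkthrough_prod.3471619446",
        "0.chip_sw_lc_walkthrough_rma.3744925011",
    ):
        print(parse_run_id(name))
```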

Tests with similar or related failures

cindychip commented 2 years ago

These tests need more time because they include an RMA transition. I will add a fix soon. Thanks for adding the issue.

tjaychen commented 2 years ago

These tests appear to still be failing as of 2022-08-22, so they will need to be re-triaged.

tjaychen commented 2 years ago

assigning for triage.

tjaychen commented 2 years ago

There appear to be two layered failures.

  1. The first is that the test takes too long because of the flash RMA wipe (it takes more than 100 ms without specific run-time options).
  2. The second is that OTBN's ImemRDataBysEnabkedWhenNoCoreAccess assertion fires because its condition does not account for an ongoing secure wipe. @andreaskurth @vogelpi

The second issue might already be known.
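
To illustrate the second failure in general terms: an implication-style check can fire spuriously when its precondition leaves out a state in which the checked behavior legitimately differs. The sketch below is a toy Python model, not the OTBN RTL or the actual assertion. The signal names (`core_access`, `bus_rdata_enabled`, `secure_wipe`), the property's meaning, and the gating shown are all assumptions for illustration only.

```python
def check_fires(core_access: bool,
                bus_rdata_enabled: bool,
                secure_wipe: bool,
                gate_on_wipe: bool) -> bool:
    """Return True if the hypothetical check fires.

    Modeled property (assumed): "when the core is not accessing the
    memory, bus read data is enabled".
    """
    # Precondition of the implication.
    precondition = not core_access
    if gate_on_wipe:
        # Gated variant: do not evaluate the check during a secure wipe.
        precondition = precondition and not secure_wipe
    # The check fires when the precondition holds but the consequence fails.
    return precondition and not bus_rdata_enabled


if __name__ == "__main__":
    # Assumed scenario: secure wipe in progress, no core access,
    # bus read data held disabled.
    print(check_fires(False, False, True, gate_on_wipe=False))  # ungated
    print(check_fires(False, False, True, gate_on_wipe=True))   # gated
```

In this toy model the ungated check fires during the wipe, while the gated one does not and still catches a disabled bus outside of a wipe.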

andreaskurth commented 2 years ago

OTBN's ImemRDataBysEnabkedWhenNoCoreAccess fires because the assertion condition does not account for an ongoing secure wipe.

Indeed, thanks for pointing this out, @tjaychen. Let's track this in #14602.

tjaychen commented 2 years ago

Needs #14606 to also be merged before this is completely fixed.

tjaychen commented 2 years ago

un-assigning for next on-call to confirm merge and close.

tjaychen commented 2 years ago

the flash_rma_unlocked test is also failing because of the same assertion, so #14606 should fix a lot of things.