Closed johngt closed 1 year ago
This looks like a package / tarball issue, which may mean a file failed to download completely. Since this is currently a software task, assigning to @engdoreis.
Command to reproduce.
./util/dvsim/dvsim.py hw/top_earlgrey/dv/chip_sim_cfg.hjson -w -i chip_sw_all_escalation_resets
A quick update: I discovered that an exception occurs inside this function, when the IP under test is flash_ctrl, just before the SV code triggers the fault.
/**
 * Logs `log` and the values that follow in an efficient, DV-testbench
 * specific way, which bypasses the UART.
 *
 * @param log a pointer to log data to log. Note that this pointer is likely to
 *        be invalid at runtime, since the pointed-to data will have been
 *        stripped from the binary.
 * @param nargs the number of arguments passed to the format string.
 * @param ... format parameters matching the format string.
 */
void base_log_internal_dv(const log_fields_t *log, uint32_t nargs, ...) {
  mmio_region_t log_device = mmio_region_from_addr(kDeviceLogBypassUartAddress);
  mmio_region_write32(log_device, 0x0, (uintptr_t)log);
  va_list args;
  va_start(args, nargs);
  for (int i = 0; i < nargs; ++i) {
    mmio_region_write32(log_device, 0x0, va_arg(args, uint32_t));
  }
  va_end(args);
}
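To make the write pattern of the function above easier to reason about, here is a minimal host-side sketch. It is not the OpenTitan implementation: the names `mock_log_device_write32`, `mock_log_internal_dv`, `mock_log_words`, and `kMockLogCapacity` are hypothetical, the memory-mapped log device is replaced by a plain array, and the log pointer is replaced by a 32-bit identifier so the sketch also builds on a 64-bit host. It only illustrates that one call produces one word identifying the log entry followed by `nargs` argument words.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for the write-only log device: every write is
// appended to an array so a host test can inspect the sequence.
enum { kMockLogCapacity = 16 };
static uint32_t mock_log_words[kMockLogCapacity];
static size_t mock_log_count = 0;

static void mock_log_device_write32(uint32_t value) {
  assert(mock_log_count < kMockLogCapacity);
  mock_log_words[mock_log_count++] = value;
}

// Same shape as base_log_internal_dv(): first a word identifying the log
// entry, then one word per variadic argument. Callers must pass exactly
// `nargs` arguments of type uint32_t.
static void mock_log_internal_dv(uint32_t log_id, uint32_t nargs, ...) {
  mock_log_device_write32(log_id);
  va_list args;
  va_start(args, nargs);
  for (uint32_t i = 0; i < nargs; ++i) {
    mock_log_device_write32(va_arg(args, uint32_t));
  }
  va_end(args);
}
```

A call with two arguments therefore results in three device writes, all of which must complete before the function returns to its caller.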
@luismarques for visibility
This test does the following for several IPs: the .c side signals to the dv.sv side that it is ready for a fault injection and then executes a wfi.
The test fails only for the flash_ctrl IP. After a deep investigation I discovered that Ibex raises an exception with cause=01 (instruction access fault) at the instruction addi sp,sp,32, exactly when the flash_ctrl fault is injected.
After discussion with @GregAC we came to the conclusion that the flash is blocking instructions from being fetched due to the fault.
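For readers less familiar with the cause value quoted above, the following sketch decodes a 32-bit RISC-V `mcause` value. The exception codes follow the RISC-V privileged specification; the helper names (`mcause_is_interrupt`, `mcause_code`, `mcause_exception_name`, `decode_reported_cause_example`) are hypothetical and are not part of the OpenTitan code base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Synchronous exception codes from the RISC-V privileged specification
// (valid when the interrupt bit of mcause is clear).
typedef enum {
  kExcInstrAddrMisaligned = 0,
  kExcInstrAccessFault = 1,
  kExcIllegalInstr = 2,
  kExcBreakpoint = 3,
  kExcLoadAddrMisaligned = 4,
  kExcLoadAccessFault = 5,
  kExcStoreAddrMisaligned = 6,
  kExcStoreAccessFault = 7,
} exc_code_t;

// On RV32 the top bit of mcause distinguishes interrupts from exceptions.
static bool mcause_is_interrupt(uint32_t mcause) {
  return (mcause >> 31) != 0;
}

// The remaining bits hold the exception (or interrupt) code.
static uint32_t mcause_code(uint32_t mcause) {
  return mcause & 0x7fffffffu;
}

static const char *mcause_exception_name(uint32_t mcause) {
  if (mcause_is_interrupt(mcause)) {
    return "interrupt";
  }
  switch (mcause_code(mcause)) {
    case kExcInstrAddrMisaligned: return "instruction address misaligned";
    case kExcInstrAccessFault:    return "instruction access fault";
    case kExcIllegalInstr:        return "illegal instruction";
    case kExcBreakpoint:          return "breakpoint";
    case kExcLoadAddrMisaligned:  return "load address misaligned";
    case kExcLoadAccessFault:     return "load access fault";
    case kExcStoreAddrMisaligned: return "store/AMO address misaligned";
    case kExcStoreAccessFault:    return "store/AMO access fault";
    default:                      return "other";
  }
}

// Usage example: the cause value 0x01 reported in this thread.
static void decode_reported_cause_example(void) {
  assert(!mcause_is_interrupt(0x01u));
  assert(mcause_code(0x01u) == kExcInstrAccessFault);
}
```

A cause of 0x01 is a synchronous exception raised on the instruction fetch path, which is consistent with the conclusion that the fetch of `addi sp,sp,32` was blocked rather than the instruction itself being at fault.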
Here are some suggested approaches to tackle this issue:
@tjaychen @matutem Please let me know your thoughts.
@engdoreis did you investigate this error message? The build error is gone now.
UVM_ERROR @ 8760.268292 us: (sw_logger_if.sv:522) [all_escalation_resets_test_prog_sim_dv(w/device/tests/sim_dv/all_escalation_resets_test.c:915)] CHECK-fail: Expected at least one regular interrupt
Yes, exactly this error.
Thanks @engdoreis. Let me reassign to @matutem to fix it.
@tjaychen and @matutem discussed a couple of options:
The first option does not address all issues since the test also records whether the regular interrupt and NMI were received, and that is recorded in flash. Also, the ISRs check the alert is recorded correctly.
Option 2 has the advantage that the alert crash dump provides extra confirmation that the alert is correct, so it benefits all cases and reduces the need for the ISRs to have run. Note, however, that for all errors except those in flash_ctrl the test will still check that the ISRs ran and that all the checks they perform were successful. For flash_ctrl errors the test will skip the check that the ISRs ran.
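The decision logic described for option 2 could be sketched roughly as follows. This is only an illustration of the rule stated above, not the actual change to all_escalation_resets_test.c: the types `fault_ip_t` and `post_reset_state_t`, their fields, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical classification of the IP that received the injected fault.
typedef enum {
  kFaultIpFlashCtrl,
  kFaultIpOther,
} fault_ip_t;

// Hypothetical summary of what the test can observe after the escalation
// reset.
typedef struct {
  bool crash_dump_matches_alert;  // alert crash dump reports the expected alert
  bool regular_irq_seen;          // regular interrupt ISR ran and its checks passed
  bool nmi_seen;                  // NMI ISR ran and its checks passed
} post_reset_state_t;

// ISR checks are skipped only when the fault was injected into flash_ctrl.
static bool isr_checks_required(fault_ip_t ip) {
  return ip != kFaultIpFlashCtrl;
}

static bool escalation_result_ok(fault_ip_t ip, const post_reset_state_t *s) {
  assert(s != NULL);
  // The crash dump confirmation applies to every IP.
  if (!s->crash_dump_matches_alert) {
    return false;
  }
  // For flash_ctrl faults the crash dump is the only confirmation used.
  if (!isr_checks_required(ip)) {
    return true;
  }
  return s->regular_irq_seen && s->nmi_seen;
}
```

With this shape, a flash_ctrl fault passes on the crash dump alone, while any other IP still fails if either ISR did not run.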
Hierarchy of regression failure
Chip Level
Failure Description
Test chip_sw_all_escalation_resets has 2 failures. 17.chip_sw_all_escalation_resets.1083579376 Log /container/opentitan-public/scratch/os_regression/chip_earlgrey_asic-sim-vcs/17.chip_sw_all_escalation_resets/latest/run.log
Error in extract: java.io.IOException: Error extracting /root/.cache/bazel/_bazel_default/96f40d218badc03ed33c48f246ec2aa1/external/rust_linux_x86_64/rust-1.60.0-x86_64-unknown-linux-gnu.tar.gz to /root/.cache/bazel/_bazel_default/96f40d218badc03ed33c48f246ec2aa1/external/rust_linux_x86_64: Unexpected end of ZLIB input stream
ERROR: /workspace/mnt/repo_top/sw/host/opentitantool/BUILD:10:12: //sw/host/opentitantool:opentitantool depends on @rust_linux_x86_64//:toolchain_for_x86_64-unknown-linux-gnu_impl in repository @rust_linux_x86_64 which failed to fetch. no such package '@rust_linux_x86_64//': java.io.IOException: Error extracting /root/.cache/bazel/_bazel_default/96f40d218badc03ed33c48f246ec2aa1/external/rust_linux_x86_64/rust-1.60.0-x86_64-unknown-linux-gnu.tar.gz to /root/.cache/bazel/_bazel_default/96f40d218badc03ed33c48f246ec2aa1/external/rust_linux_x86_64: Unexpected end of ZLIB input stream
ERROR: Analysis of target '//sw/device/tests/sim_dv:all_escalation_resets_test_sim_dv' failed; build aborted:
INFO: Elapsed time: 704.250s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (244 packages loaded, 13962 targets configured)
FAILED: Build did NOT complete successfully (244 packages loaded, 13962 targets configured)
make: *** [/workspace/mnt/repo_top/hw/dv/tools/dvsim/sim.mk:73: sw_build] Error 1
Steps to Reproduce
This test has been failing over the last 5 days, although its pass rate reaches the high 90s (percent): Sept 12 / Sept 11 / Sept 10 / Sept 9. The previous run, on Sept 7, had a 0% pass rate (the Sept 8 report is missing).
GH Commit: https://github.com/lowrisc/opentitan/tree/3c54b1eb225f685fe528a88992a1b8de3e30cd4b
Build seed: 3969941873
Last day with complete failure was https://reports.opentitan.org/hw/top_earlgrey/dv/2022.09.07_23.52.23/report.html
Tests with similar or related failures