Closed milesdai closed 1 year ago
Tagging @cfrantz and @alphan for awareness.
How long ago did we start to observe this?
I only see it happening once in the last 20 runs on master.
It was somewhat recent (i.e. in the last day or so). Here are some examples:
https://dev.azure.com/lowrisc/opentitan/_build/results?buildId=105382&view=logs&j=924ff16e-cbb4-544c-ec55-782cc11682e2&t=73634ebd-f809-5cf4-5112-b0aff4a889e0 https://dev.azure.com/lowrisc/opentitan/_build/results?buildId=105370&view=logs&j=924ff16e-cbb4-544c-ec55-782cc11682e2&t=73634ebd-f809-5cf4-5112-b0aff4a889e0 https://dev.azure.com/lowrisc/opentitan/_build/results?buildId=105464&view=logs&j=924ff16e-cbb4-544c-ec55-782cc11682e2&t=73634ebd-f809-5cf4-5112-b0aff4a889e0&l=478
We can move this to a nightly CI run.
SGTM.
Hmm, #16913 and #16915 both deal with issues like this, where fully assigning cores to verilator sims starves the system of resources leading to problems at the sim-harness interface. Here it looks like the harness timed out waiting for the sim so #16913 wouldn't help, and there's a lot of overhead for something like #16915 to protect these systems in a 4 threaded environment.
The nicer sim only helps if the sim times out waiting for the harness, If we wanted to run more verilator tests in CI and we see issues like this, we should consider setting --local_test_jobs=3 to leave headroom for harnesses and overhead.
16829 removes this test. It works fine on a local workstation, but fails every couple PRs in CI. We can move this to a nightly CI run.
Logs:
``` 2022-12-14T17:59:45.8975419Z ==================== Test output for //sw/device/silicon_creator/rom/e2e/weak_straps:sw_strap_value_sim_verilator: 2022-12-14T17:59:45.8978443Z Invoking test: sw/host/tests/rom/sw_strap_value/sw_strap_value --rcfile= --logging=info --interface=verilator --verilator-bin=hw/build.verilator_real/sim-verilator/Vchip_sim_tb --verilator-rom=sw/device/lib/testing/test_rom/test_rom_sim_verilator.39.scr.vmem --verilator-flash=sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_prog_sim_verilator.fake_prod_key_0.signed.64.scr.vmem --verilator-otp=hw/ip/otp_ctrl/data/img_rma.24.vmem 2022-12-14T17:59:45.8982244Z [2022-12-14T17:56:44Z INFO opentitanlib::transport::verilator::subprocess] CWD: Ok("/root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan") 2022-12-14T17:59:45.8985683Z [2022-12-14T17:56:44Z INFO opentitanlib::transport::verilator::subprocess] Spawning verilator: "hw/build.verilator_real/sim-verilator/Vchip_sim_tb" "--meminit=rom,sw/device/lib/testing/test_rom/test_rom_sim_verilator.39.scr.vmem --meminit=flash,sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_prog_sim_verilator.fake_prod_key_0.signed.64.scr.vmem --meminit=otp,hw/ip/otp_ctrl/data/img_rma.24.vmem" 2022-12-14T17:59:45.8988282Z [2022-12-14T17:56:44Z INFO opentitanlib::transport::verilator::stdout] Simulation of OpenTitan Earl Grey 2022-12-14T17:59:45.8990103Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] ================================= 2022-12-14T17:59:45.8991933Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.8993416Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.8995052Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] Tracing can be toggled by sending SIGUSR1 to this process: 2022-12-14T17:59:45.8996745Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.8998224Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] $ kill -USR1 12807 2022-12-14T17:59:45.8999813Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9001067Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9003959Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] GPIO: FIFO pipes created at /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/gpio0-read (read) and /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/gpio0-write (write) for 32-bit wide GPIO. 2022-12-14T17:59:45.9007271Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] GPIO: To measure the values of the pins as driven by the device, run 2022-12-14T17:59:45.9009997Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] $ cat /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/gpio0-read # '0' low, '1' high, 'X' floating 2022-12-14T17:59:45.9012282Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] GPIO: To drive the pins, run a command like 2022-12-14T17:59:45.9014826Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] $ echo 'h09 l31' > /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/gpio0-write # Pull the pin 9 high, and pin 31 low. 2022-12-14T17:59:45.9017882Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] $ echo 'wh10' > /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/gpio0-write # Pull pin 10 high through a weak pull-up. 2022-12-14T17:59:45.9020430Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9131176Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] SPI: Created /dev/pts/4 for spi0. Connect to it with any terminal program, e.g. 2022-12-14T17:59:45.9133202Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] $ screen /dev/pts/4 2022-12-14T17:59:45.9135075Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] NOTE: a SPI transaction is run for every 4 characters entered. 2022-12-14T17:59:45.9137720Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] SPI: Monitor output file created at /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/spi0.log. Works well with tail: 2022-12-14T17:59:45.9141002Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] $ tail -f /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/spi0.log 2022-12-14T17:59:45.9143122Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9144933Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] UART: Created /dev/pts/5 for uart0. Connect to it with any terminal program, e.g. 2022-12-14T17:59:45.9146677Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] $ screen /dev/pts/5 2022-12-14T17:59:45.9148412Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] UART: Additionally writing all UART output to 'uart0.log'. 2022-12-14T17:59:45.9150189Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9152727Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] USB: Monitor output file created at /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/usb0.log. Works well with tail: 2022-12-14T17:59:45.9155948Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] $ tail -f /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/usb0.log 2022-12-14T17:59:45.9158077Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9159975Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] JTAG: Virtual JTAG interface dmi0 is listening on port 44853. Use 2022-12-14T17:59:45.9162139Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] OpenOCD and the following configuration to connect: 2022-12-14T17:59:45.9164539Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] interface remote_bitbang 2022-12-14T17:59:45.9166983Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] remote_bitbang_host localhost 2022-12-14T17:59:45.9168675Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] remote_bitbang_port 44853 2022-12-14T17:59:45.9171101Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] TOP.chip_sim_tb.u_dut.top_earlgrey.u_flash_ctrl.u_eflash.u_flash.gen_generic.u_impl_generic.gen_prim_flash_banks[0].u_prim_flash_bank.unnamedblk1: ReadLatency:1 ProgLatency:50 EraseLatency:200 2022-12-14T17:59:45.9173825Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] TOP.chip_sim_tb.u_dut.top_earlgrey.u_flash_ctrl.u_eflash.u_flash.gen_generic.u_impl_generic.gen_prim_flash_banks[1].u_prim_flash_bank.unnamedblk1: ReadLatency:1 ProgLatency:50 EraseLatency:200 2022-12-14T17:59:45.9176405Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9177930Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9179623Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] Simulation running, end by pressing CTRL-c. 2022-12-14T17:59:45.9180927Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::stdout] 2022-12-14T17:59:45.9181700Z dmi0: Failed to bind socket: Address already in use (98) 2022-12-14T17:59:45.9182298Z dmi0: Unable to create TCP server on port 44853 2022-12-14T17:59:45.9183414Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::transport] Verilator started with the following interfaces: 2022-12-14T17:59:45.9195420Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::transport] gpio_read = /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/gpio0-read 2022-12-14T17:59:45.9197933Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::transport] gpio_write = /root/.cache/bazel/_bazel_root/5e4f8c833cf3a422fdf6ca333dd151d7/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild/bin/sw/device/silicon_creator/rom/e2e/weak_straps/sw_strap_value_sim_verilator.runfiles/lowrisc_opentitan/gpio0-write 2022-12-14T17:59:45.9199910Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::transport] uart = /dev/pts/5 2022-12-14T17:59:45.9201071Z [2022-12-14T17:56:45Z INFO opentitanlib::transport::verilator::transport] spi = /dev/pts/4 2022-12-14T17:59:45.9202114Z [2022-12-14T17:56:45Z INFO opentitanlib::io::uart] set_flow_control to false 2022-12-14T17:59:45.9203168Z [2022-12-14T17:56:45Z INFO opentitanlib::test_utils::init] load_bitstream: success. 2022-12-14T17:59:45.9203825Z "load_bitstream": true 2022-12-14T17:59:45.9204680Z [2022-12-14T17:56:45Z INFO opentitanlib::test_utils::init] bootstrap: success. 2022-12-14T17:59:45.9205293Z "bootstrap": true 2022-12-14T17:59:45.9205796Z Starting test test_sw_strap_values... 2022-12-14T17:59:45.9206771Z I00000 test_rom.c:133] Version: earlgrey_silver_release_v5-9314-g398d74861, Build Date: 2022-12-14 17:06:49 2022-12-14T17:59:46.0526531Z 2022-12-14T17:59:46.0527163Z I00001 testFinished test test_sw_strap_values: Err(Timed Out 2022-12-14T17:59:46.0527604Z 2022-12-14T17:59:46.0528069Z Stack backtrace: 2022-12-14T17:59:46.0528696Z 0: anyhow::error::