lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0

[test-triage,rom_e2e,opentitanlib] gdb/openocd/jtag tests flaky/failing #17729

Closed engdoreis closed 1 year ago

engdoreis commented 1 year ago

Hierarchy of regression failure

Chip Level

Failure Description

The ROM E2E tests that use rv_dm + OpenOCD + GDB have been flaky lately.

The reason should be investigated, and the tests should be improved.

Steps to Reproduce

Tests with similar or related failures

pamaury commented 1 year ago

I think I have been able to track down the issue: the CW310 backend always reconfigures the reset and strap pins on init. At the same time, the gdb coordinator script starts the opentitantool console after starting OpenOCD. This means that when the console initializes the CW310 backend, it disturbs the pins for a few milliseconds while OpenOCD is already running.

I have been able to confirm in tests that this is a failure mode that we have observed. I am currently running tests in CI to see if this solves the issue completely or not.
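
The ordering problem described above can be illustrated with a small sketch. This is not the actual gdb coordinator script: the `Step` type, the `find_pin_hazards` helper, and the step names are hypothetical and exist only to model "a process that reconfigures the pins starts after a process that needs them stable". Reordering the launch is shown purely as an illustration; the thread does not state which fix was applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One process launched by a (hypothetical) test coordinator."""
    name: str
    reconfigures_pins: bool = False  # startup briefly disturbs reset/strap pins
    needs_stable_pins: bool = False  # may fail if the pins glitch while it runs


def find_pin_hazards(launch_order):
    """Return (disturber, victim) name pairs where a pin-reconfiguring step
    is launched after a step that needs the pins to stay stable."""
    hazards = []
    running_victims = []
    for step in launch_order:
        if step.reconfigures_pins:
            hazards.extend((step.name, victim) for victim in running_victims)
        if step.needs_stable_pins:
            running_victims.append(step.name)
    return hazards


# Assumed properties, taken from the description in this comment.
openocd = Step("openocd", needs_stable_pins=True)
console = Step("opentitantool console", reconfigures_pins=True)
gdb = Step("gdb")

# Order described in the comment: OpenOCD first, then the console.
print(find_pin_hazards([openocd, console, gdb]))
# Illustrative alternative: initialize the backend before OpenOCD attaches.
print(find_pin_hazards([console, openocd, gdb]))
```

With the order described in the comment, the sketch reports one hazard (the console disturbing OpenOCD); with the console started first, it reports none.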

timothytrippel commented 1 year ago

Awesome work everyone! Since we do not use the gdb coordinator script for manufacturing (for complexity reasons), this should not be an issue there.

Long term, as a lot of us have discussed, we should remove the gdb coordinator script and the custom gdb/openocd Bazel test rule and move the GDB coordination into opentitantool, since we don't need it for writing custom tests now that we drive OpenOCD with opentitanlib directly.

timothytrippel commented 1 year ago

can we close this now? @pamaury @andreaskurth

timothytrippel commented 1 year ago

Never mind, I see that #18554 is linked to close this when merged. Thanks everyone!

a-will commented 1 year ago

After the freeze, I can also remove the POR connection to SRSTn on the ARM JTAG connectors. The POR functionality was documented, but when I hooked it up, I didn't anticipate the confusion that would cause, nor how it might get driven from tools that think it's ARM's SRSTn.

The commit for this is https://github.com/a-will/opentitan/commit/414cd10d113263843b55d4aedc76bd70c3108790, but I won't create a PR for that right now (to avoid the added load to CI).

msfschaffner commented 1 year ago

Thanks everyone for helping to root cause this!