lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0

csrng fires an unexpected interrupt during plic_all_irqs_test_sim_verilator #13622

Closed drewmacrae closed 2 years ago

drewmacrae commented 2 years ago

Near the end of the plic_all_irqs_test on a verilated earl grey model, we get an unexpected csrng interrupt and the test fails.

The relative clock rates are a little different in the verilated model so it's not surprising we only see the issue in verilator.

It looks like the CSRNG is firing an interrupt the test doesn't expect, long after the CSRNG's interrupts are explicitly tested (as if it were functioning as some kind of watchdog). Is there anyone familiar enough with the CSRNG to talk me through the expected behavior sometime? I'm especially curious about any free-running counters or timers that can expire.

drewmacrae commented 2 years ago

I'm not yet sure which interrupt has fired. I'm checking that, and whether its timing is relative to csrng init or to the testing of the csrng interrupts.

drewmacrae commented 2 years ago

Is there supposed to be a sort of watchdog, and is there a way to "pet the dog" in the CSRNG (or address the FIFO) so it doesn't bark (or underflow/overflow)?

drewmacrae commented 2 years ago

It's reporting an ID of 178

kTopEarlgreyPlicIrqIdCsrngCsHwInstExc = 178, /**< csrng_cs_hw_inst_exc */

drewmacrae commented 2 years ago

Perhaps the test should disable interrupts from the csrng after it's been tested.
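
To illustrate why disabling would help, here is a minimal sketch using a software model of a peripheral's interrupt state and enable registers. Everything in it is hypothetical (the `irq_model_t` type, the `model_*` helpers, and the bit position chosen for csrng_cs_hw_inst_exc); it is not the OpenTitan DIF API or the actual test code. It only shows that a late event stays masked once the enable bit is cleared after the interrupt has been tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one peripheral's interrupt registers.
 * Not the real CSRNG register layout or DIF API. */
typedef struct {
  uint32_t intr_state;  /* latched interrupt events, one bit per IRQ */
  uint32_t intr_enable; /* per-IRQ enable mask */
} irq_model_t;

/* Illustrative (assumed) bit position for csrng_cs_hw_inst_exc. */
enum { kModelIrqHwInstExc = 2 };

/* Latch an interrupt event in the state register. */
static void model_raise(irq_model_t *p, unsigned irq) {
  p->intr_state |= (uint32_t)1 << irq;
}

/* True if any latched event is enabled, i.e. would be forwarded to the PLIC. */
static bool model_pending(const irq_model_t *p) {
  return (p->intr_state & p->intr_enable) != 0;
}

/* Exercise one interrupt the way an all-IRQs test might:
 * enable it, force it, check it arrives, then acknowledge it.
 * Optionally disable it again once it has been tested. */
static void model_test_irq(irq_model_t *p, unsigned irq, bool disable_after) {
  uint32_t mask = (uint32_t)1 << irq;
  p->intr_enable |= mask;
  model_raise(p, irq);
  assert(model_pending(p));
  p->intr_state &= ~mask; /* acknowledge the forced interrupt */
  if (disable_after) {
    p->intr_enable &= ~mask;
  }
}

/* Scenario from this issue: the interrupt is tested, then an unrelated
 * event fires much later. Returns whether that late event reaches the PLIC. */
static bool late_event_reaches_plic(bool disable_after_test) {
  irq_model_t csrng = {0, 0};
  model_test_irq(&csrng, kModelIrqHwInstExc, disable_after_test);
  model_raise(&csrng, kModelIrqHwInstExc); /* late, unexpected event */
  return model_pending(&csrng);
}
```

In this model, `late_event_reaches_plic(false)` reports the late event as pending, while `late_event_reaches_plic(true)` does not. In the real test the same effect would presumably come from clearing the CSRNG interrupt enables through its DIF once its interrupts have been checked.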

drewmacrae commented 2 years ago

We could also just test the csrng near the end of the test...

mwbranstad commented 2 years ago

Two likely sources of interrupts from CSRNG would be "cs_cmd_req_done", in which case a SW command has finished executing, and "cs_entropy_req", where a request for a new seed from ENTROPY_SRC was made. I would expect that interrupts are not normally enabled for proper operation. There is no "watchdog" function in CSRNG.

drewmacrae commented 2 years ago

My mistake: I misread #13622. I see it's describing a separate issue regarding an earlier interrupt pulse. As I understand it, there's an entropy request (or many) being made here that's causing this interrupt later in the test.

tjaychen commented 2 years ago

Sorry, I should have clarified. These two issues are related; they're just manifesting differently depending on the test platform. Let me get some waves out to show.

drewmacrae commented 2 years ago

This is resolved by #13656