[chip-test] chip_sw_entropy_src_cs_aes_halt

moidx commented 2 years ago

Test point name

chip_sw_entropy_src_cs_aes_halt

Host side component

No response

OpenTitanTool infrastructure implemented

No response

Contact person

martin-lueker

Checklist

Please fill out this checklist as items are completed. Link to PRs and issues as appropriate.

[ ] Check if existing test covers most or all of this testpoint (if so, either extend said test to cover all points, or skip the next 3 checkboxes)
[ ] Device-side (C) component developed
[ ] Bazel build rules developed
[ ] Host-side component developed
[ ] HJSON test plan updated with test name (so it shows up in the dashboard)
[ ] Test added to dvsim nightly regression (and passing at time of checking)

estimate 8 remaining 2023-04-04 4

tjaychen commented 2 years ago

i'll take this for now. The plan is to just add a top level assertion to check for this behavior. Most of the edn req test can be re-used, but we want to update that to do reseeds more frequently so we can create a conflict situation between sha3 and aes.

tjaychen commented 2 years ago

unassigning to work on other high priority tests.

drewmacrae commented 1 year ago

~The edn_req_test currently configures the mask_reseeding field with 0, which has the effect of setting the AES to reseed on every block. We should be able to use the test as-is unless we want to change the edn_req_test to use less entropy and run faster.~ I misinterpreted the description and discussion of this testpoint. @sriyerg has informed me the CSRNG also has an AES block and this test is meant to exercise that block to make it halt instead.

drewmacrae commented 1 year ago

The only way I can see here to do reseeds is to call it in the dif: dif_csrng_reseed

ballifatih commented 1 year ago

Is anyone actively working on this test (in case assignees field is not up to date)?

drewmacrae commented 1 year ago

@sriyerg discussed updating the testplan, but I haven't heard any discussion of writing the sv assertion and seeing if we hit it.

martin-lueker commented 1 year ago

Hi Fatih, I see that @drewmacrae has updated the definition very recently. I think this can be incorporated into either chip_sw_entropy_src_csrng, or the unit-level testbench.

If done in chip_sw_entropy_src_csrng this would just require the addition of an assertion which insists that when in FIPS mode every output between the entropy_src and the csrng get a four-phase handshake on the cs_aes_halt if before outputting any thing to CSRNG. There are two tricky parts to consider:

The assertion should not be enabled when the entropy_src is not in FIPS mode.
There may be delays between the AES handshake and the seed output as there is a queue internal to the entropy_src so you could get as many as 4 handshakes before you see your first seed transferred between the two IPs.

Aside from this additional assertion requirement here, there is nothing in the SW setup that is special outside this other chip test.

As an alternative approach I posit that this could be moved to the Unit entropy_src testbench: Given that the entropy_src should never output FIPS mode seeds without this handshake, the mere passing of chip_sw_entropy_src_csrng, would signify that the handshake is working. The proof then remains to guarantee that the entropy_src always respects this requirement. In the unit testbench one can either:

Disable the cs_aes_halt handshake and confirm that no seeds come out (not very satisfying as it is really just a null test) OR
Add functionality to the scoreboard to track that every FIPS processing event is preceded by a handshake event (which can be captured from the agent generating the handshake). On can keep a counter of the unclaimed handshakes and trigger a failure if that count goes below zero.

Perhaps It's because I'm no expert at writing elegant assertions, but given the complexities of writing this particular assertion, I think the scoreboarding option 2, is probably the easiest (assuming unit level verification is satisfactory).

martin-lueker commented 1 year ago

The check that every FIPS seed is preceded by a CS_AES_HALT handshake will be added with https://github.com/lowRISC/opentitan/pull/16200

With this we can probably at least simplify the need for the assertion, (or perhaps even no longer need it?). With chip_sw_entropy_src_csrng the generation of the seeds will imply that something is giving the entropy_src a handshake. The one check might be to just do a continuity check to make sure that the handshake is actually coming from AES. What do people think?

ballifatih commented 1 year ago

Thanks for the detailed description @martin-lueker, and thanks @sriyerg for the recent update on the test plan #16186. I will need some time to digest this (given my unfamiliarity with these IPs). Meanwhile, I am self-assigning the test.

ballifatih commented 1 year ago

Removing myself from the assignment, as I don't think I will be working on this soon.

vogelpi commented 1 year ago

I think we definitely want to cover this for M2.5. Whether this should be done as a TLT or block-level tests for entropy_src and csrng, we shall see. It's definitely not related to the aes block. Instead, csrng features an internal aes engine and this TLT is about the synchronization of this engine with entropy_src.

andreaskurth commented 1 year ago

Triaged for csrng, https://github.com/lowRISC/opentitan/labels/Priority%3AP2 for M2.5 SGTM.

marnovandermaas commented 1 year ago

Effort estimate for M2.5 captured in entropy source:

estimate 8

andreaskurth commented 1 year ago

This feature involves entropy_src and csrng. It can be tested at the block level with assertions (which can be active in block- and top-level tests):

entropy_src may run its SHA3 module (mostly the Keccak round) only while cs_aes_halt request (which entropy_src controls) and acknowledge (which csrng controls) are high.
csrng may run its AES module only while cs_aes_halt request or acknowledge are low.

(There are a few other assertions, such as that acknowledge may not drop before request drops, but those can be covered by generic push/pull interface assertions.)

I've added an assertion for (1) to entropy_src and have to report that entropy_src violates this condition. The condition essentially is only guaranteed when entropy_src triggers the final SHA3 absorbing process through sha3's process_i. There are other scenarios in which a Keccak round is run (triggered through keccak_round's run_i input) and for which entropy_src does not request cs_aes_halt. These scenarios can be identified based on the states of entropy_src's main FSM:

continuous health tests (state ContHTRunning);
firmware override mode (states FWInsertMsg, Sha3Prep (after FWInsertMsg));
startup (states StartupPhase1, StartupPass1, StartupFail1).

(I've also seen the assertion fail in the states Idle -- as a consequence of getting disabled during an operation -- and AlertHang, but I suspect those are the effect of one of the aforementioned scenarios.)

I've tried to fix these scenarios in entropy_src's main FSM. However, this led to:

cs_aes_halt request being high much more often/longer than it has to be, during firmware override mode, therefore adding a risk to delay or even stall CSRNG operation;
stalls between the pre-conditioner FIFO and the conditioner FIFO, as the control logic for them depended on cs_aes_halt being low;
violations of the req/ack protocol for cs_aes_halt;
entropy bits from the AST noise source getting dropped, while waiting for cs_aes_halt to get acknowledged.

I think entropy_src could be changed to always perform a cs_aes_halt handshake before it runs its Keccak round. However, I think properly implementing this would require intricate RTL changes, a reorganization of how entropy_src's SHA3 instance is controlled in concert with the disabling mechanism, the firmware override interface, and the cs_aes_halt handshake, and a careful evaluation of what this means for dropping raw entropy bits (and therefore manipulating entropy sampling) or how that could be prevented with appropriately sized buffers and an upper bound on cs_aes_halt acknowledge after request.

I think such changes are out of scope for this release. This means Earlgrey's power distribution has to be dimensioned such that entropy_src's SHA3 and csrng's AES can run at the same time. This should not be difficult as csrng's AES has only around 10 kGE that can draw power (which is many factors smaller than, e.g., OTBN). We should make sure the maximum power test covers this scenario.

Here's a closed draft PR with the assertions I wrote and the RTL changes I tried: https://github.com/lowRISC/opentitan/pull/17831.

I suggest we close this issue with a PR that removes the chip_sw_entropy_src_cs_aes_halt testpoint from Earlgrey's testplan and create a new https://github.com/lowRISC/opentitan/labels/Type%3AFutureRelease issue that tracks the full implementation of cs_aes_halt in entropy_src.

vogelpi commented 1 year ago

Many thanks @andreaskurth for investigating this and reporting back!

I fully agree with you that changing the design to always halt AES before running the SHA3 is out of scope for this release.

What I am wondering is whether running a single SHA3 round here and there while CSRNG is busy might be okay. Do you have an idea how often this happens and how far away the individual run_i pulses are spaced?

andreaskurth commented 1 year ago

What I am wondering is whether running a single SHA3 round here and there while CSRNG is busy might be okay. Do you have an idea how often this happens and how far away the individual run_i pulses are spaced?

How often this happens: I got the impression* that there's a cs_aes_halt handshake before a Keccak round run about 90% of the time, maybe even more. So I'd say in the majority of cases it works as intended but I wouldn't rely on that to always be the case.

How far the individual run_i pulses (and thus Keccak round runs) are spaced in time: About* 1000 clock cycles.

⁣* Disclaimer: Those are estimates from checking a few waves, not a full statistical analysis over a representative sample set.

vogelpi commented 1 year ago

Thanks @andreaskurth for getting back. It's good to have a rough idea on how frequently this happens. I think anything beyond 100 clock cycles should be fine. Because a full SHA3 would run for 24 clock cycles and the AES-256 inside CSRNG takes 16 clock cycles. IIRC a generate command will take 3 or 4 AES runs. Even if it would happen once every 100 cycles it wouldn't create a huge overlap.

andreaskurth commented 1 year ago

Yes, I'll add some assertions to ensure that this does not happen too often.

lowRISC / opentitan