msfschaffner closed this issue 8 months ago
Thanks for filing the issue, @msfschaffner. I am currently working on this. The task can be broken down into the following steps:
kmac.sv
also containing the TL-UL interface which is not practical for evaluation. Instead we need a reduced design version containing just the processing engine + PRNG).

Update: I've done the switch of the PRNG implementation, and the DV integration is mostly done. The FPGA evaluation is tricky and ongoing.
I've now updated the total effort to also include the DV / full security verification effort.
Update: I've also investigated the area and timing impact of the change using the Yosys + nangate45 synthesis flow. The situation is as follows:
| | Baseline | Reworked core + Buffer | Reworked core + Buffer + Bivium | Reworked core + Buffer + Trivium |
|---|---|---|---|---|
| area | 180 kGE | 187 kGE | 190 kGE | 200 kGE |
| critical path | 1.62 ns | - | 2.13 ns | 2.08 ns |
| max clock freq | 617 MHz | - | 469 MHz | 480 MHz |
The bottom line is that the critical path increases and is now inside the PRNG itself (because it's heavily unrolled - we generate 800 bits per clock cycle), but it's still well below the critical path of other IP cores and of the whole chip. From an area perspective, Bivium would be 10 kGE smaller than Trivium. But since KMAC processes sensitive key material for keymgr, we should probably go for Trivium to be on the safe side.
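As a quick sanity check on the numbers above, the following Python sketch derives the maximum clock frequency from each critical path and the area deltas between configurations. The dictionary keys and function names are made up for illustration; the input values are copied from the synthesis table, and the frequencies are truncated to whole MHz because that is what matches the table.

```python
# Values copied from the Yosys + nangate45 table in this thread.
# The dictionary layout and names are illustrative assumptions.
RESULTS = {
    "baseline": {"area_kge": 180, "critical_path_ns": 1.62},
    "bivium":   {"area_kge": 190, "critical_path_ns": 2.13},
    "trivium":  {"area_kge": 200, "critical_path_ns": 2.08},
}


def max_freq_mhz(critical_path_ns):
    """Maximum clock frequency in MHz for a critical path given in ns."""
    return 1000.0 / critical_path_ns


def area_delta_kge(a, b):
    """Area of configuration `a` minus area of configuration `b`, in kGE."""
    return RESULTS[a]["area_kge"] - RESULTS[b]["area_kge"]


if __name__ == "__main__":
    for name, r in RESULTS.items():
        # Truncate to whole MHz, which reproduces the table entries.
        print(name, int(max_freq_mhz(r["critical_path_ns"])), "MHz")
    print("Trivium vs. Bivium:", area_delta_kge("trivium", "bivium"), "kGE")
    print("Trivium vs. baseline:", area_delta_kge("trivium", "baseline"), "kGE")
```

The same helper can be reused for any other configuration by adding an entry to `RESULTS`.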
From an SCA perspective, PROLEAD suggests that anything with the reworked core and the added buffer stage is slightly better than the baseline, but there are no big differences. I am now working on the FPGA experiments.
The DV work is now done and I also got the FPGA results for SHA3. It looks fantastic from an SCA perspective. KMAC mode experiments are currently running and I'll feed back once I have those results. The PR to integrate Trivium is here: https://github.com/lowRISC/opentitan/pull/21624
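For readers unfamiliar with the primitive, here is a minimal bit-serial Python model of the published Trivium state update, with a helper that chains 800 steps to mimic one clock cycle of an unrolled generator. This is only a sketch and has not been checked against the RTL in the PR: the function names are hypothetical, and loading a seed directly into the 288-bit state is an assumption (the cipher's key/IV setup is deliberately left out).

```python
STATE_BITS = 288  # Trivium state size; s[i - 1] holds the spec's bit s_i


def trivium_step(s):
    """Advance the state by one step; return (output_bit, new_state)."""
    t1 = s[65] ^ s[92]      # s66 + s93
    t2 = s[161] ^ s[176]    # s162 + s177
    t3 = s[242] ^ s[287]    # s243 + s288
    z = t1 ^ t2 ^ t3        # output bit
    t1 ^= (s[90] & s[91]) ^ s[170]     # + s91*s92 + s171
    t2 ^= (s[174] & s[175]) ^ s[263]   # + s175*s176 + s264
    t3 ^= (s[285] & s[286]) ^ s[68]    # + s286*s287 + s69
    # Shift the three registers (93, 84 and 111 bits) and insert feedback.
    new_state = [t3] + s[0:92] + [t1] + s[93:176] + [t2] + s[177:287]
    return z, new_state


def trivium_cycle(s, bits_per_cycle=800):
    """Model one clock cycle of an unrolled generator by chaining steps."""
    out = []
    for _ in range(bits_per_cycle):
        z, s = trivium_step(s)
        out.append(z)
    return out, s


def state_from_seed(seed):
    """ASSUMPTION: load an integer seed directly into the state, LSB to s_1."""
    return [(seed >> i) & 1 for i in range(STATE_BITS)]


if __name__ == "__main__":
    state = state_from_seed(0x0123456789ABCDEF)
    bits, state = trivium_cycle(state)
    print(len(bits), "bits generated,", sum(bits), "of them set")
```

Note that the all-zero state is a fixed point of this update, so a directly seeded generator of this kind must never be loaded with an all-zero state.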
@johngt the work for this is now done. I guess it will take some time for people to read through the PR and digest it. I'll answer questions of course, but FYI, my main focus is now on other pressing issues.
Update: As part of #22025, we've switched to the Bivium implementation in #22021. All SCA experiments (including FPGA TVLA for SHA3 and KMAC) were repeated without the results changing notably.
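For context on the TVLA experiments mentioned above: TVLA is commonly run as a fixed-vs-random Welch's t-test per trace sample, flagging leakage when |t| exceeds 4.5. The Python sketch below shows that computation on tiny made-up traces. The function names and data are purely illustrative, and it is an assumption that the FPGA setup used for this issue follows exactly this configuration.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two groups of measurements."""
    var_term = statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    if var_term == 0.0:
        # Degenerate case (no noise at all): avoid dividing by zero.
        return 0.0 if statistics.fmean(a) == statistics.fmean(b) else math.inf
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(var_term)


def tvla(fixed_traces, random_traces, threshold=4.5):
    """Return (per-sample t-values, leakage flag) for two sets of traces.

    Each trace is a list of samples; all traces must have the same length.
    """
    n_samples = len(fixed_traces[0])
    t_values = [
        welch_t([tr[i] for tr in fixed_traces], [tr[i] for tr in random_traces])
        for i in range(n_samples)
    ]
    return t_values, any(abs(t) > threshold for t in t_values)


if __name__ == "__main__":
    # Synthetic two-sample traces: sample 0 is identical across groups,
    # sample 1 has a data-dependent offset in the fixed group.
    fixed = [[0.0, 5.0], [1.0, 6.0], [0.0, 5.0], [1.0, 6.0]]
    rand = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
    t_values, leaks = tvla(fixed, rand)
    print(t_values, leaks)
```

Real evaluations use far more traces and samples than this toy input; the point is only to show what the per-sample statistic and the pass/fail threshold look like.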
Description
This is a sibling issue to #19091. A similar PRNG improvement for KMAC should be evaluated.