lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0
2.58k stars 774 forks source link

[kmac] Entropy error code #15088

Open cindychip opened 2 years ago

cindychip commented 2 years ago

When reviewing KMAC entropy related error code, here is one concern we want to flag: There are two KMAC entropy error code: WaitTimerExpired and IncorrectEntropyMode. Currently both of them will flag an KMAC error.

However, the concern here is: the current KMAC error does not block any incoming operations. So with these error, we believe KMAC will continue to process all the incoming operations with the existing entropy. We think this might have some potential issue and at least we should flag an error on keymgr app interface and stop the process there.

@eunchan please feel free to elaborate more. @jadephilipoom @johannheyszl @cdgori please let us know if you have any other thoughts.

Thanks, Cindy

eunchan commented 2 years ago

In Top Earlgrey, we have a few assumptions that may reduce this risk.

  1. App interface (KeyMgr) is initiated by SW always. So, any entropy events SW receives could be assessed and processed prior to the next App request.
  2. Incorrect Entropy Mode case, we can assume it is due to FI attacks. If SW is compromised (kernel SW), then there's no way KMAC avoid manipulations. So, in that case, SW is still able to receive the error event prior to initiate KDFs.
  3. If an userspace SW has been compromised, the error is being received by the kernel SW. so no problem so far.

With assumptions above, still the risk is quite low in my opinion. However, the design as itself has weakness here. So it is better for KMAC to report errors via App interface in case of entropy errors occur. There's no direct error report from kmac_entropy to kmac_app module yet.

jadephilipoom commented 2 years ago

Just to check that I've understood the comments above properly:

@cindychip @eunchan Is that understanding accurate?

eunchan commented 2 years ago

@jadephilipoom Exactly!

johannheyszl commented 1 year ago

Results from the discussion in Sec WG 12/08:

(1) SW needs to check the error in KMAC before using it through key manager; document in SW guidelines.

(2) Mentioned errors are currently not reported on app interface to prevent any unwanted lock-ups; the error reporting on app interface needs to be extended for future versions where no/less SW interaction is available.

cindychip commented 1 year ago

Thank you Johann for summarizing the conclusion. @timothytrippel could you help us update the SW guidelines? @vogelpi could you help update the kmac doc?

vogelpi commented 1 year ago

Hi @cindychip , yes, I can help with the doc but let's wait with documentation content changes until the doc refactor has landed. There will be a large amount of rebasing needed and if we introduce documentation changes in the meantime, there is a non-negligible risk to loose such changes.

cindychip commented 1 year ago

Totally makes sense, thanks Pirmin.

ballifatih commented 1 year ago

estimate 16

johngt commented 1 year ago

@cindychip / @vogelpi - I see this moved up from P3 to P2? Please confirm as it's not easy to see with the labelling system here.

cindychip commented 1 year ago

@johngt @ballifatih for M2.5, this task only needs documentation change. (I think might take 2 hours max?) --> P2 for M2.5 For future release, we can consider adding a error reporting mechanism for interface. ---> future release.

If it is easier for tracking, we can split into two issues?

Reference: https://github.com/lowRISC/opentitan/issues/15088#issuecomment-1346193920

ballifatih commented 1 year ago

Sounds reasonable @cindychip, I forked a new issue to track SW/doc changes which are relevant for M2.5.

We can keep this issue around for the RTL changes for FutureRelease. I will label it accordingly.

msfschaffner commented 9 months ago

@vogelpi can you take a look at this and make an assessment as to whether we should fix this for PROD? Thanks!

vogelpi commented 9 months ago

I had a look at his earlier this week. My view is that the key thing is to document how software should check those error codes. Reporting such errors on the app interfaces is desirable whenever hardware can trigger such operations but it may lead to unwanted side effects including blocking behavior. For Earlgrey, it might be safer to not do it in HW. We should definitely review this in the Integrated context.

johannheyszl commented 9 months ago

Agree w/ @vogelpi (and others before) to not change RTL for Earl Grey. Future Release / Integrated and close as not planned are all fine for me.

SW requirement is documented in #17785

vogelpi commented 9 months ago

Thanks for your feedback @johannheyszl , let's keep this open but in the Integrated context.