[tlul] Side-channel Hamming weight leakage of `data` on TL-UL

ballifatih commented 1 year ago

TODOs

[ ] Implement SW guidelines described in this comment.

Original Description

I would like to get some security opinion about possible side-channel leakage on TL-UL transactions.

In some TL-UL transactions between Ibex and crypto HWIPs, the data part of the transaction is reset to 0. If the side-channel leakage caused by the TL-UL bus correlates well with the Hamming distance, then I suspect the Hamming weights of the secrets passed with TL-UL transactions might be exposed to an attacker. I think the double transition from 0 to data and back amplifies this effect (0x0 -> data -> 0x0). From a side-channel perspective, it seems to me that keeping the last sent value on the data significantly increases the difficulty of recovering the value of each individual word of a secret. I guess resetting data to 0 also has its benefits, but I am not able to see all angles of such a trade-off.
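To make the concern concrete: under an idealized Hamming-distance leakage model, a transition from an all-zero bus to `data` toggles exactly HW(data) bits, so 0x0 -> data -> 0x0 exposes the Hamming weight of the word twice, while data_prev -> data_next only exposes HW(data_prev XOR data_next). A minimal sketch, assuming noise-free leakage; `data_prev` is an arbitrary made-up value and the helper names are illustrative only.

```python
def hamming_weight(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions that toggle when the bus goes from a to b."""
    return hamming_weight(a ^ b)


def toggles(sequence):
    """Bit toggles for each consecutive pair of bus values."""
    return [hamming_distance(a, b) for a, b in zip(sequence, sequence[1:])]


secret = 0x75E7B7AC      # example secret word
data_prev = 0x12345678   # hypothetical previous bus value (assumption)

# Bus is blanked before and after the transfer: both transitions equal HW(secret).
reset_to_zero = toggles([0x0, secret, 0x0])

# Bus keeps the previous value: the single transition depends on data_prev too.
keep_previous = toggles([data_prev, secret])

print(reset_to_zero, keep_previous)
```

In the reset-to-zero case both entries equal HW(secret), which is what a template attack on the bus would target; in the keep-previous case the attacker additionally needs to know `data_prev`.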

Since we are using peripheral connections to pass secrets among HWIPs, most keys are already immune to this. However, some keys are still passed over TL-UL (not an exhaustive list):

* Keymgr generated SW keys,
* Keymgr generated identity seed (if they are passed to OTBN through SW),
* SW generated/managed symmetric keys/secrets for AES/OTBN/KMAC/HMAC HWIP.

This observation (0x0 -> data -> 0x0) is not consistent on all sides of xbar, and I only looked at two examples.

In the first waveform, Ibex is reading identity seed (target=SW) from keymgr:

The secret word in this example is 75e7_b7ac,
keymgr TL-UL output: data_prev -> data_next so the previous value is kept on data,
Ibex TL-UL input: 0x0 -> data -> 0x0 so data is reset to 0 after transition.

[Waveform image: keymgr_reading_simplified]

In the second waveform, Ibex is writing a key to AES:

The secret word in this example is c11e_955a,
Ibex TL-UL output: 0 -> data -> 0x20 (I don't understand why 0x20 is loaded into data on Ibex side since there is no transaction),
AES TL-UL input: 0 -> data -> 0.

[Waveform image: TLUL-aes-leakage]

cc: @johannheyszl @jadephilipoom @bilgiday @gdessouky

johannheyszl commented 1 year ago

Thanks @ballifatih, cc @moidx

TL;DR:

TL-UL data confidentiality vs SCA
discussion of potential sources of SCA leakage
peripherals seem to leave data on their output
bus multiplexer switches between inputs and then back to zero after read op

I'd assume that most values are in shares; @ballifatih

re key manager created keys that are read over TL-UL: Are those in shares?
re SW keys that are passed into HW-IP: They are also in shares I assume?
re key manager generated identity seed - same, is this in shares?

generally, IMO if we keep any old, potentially sensitive, values on the bus, through switching, we might create even more instances of Hamming distance between either zeros or other values.

johannheyszl commented 1 year ago

@vogelpi for viz

ballifatih commented 1 year ago

@johannheyszl AFAIU all these secret values are sent in two shares over TL-UL (I can see the related CSRs have two shares). Since each word of the two shares is sent in sequential TL-UL transactions, I think it makes sense to assume that the attacker can read the HW of both shares. I can see two follow-up discussion points:

Is 0 -> data -> 0 really worse than a data_prev -> data_next transition from an SCA perspective?
Are two shares sent in sequence enough to prevent SCA? For two 32-bit words X and Y, if the attacker gets both HW(X) and HW(Y), how much advantage does that give in recovering X XOR Y? In particular, one should note that the attacker will get different (HW(X), HW(Y)) values for the same X XOR Y during each observation.

p.s. I am using HW(X) to refer to Hamming weight of the value X.
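One way to reason about the second question is the identity HW(X) + HW(Y) = HW(X XOR Y) + 2·HW(X AND Y). It implies that a noise-free observation of both weights reveals the parity of HW(X XOR Y) and bounds it by |HW(X) − HW(Y)| ≤ HW(X XOR Y) ≤ min(HW(X) + HW(Y), 64 − HW(X) − HW(Y)). The sketch below intersects these candidate sets over repeated observations with fresh masks. It assumes perfect, noise-free weight measurements, which is far more than a real attacker gets, and at best it narrows down the weight of the unmasked word, not its value. All names are illustrative.

```python
import secrets

WIDTH = 32  # bus word width in bits


def hw(x: int) -> int:
    """Hamming weight of x."""
    return bin(x).count("1")


def candidate_weights(hw_x: int, hw_y: int) -> set:
    """Weights of X XOR Y that are consistent with the observed HW(X), HW(Y)."""
    lo = abs(hw_x - hw_y)
    hi = min(hw_x + hw_y, 2 * WIDTH - hw_x - hw_y)
    # HW(X XOR Y) has the same parity as HW(X) + HW(Y), hence the step of 2.
    return set(range(lo, hi + 1, 2))


def narrow(secret: int, observations: int) -> set:
    """Intersect candidate sets over several freshly masked transfers."""
    candidates = set(range(WIDTH + 1))
    for _ in range(observations):
        x = secrets.randbits(WIDTH)   # fresh mask (share 0)
        y = x ^ secret                # masked value (share 1)
        candidates &= candidate_weights(hw(x), hw(y))
    return candidates


print(sorted(narrow(0x75E7B7AC, observations=20)))
```

The true weight always stays in the candidate set; how quickly the set shrinks depends on how often extreme mask weights occur, which is rare for uniformly random 32-bit masks.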

johannheyszl commented 1 year ago

thx.

Sharing is fresh after each power-up or ideally on every access?
Yes, zero to value is IMO worse than distance between values.
For masked values, a direct succession of masked value and mask on the same bus is not nice.

ballifatih commented 1 year ago

1. In the case of ID generation by keymgr, the randomness comes from KMAC. Each power-up should have fresh randomness. Each new invocation of generate-ID should also have fresh randomness. Once this ID is generated, it is stored in a CSR, so reading it multiple times from the CSR should return the secret with the same masks. It's harder to guess what happens when SW generates/controls the secret key, then writes it to one of the crypto HWIPs.
2. Agreed, open to discussion.
3. Since both shares are read from registers, I think the ordering among words can be changed. SW can even interleave reading the secret key/identity with other non-secret TL-UL transactions (not suggesting we should do that). The order of words can also be randomized.

johannheyszl commented 1 year ago

thx! nice, so:

averaging of traces: attacker may average if masked values are transferred over bus multiple times. if that is happening it is known from open source code.
randomizing order of bus transfers: SW can randomize all bus accesses for multi word data and shares. this should de-facto prevent averaging.

tjaychen commented 1 year ago

hey all, could you shed some more light on how reading the shares in sequence creates an issue? Is the basic idea that the bus is narrower (fewer bits toggling), so it would be easier for an attacker to figure out the hamming weight? Secondly, assuming the register can be read multiple times (from keymgr), is the idea of averaging to reduce the noise from other parts of the bus so that the HW of the bus values can be surfaced?

Lastly, I am unsure now if this helps or hurts, but the software output registers from keymgr are actually "read clear". Meaning you cannot actually repeatedly read them. But it also means after every read there is a "value" -> "0" transition.

tjaychen commented 1 year ago

the 0 transition on the ibex probably has more to do with how the tlul sockets are constructed.. ie, for a peripheral that is not selected, all of its inputs probably just get blanked.

johannheyszl commented 1 year ago

thx tim. Our gut feeling is that we will likely not have an issue here. We will discuss today in the SCA sub WG. We might put a leakage test on the post-silicon test plan to make sure, if we think it's necessary.

re shares in sequence: if in any of the TL-UL registers or other, shares are loaded through FFs in sequence, the occurring Hamming distance would be equal to the Hamming weight of the unmasked value. But this is only if e.g. word 0 from share 0 is succeeded by word 0 from share 1. If reading all words from share 0 then all of share 1, this is IMO not an issue.
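The point about shares in sequence can be illustrated with a small sketch: if word i of share 0 is directly followed by word i of share 1 on the same wires or flops, the mask cancels in the XOR and the toggle count equals the Hamming weight of the unmasked word. The code assumes Boolean (XOR) masking with two shares and an idealized Hamming-distance model; the two secret words are the example values from the waveforms in this issue, and the function names are made up.

```python
import secrets


def hw(x: int) -> int:
    """Hamming weight of x."""
    return bin(x).count("1")


def split(secret_words):
    """Split each 32-bit word into two Boolean shares with a fresh mask."""
    share0 = [secrets.randbits(32) for _ in secret_words]
    share1 = [m ^ w for m, w in zip(share0, secret_words)]
    return share0, share1


secret_words = [0x75E7B7AC, 0xC11E955A]
share0, share1 = split(secret_words)

# Alternating order: share0[i] is directly followed by share1[i].
# The mask cancels, so the toggle count is HW of the unmasked word.
alternating = [hw(share0[i] ^ share1[i]) for i in range(len(secret_words))]

# Share-by-share order: share0[0], share0[1], share1[0], share1[1].
# Every transition still contains at least one unknown random mask.
bus = share0 + share1
share_by_share = [hw(a ^ b) for a, b in zip(bus, bus[1:])]

print(alternating, share_by_share)
```

With only one word per share, share-by-share order degenerates to the alternating case, so the benefit relies on multi-word secrets.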

re averaging: Reading the same value multiple times allows averaging out noise factors such as electrical noise in the measurement chain and noise from uncorrelated logic/functionality on OT. Experience shows that attacks on such wide words only ever succeed if averaging is possible, to get 'good samples' for template matching. All correlated noise remains of course. If the sequence of words is randomized, averaging is not possible, which is nice :)
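A toy simulation of the averaging argument, under loudly stated assumptions: leakage is modeled as the word's Hamming weight plus Gaussian noise, the per-word weights and the noise level `sigma` are made-up numbers, and the attacker averages the samples seen in each time slot. With a fixed word order the per-slot averages converge to the true weights; with a randomized order every slot converges to the same mean, so the per-word information is gone from the simple average.

```python
import random


def average_per_slot(word_hws, n, rng, shuffle, sigma=8.0):
    """Average n noisy leakage samples for each transfer slot."""
    sums = [0.0] * len(word_hws)
    for _ in range(n):
        order = list(word_hws)
        if shuffle:
            rng.shuffle(order)  # SW randomizes the word order per transfer
        for slot, weight in enumerate(order):
            sums[slot] += weight + rng.gauss(0.0, sigma)
    return [s / n for s in sums]


rng = random.Random(0)           # seeded only to make the sketch repeatable
word_hws = [8, 16, 24, 12]       # hypothetical Hamming weights of four words

fixed = average_per_slot(word_hws, 20000, rng, shuffle=False)
randomized = average_per_slot(word_hws, 20000, rng, shuffle=True)

print([round(v, 1) for v in fixed])
print([round(v, 1) for v in randomized])
```

This only models naive per-slot averaging; it says nothing about attackers who can first classify or re-align traces.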

tjaychen commented 1 year ago

sounds good, should this become software guidance then? it sounds like two things..always process 1 share fully ahead of the other. And within that share, randomize the sequence. This probably means we can't have any fifo like structures to store the keys (i dont think we do), but it might be something we will have to double check.

ballifatih commented 1 year ago

Summarizing some points from OT-SCA meeting:

* 0 -> data -> 0 behavior is not devastating. In the worst case, through template attacks, with many collected power traces, the attacker might be able to get the Hamming weight of each 32-bit chunk of a secret. Even then, this does not give out too much information on the full key.
* data_prev -> data_next behavior is undesirable for other reasons: the value keeps sitting on the data port, which prolongs its exposure to fault injection (FI) attacks or invasive physical probing. In short, there is a benefit in shortening the exposure time of a sensitive value on the bus as pointed out by @vogelpi and @cdgori.

And on the SW guideline side:

* Avoid reading/writing secrets in 8-bit or 16-bit chunks.
* Reading/writing shares in an alternating manner is probably bad. Process one share fully and then move to the other.
* As @johannheyszl suggested, randomizing the loading order of key words might be an additional counter-measure that we can implement on the SW side, if needed later.
* As @vogelpi and @bilgiday pointed out, feeding some random values from an LFSR post-transaction is also an idea we can keep on the side for now.
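The first three guidelines could be sketched roughly as follows. This is an illustration in Python, not OpenTitan code: `write_word` is a hypothetical callback standing in for a full 32-bit register write, and whether a given HWIP accepts key words in arbitrary order is an assumption that would need checking per block.

```python
import secrets


def write_secret(share0, share1, write_word):
    """Write a two-share secret using full-width words only.

    Share 0 is written completely before share 1, and the word order
    inside each share is drawn fresh for every call.
    """
    rng = secrets.SystemRandom()
    for share_index, share in enumerate((share0, share1)):
        order = list(range(len(share)))
        rng.shuffle(order)  # randomized word order within this share
        for word_index in order:
            write_word(share_index, word_index, share[word_index])


# Usage with a logging stand-in for the register write.
log = []
write_secret(
    [0x11111111, 0x22222222, 0x33333333, 0x44444444],
    [0xAAAAAAAA, 0xBBBBBBBB, 0xCCCCCCCC, 0xDDDDDDDD],
    lambda share, word, value: log.append((share, word, value)),
)
print(log)
```

Randomizing inside each share while keeping the shares separate avoids the back-to-back share0[i], share1[i] transition and makes per-slot averaging across repeated transfers harder.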

What remains is to check whether TL-UL adapters are behaving as intended. Two unexpected observations:

Why do we see data_prev -> data_next on the keymgr TL-UL output?
What is the value 0x20 that leaks to TL-UL data port from Ibex side?

I will look at these small TL-UL inconsistencies again and create a spin-off issue for those.

vogelpi commented 1 year ago

Thanks @ballifatih for starting this discussion and preparing the ot-sca meeting. It's an interesting and relevant topic I believe. I fully agree with your summary above.

On a side note, inside the entropy complex data_prev -> data_next is preferred over 0 -> data -> 0 because there we don't have spurious write enable protection, and latching in any deterministic value downstream, e.g. through FI, would be very bad. But as you summarized in your comment above, for the TL-UL bus things are different.

andreaskurth commented 1 year ago

Triaged for tlul:

> What remains is to check whether TL-UL adapters are behaving as intended. Two unexpected observations:
>
> * Why do we see `data_prev -> data_next` on the `keymgr` TL-UL output?
> * What is the value `0x20` that leaks to TL-UL `data` port from Ibex side?
>
> I will look at these small TL-UL inconsistencies again and create a spin-off issue for those.

@ballifatih: Could you please link the issue here? Do your findings there agree with the following:

IIUC the discussion above, we'll resolve this issue with SW guidelines post M2.5 but don't need to take action for M2.5. If so, I'd tag this https://github.com/lowRISC/opentitan/labels/Type%3AIcebox. @vogelpi: Do you agree?

ballifatih commented 1 year ago

Sorry @andreaskurth, I couldn't get back to this issue to spin off the relevant discussion. Here it is #17330, so that we can isolate the TL-UL discussion from the SCA/security discussion.

Feel free to close this issue @andreaskurth and use the new one.

andreaskurth commented 1 year ago

Thanks @ballifatih (and no worries :slightly_smiling_face: )!

From your summary above, I think

> And on the SW guideline side:
>
> * Avoid reading/writing secrets in 8-bit or 16-bit chunks.
> * Reading/writing shares in alternating manner is probably bad. Process one share fully and then move to another.
> * As @johannheyszl suggested randomizing the loading order of key words might be an additional counter-measure that we can implement on SW side, if needed later.
> * As @vogelpi and @bilgiday pointed out, feeding some random values from an LFSR post-transaction is also an idea we can keep on the side for now.

is still open and tracked by this issue. So I would keep this issue open to track the completion of the SW guidelines. I'm changing the labels accordingly and will tag it https://github.com/lowRISC/opentitan/labels/Type%3AIcebox because non-ROM SW can be done post M2.5. @alphan: I think ROM code already adheres to those SW guidelines, right?

Let's continue the TL-UL hardware discussion in #17330.

johannheyszl commented 9 months ago

@jadephilipoom this is an issue with items for SW security guidelines (which I think are already covered). Let's close if redundant. thanks

lowRISC / opentitan

[tlul] Side-channel Hamming weight leakage of `data` on TL-UL #16767
