SRAM PUF: Not using cryptographic hash function to collect entropy

maribu commented 4 years ago

Description

I noticed some security-related issues with the SRAM PUF. But as the feature is explicitly marked as experimental and not ready for daily use, I think a normal bug report should be sufficient.

Recapitulation

Upon cold boot, most SRAM cells will reliably enter their preferred state
- With some post-processing (e.g. fuzzy extractors), this allows a unique fingerprint to be generated reliably and reproducibly for each SRAM
A few SRAM cells have no (strongly) preferred state, and their value is indeed random
- This fact is used to generate a random seed using RIOT's SRAM PUF feature
- To collect the entropy that is spread over the whole SRAM in an unknown distribution, a non-cryptographic hash function is used

The Issue

1. For non-cryptographic hash functions, the hash value leaks information about the pre-image
2. For non-cryptographic hash functions, knowledge of parts of the pre-image leaks information about the corresponding hash value
3. For non-cryptographic hash functions, related pre-images produce related hash values
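
The third property can be illustrated with a minimal sketch. It assumes a DEK-style rotate-and-XOR hash as a stand-in for a lightweight non-cryptographic hash; the initial value, the helper names and the placeholder readout are assumptions for illustration, not RIOT's code. Flipping one bit in the last input byte flips exactly that bit in the 32-bit output, whereas BLAKE2s produces an unrelated digest.

```python
import hashlib


def dek_hash(data: bytes, init: int = 7919) -> int:
    """DEK-style 32-bit rotate-and-XOR hash (init value is an assumption)."""
    h = init
    for byte in data:
        # Rotate left by 5 bits within 32 bits, then XOR in the next byte.
        h = (((h << 5) & 0xFFFFFFFF) ^ (h >> 27)) ^ byte
    return h


def flip_last_bit(data: bytes) -> bytes:
    """Return a copy of data with the lowest bit of the last byte flipped."""
    return data[:-1] + bytes([data[-1] ^ 0x01])


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equally long byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


sram = bytes(range(64))        # placeholder for an SRAM readout
related = flip_last_bit(sram)  # a closely related pre-image

dek_diff = dek_hash(sram) ^ dek_hash(related)
blake_diff = hamming(hashlib.blake2s(sram).digest(),
                     hashlib.blake2s(related).digest())

print(f"DEK output difference:  {dek_diff:#010x}")
print(f"BLAKE2s differing bits: {blake_diff} of 256")
```

The last round of the DEK-style hash only XORs the final byte into the rotated state, so a difference in that byte passes straight through to the output.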

If SRAM PUF were to lose the "experimental" flag, this would put users of this feature at risk of:

Being tracked and identified:
- as the random seed leaks information about the device used, due to 1.
- as the pre-images of a single device are largely identical (with only a few bits actually being random), seeds generated from the same SRAM are also related to each other, due to 3.
Reduced entropy of the generated seed:
- for attackers who know characteristics of the SRAM of a device using the seed (or parts of the pre-image of the hash; either could be collected e.g. via 1.), this knowledge would leak information about the generated seed, due to 2.
- due to 3., seeds generated from the same SRAM will be related, so collecting seeds can allow an adversary to identify the relation between seeds and gain information about future seeds generated from the same SRAM

Further Suggested Improvements

Apply a cryptographic hash function H to the uninitialized SRAM and keep the resulting hash value secret, as k
Add a counter of soft reboots, as n
Use H(k ⊕ n) as the seed provided to the user
- A new, different seed will be provided on each boot
- As reconstructing the pre-image from the hash value is infeasible for a cryptographic hash function H, an adversary without access to k cannot recover k from the seed (hash value), even when n is known
- Due to the avalanche effect of H, it will be practically impossible for an adversary to detect whether a seed was generated after a cold boot or a warm boot
- This will solve the issue in https://github.com/RIOT-OS/RIOT/pull/12166
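
The suggested scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not RIOT's implementation: BLAKE2s stands in for H, the function names are hypothetical, and the readout is a placeholder for the uninitialized SRAM contents.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Cryptographic hash H; BLAKE2s is an illustrative choice."""
    return hashlib.blake2s(data).digest()


def derive_secret(sram_readout: bytes) -> bytes:
    """k = H(uninitialized SRAM); computed after a cold boot and kept secret."""
    return H(sram_readout)


def derive_seed(k: bytes, n: int) -> bytes:
    """seed = H(k XOR n), where n is the soft-reboot counter."""
    n_bytes = n.to_bytes(len(k), "big")  # pad the counter to the length of k
    mixed = bytes(a ^ b for a, b in zip(k, n_bytes))
    return H(mixed)


sram_readout = bytes(range(256)) * 4  # placeholder, not a real SRAM pattern
k = derive_secret(sram_readout)

# One seed per boot: n = 0 after the cold boot, then n = 1, 2 after soft reboots.
seeds = [derive_seed(k, n) for n in range(3)]
for n, seed in enumerate(seeds):
    print(n, seed.hex())
```

Only the seeds leave the function; k would have to live in memory that survives a soft reboot but is not exposed to the user of the seed.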

Update: Improved wording to make sentence understandable

PeterKietzmann commented 4 years ago

The SRAM PUF module in its current state is not (and was not supposed to be) cryptographically secure, that is true. The DEK hash function was chosen as a lightweight entropy accumulator. Not only the use of a non-cryptographic hash function but also the default seed length within auto_init prevent from guessing outputs in computational time. Finally, most PRNGs in RIOT are not cryptographically secure.
I implemented another seeder back then, based on the SHA-1 hash, but I never published it because there is more work to be done at the entropy and PRNG seeding level, which I plan to tackle soon.
The proposed solution for soft-reset detection seems very reasonable.
I don't understand how the proposal solves #12166.

maribu commented 4 years ago

I don't understand how the proposal solves #12166.

As far as I understood, https://github.com/RIOT-OS/RIOT/pull/12166 was not merged because PUF SRAM will not generate a new seed for each reboot.

The SRAM PUF module in its current state is not (and was not supposed to be) cryptographically secure

To me, this sentence would be something worth adding to the doc.

I implemented another seeder back then, based on the SHA-1 hash, but I never published it because there is more work to be done at the entropy and PRNG seeding level, which I plan to tackle soon.

Cool! Looking forward to it. (Maybe BLAKE2s would be better than SHA-1, it should be comparable performance-wise but is superior security-wise.)

PeterKietzmann commented 4 years ago

As far as I understood, #12166 was not merged because PUF SRAM will not generate a new seed for each reboot.

I didn't follow the whole conversation and assumed it was only about build dependencies. (i) A cold reboot is always needed in order to generate entropy, (ii) it is correct that in the current state, the old seed is simply reused after soft reset detection. I agree that this is not very smart and a counter based approach makes sense.

To me, this sentence would be something worth adding to the doc.

I wouldn't mind it but the experimental state as well as the compile-time warning seem to work too. An indication what is "secure" and what not is missing for several modules and I hope to clarify this soon.

BLAKE2s would be better than SHA-1

Well, I took what was available...

fjmolinas commented 4 years ago

As far as I understood, #12166 was not merged because PUF SRAM will not generate a new seed for each reboot.

I didn't follow the whole conversation and assumed it was only about build dependencies. (i) A cold reboot is always needed in order to generate entropy, (ii) it is correct that in the current state, the old seed is simply reused after soft reset detection. I agree that this is not very smart and a counter based approach makes sense.

The first statement is indeed why I did not push #12166 forward for now; I was waiting for time to work on a solution. The dependency could be worked out (I had proposed something), provided the first issue can be worked out, and here @maribu's proposed solution makes sense to me as well. I had previously looked into using low-power modes, since the LPM modes on samd0 have RAM retention (I was looking at that CPU).

maribu commented 4 years ago

I wouldn't mind it but the experimental state as well as the compile-time warning seem to work too. An indication what is "secure" and what not is missing for several modules and I hope to clarify this soon.

We also need to keep in mind that using the current SRAM PUF implementation for non-cryptographic use cases also has surprising effects (at least to me). E.g. let's assume it seeds a non-cryptographic PRNG that is used in some network protocol, e.g. for backoffs on collisions. That way the seed could likely be reconstructed by precisely measuring the timings of network communication. And as the seed is related to the pre-image, which in turn contains the SRAM PUF fingerprint, this could potentially leak information usable to identify a device. (Likely no reliable identification solely based on this relation is possible, but in combination with other things it might very well work.)
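
A minimal sketch of why observing the output of a non-cryptographic PRNG can reveal its seed. It assumes a plain 32-bit linear congruential generator with the Numerical Recipes constants and an observer who sees one full output; this is not one of RIOT's PRNGs, and a real backoff timer would expose only part of each output, so several observations would be needed.

```python
M = 2**32
A, C = 1664525, 1013904223  # Numerical Recipes LCG constants (illustrative)


def lcg_next(state: int) -> int:
    """One step of the LCG; the output is the full new state."""
    return (A * state + C) % M


def recover_seed(first_output: int) -> int:
    """Invert one LCG step: seed = (output - C) * A^-1 mod 2^32."""
    a_inv = pow(A, -1, M)  # modular inverse (Python 3.8+); exists because A is odd
    return ((first_output - C) * a_inv) % M


secret_seed = 0xC0FFEE42          # hypothetical seed handed to the PRNG
observed = lcg_next(secret_seed)  # what the observer is assumed to see
recovered = recover_seed(observed)

print(f"recovered seed: {recovered:#010x}")
```

Once the seed is known and it is related to the SRAM pre-image, each recovered seed becomes a data point about the device.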

This is an effect that is - at least to me - not that obvious. Based on the "experimental" warning - and without giving this some thought - I would just assume that in the worst case for that scenario the number of collisions would be higher because the seeds are not uniformly distributed - but potentially leaking the identity is surprising to me.

PeterKietzmann commented 4 years ago

this could potentially leak information usable to identify a device

Yet, we do not use the fingerprint for generating IDs and furthermore, one should not use the same memory blocks for seed generation and ID building.

I would just assume that in the worst case for that scenario the number of collisions would be higher because the seeds are not uniformly distributed

I don't quite understand. Do you mean the case for soft reset? I think we all agree that the current implementation leaves room for improvement. Otherwise, seeds generated after hard reset actually follow a uniform distribution.

As indicated earlier, there are some aspects of the seeding mechanism that should be changed, and I'm planning to propose a solution, which takes a bit more time. The reset behavior stands somewhat apart and could be "fixed" independently. However, changing to a crypto-secure approach takes a bit more effort, and I would like to separate these two aspects.

maribu commented 4 years ago

Yet, we do not use the fingerprint for generating IDs and furthermore, one should not use the same memory blocks for seed generation and ID building.

This wouldn't solve the problem I was trying to point out: PUF SRAM can be used to generate fingerprints that can uniquely identify a single device. This can be used intentionally (e.g. a CPU ID fallback implementation could be built on this). But this can also be used against the users of a device when they want to keep their device anonymous. And if only a part of the SRAM is used to seed a non-cryptographic PRNG, this part needs to be large enough to contain some entropy to make sense. But due to the low amount of entropy SRAM has (as it can be used to create unique fingerprints), a portion containing enough entropy likely contains enough device-specific characteristics to at least help de-anonymize a device.

I would just assume that in the worst case for that scenario the number of collisions would be higher because the seeds are not uniformly distributed

I don't quite understand. Do you mean the case for soft reset?

I was referring to the doc stating it is experimental. When I read that a non-cryptographic PRNG seeder is experimental, I would assume that when using it for something that is not viewed as security-related, no security-related trap holes appear. (One could argue that a higher collision rate in the example I gave eases denial of service attacks, so that it was a poor example. But every use of a non-cryptographic PRNG seeded by a non-cryptographic hash of the uninitialised SRAM risks helping to deanonymize the device, if the seed can be reconstructed by somehow observing the PRNG output.)

So my point is: There is a non-obvious risk that is not documented in a clear way. Adding a short warning to the doc would be a good thing to me.

PeterKietzmann commented 4 years ago

a portion containing enough entropy likely contains enough device specific characteristics to at least help de-anonymize a device.

That's why the memory blocks used for entropy gathering should not be used for ID generation. The memory pattern is unpredictable and thus, even if some non-crypto PRNG reveals information about the pre-image, an anonymous ID created from another memory block is still unpredictable.

When I read that a non-cryptographic PRNG seeder is experimental, I would assume that when using it for something that is not viewed as security-related, no security-related trap holes appear

I think we all agree that no feature (security-related or otherwise) should promote security-related trap holes. However, the use case you're describing is still not 100% clear to me. Is the trap hole the fact that a non-cryptographic seed generated from "any" address might reveal information about a single device, so that, if a future SRAM ID generator runs on the same device, the "secret" might not be as secret?

Once again, I would always avoid using the same memory pattern for seed and ID generation.

So my point is: There is a non-obvious risk that is not documented in a clear way. Adding a short warning to the doc would be a good thing to me.

I'm not arguing against clear documentation and I can add a note but generally I think there is more to do in that field.

maribu commented 4 years ago

a portion containing enough entropy likely contains enough device specific characteristics to at least help de-anonymize a device.

That's why the memory blocks used for entropy gathering should not be used for ID generation. The memory pattern is unpredictable and thus, even if some non-crypto PRNG reveals information about the pre-image, an anonymous ID created from another memory block is still unpredictable.

This will not solve the issue. Let's say the whole SRAM contains fingerprint a. Now we divide the memory into two parts: the first one contains fingerprint b and the second contains fingerprint c. A device might use a fuzzy extractor to use fingerprint b as ID, and the second chunk to seed a non-cryptographic PRNG - thus leaking details about fingerprint c.

Let's say that device is used for a sensitive use case and a non-sensitive use case. During the sensitive interaction an adversary might, over time, be able to reconstruct enough PRNG seeds to reconstruct fingerprint c. During the non-sensitive interaction the device might identify itself. If during that interaction the adversary is again able to extract fingerprint c, the adversary gains the knowledge that both interactions were done by the same device.

In use cases where this actually matters, dividing the memory doesn't help. But using a (strong) cryptographic hash function would mitigate this. And the entropy of the hash value would be much higher compared to using a non-cryptographic hash function (even close to the number of bits of the hash value, given a strong cryptographic hash function and enough entropy in the pre-image).
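
The linking step in this scenario can be sketched as a Hamming-distance comparison between two noisy reconstructions of fingerprint c. Everything here is an assumption for illustration: the fingerprints are placeholder bytes (BLAKE2s is used only to generate them), the flipped bit positions model unstable cells, and the 10% threshold is an arbitrary choice.

```python
import hashlib
from typing import Iterable


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equally long byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def noisy_readout(fingerprint: bytes, flipped_bits: Iterable[int]) -> bytes:
    """Copy of the fingerprint with the given bit positions flipped."""
    out = bytearray(fingerprint)
    for bit in flipped_bits:
        out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


def same_device(obs_a: bytes, obs_b: bytes, threshold: float = 0.10) -> bool:
    """Link two observations if they differ in at most `threshold` of their bits."""
    return hamming(obs_a, obs_b) <= threshold * 8 * len(obs_a)


# Placeholder fingerprints for chunk c of two different devices.
fp_c = hashlib.blake2s(b"device-1 chunk c").digest()
fp_other = hashlib.blake2s(b"device-2 chunk c").digest()

sensitive = noisy_readout(fp_c, [3, 40, 97])    # reconstructed during the sensitive use
non_sensitive = noisy_readout(fp_c, [12, 200])  # reconstructed during the other use
stranger = noisy_readout(fp_other, [5, 77])     # some unrelated device

print(same_device(sensitive, non_sensitive))
print(same_device(sensitive, stranger))
```

The adversary never needs fingerprint b or the device's ID for this comparison; two sufficiently similar reconstructions of c are enough to link the interactions.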

PeterKietzmann commented 4 years ago

During the sensitive interaction an adversary might, over time, be able to reconstruct enough PRNG seeds to reconstruct fingerprint b

I don't see how this works out. I do agree that (partially) learning c from non-cryptographic seeds is possible (we had this discussion above) but not learning b.

maribu commented 4 years ago

Sorry, I meant c.

maribu commented 4 years ago

My point is: Even without an anonymous identifier being directly related to some canonical identifier, just being able to track a device has security implications that do matter in some contexts.

Let me give a completely unrelated example: Let's say a protestor for human rights is masking herself/himself with a mask, but an extremely unique tattoo on her/his hand is still visible. Let's also say the repressive regime she/he protested against obtains a picture of her/him during the protest. If the regime recognizes the same tattoo again they will likely just jail her/him; even if they never learn the real identity. And chances are good that during the second meeting the person has an ID card in her/his wallet, so that the real identity is leaked as well.

And back to the context of this: When used as intended, the current SRAM PUF can (with some effort and long-term observation) leak something like an "Ad-ID". It is not directly connected to the real identity, but it can still easily be used against you and, combined with other information, might well leak the real identity too.

PeterKietzmann commented 4 years ago

I agree with all of what you explain, but I do not see the relation to an SRAM-based PUF for ID building and why we're talking about fuzzy extractors and the like. In my opinion, the problem that you describe applies to any kind of identifier and is not specific to the SRAM fingerprint. It is only the weak SRAM seeder that might leak information about a device's behavior.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

RIOT-OS / RIOT