Closed adamelhakham closed 5 years ago
@avolinski @sbutcher-arm @Patater The above issue that @adamelhakham raised is blocking the Pelion Client entropy injection feature. Could someone from the Device team look into this?
Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-803
@avolinski @sbutcher-arm @Patater
Any review? Should mbedtls_hardware_poll be protected in the application? Or is this a K64F issue?
Mbed OS HAL is not reentrant, IIRC. This means there is no expectation that trng_get_bytes() would be safe from multiple threads.
The Mbed TLS library provides MBEDTLS_THREADING_C, which can do some automatic synchronization, but it's limited. In general, performing more than one operation at a time on a single Mbed TLS context is not possible and those operations should be serialized by the application. For the parts of the library that are shared (like the entropy subsystem), indeed implementing MBEDTLS_THREADING_C for Mbed OS would help. This isn't implemented in Mbed OS today, so until then applications should serialize access to Mbed TLS.
CC @RonEld
mbedtls_hardware_poll is not an API that mbed TLS provides, but a system/porting API that it uses.
As such a couple of components have "piggybacked" onto this, by reasoning that "if this API is being provided by the system for mbed TLS, then we can use it too".
There's no realistic way we can ask applications to serialize access to Mbed TLS overall - there are too many independent pieces of code. So the shared components like this have to be protected somehow.
Given that there are "piggy-back" users aside from Mbed TLS itself already, I think the simplest thing to do is probably to ensure that mbedtls_hardware_poll uses a PlatformMutex to protect its access to trng_get_bytes. (And if there are any other alternative versions not using DEVICE_TRNG, they should also be thread-safe).
Still, MBEDTLS_THREADING_C may be worthwhile anyway - maybe there's other stuff that needs protection. But it wouldn't currently be an answer for the general "get entropy" issue, as there is no mbed TLS API to obtain random data from its entropy generators, so code does have to go around the library straight to mbedtls_hardware_poll.
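The PlatformMutex suggestion can be tried out on a host machine with a rough sketch like the one below. It is only an illustration under stated assumptions: std::mutex stands in for PlatformMutex, fake_trng_get_bytes is a made-up stand-in for the HAL call (it returns a fixed pattern, not random data), and hardware_poll_locked is a hypothetical name that merely borrows the mbedtls_hardware_poll signature. It is not the code from any Mbed OS PR.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Counters used only to observe how many callers are inside the fake HAL at once.
static std::atomic<int> g_inside{0};
static std::atomic<int> g_max_inside{0};

// Hypothetical stand-in for the TRNG HAL call. NOT the real trng_get_bytes:
// it fills the buffer with a fixed pattern and records concurrent entry.
static int fake_trng_get_bytes(uint8_t *output, size_t length, size_t *output_length)
{
    int now = ++g_inside;
    int prev = g_max_inside.load();
    while (now > prev && !g_max_inside.compare_exchange_weak(prev, now)) {
        // retry until the recorded maximum is at least 'now'
    }
    for (size_t i = 0; i < length; ++i) {
        output[i] = static_cast<uint8_t>(i);
    }
    *output_length = length;
    --g_inside;
    return 0;
}

// Assumption: std::mutex plays the role PlatformMutex would play on Mbed OS.
static std::mutex g_trng_mutex;

// Hypothetical wrapper with the same shape as mbedtls_hardware_poll.
// Every caller, library or application, goes through the same lock.
int hardware_poll_locked(void *data, unsigned char *output, size_t len, size_t *olen)
{
    (void)data;
    std::lock_guard<std::mutex> guard(g_trng_mutex);
    return fake_trng_get_bytes(output, len, olen);
}

// Hammer the wrapper from several threads and report the highest number of
// callers ever seen inside the fake HAL at the same time.
int max_concurrent_callers(int threads, int calls_per_thread)
{
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([calls_per_thread] {
            unsigned char buf[16];
            size_t got = 0;
            for (int i = 0; i < calls_per_thread; ++i) {
                hardware_poll_locked(nullptr, buf, sizeof buf, &got);
            }
        });
    }
    for (auto &th : pool) {
        th.join();
    }
    return g_max_inside.load();
}
```

Because the lock lives inside the poll function rather than in any one caller, "piggy-back" users and Mbed TLS are serialized against each other without having to know about one another.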
Given that there are "piggy-back" users aside from Mbed TLS itself already, I think the simplest thing to do is probably to ensure that mbedtls_hardware_poll uses a PlatformMutex to protect its access to trng_get_bytes. (And if there are any other alternative versions not using DEVICE_TRNG, they should also be thread-safe).
I agree. The mbedtls_hardware_poll API can be used by mbedtls (which is used by mbed-os features such as kvstore), but it can also be called directly by the application if it needs access to the TRNG. This means that there is no way for the application to serialize its own invocations of mbedtls_hardware_poll with OS invocations of mbedtls that invoke that same function. Unless, maybe, mbedtls_hardware_poll is compiled as a weak symbol or something similar, and the application can provide its own implementation. But this does not seem reasonable to me.
I'll prepare a PR with this proposal.
One significant user of mbedtls_hardware_poll occurs in the Nanostack-derived middleware - randLIB.h is a high-quality pseudo-RNG for general non-crypto use, and it can accept randomisation being pushed in from various sources (eg radio noise, or MAC address).
And by default it will use mbedtls_hardware_poll at startup to get going. (This use predates the TRNG HAL - but we knew that if a system ever had some hardware TRNG, it must presumably be plumbed in to Mbed TLS there.)
@adamelhakham Can you test PR 9532? It should fix this issue.
One extra point on this - I note that the "entropy context" in Mbed TLS is not itself a shared resource, so each TLSSocketWrapper or other TLS session has its own entropy context.
That in turn means that MBEDTLS_THREADING_C would actually provide no protection here - each entropy context is independently calling mbedtls_hardware_poll with no overall system-level lock, just a per-context mutex.
As such, it does appear to be an undocumented porting requirement that mbedtls_hardware_poll be thread-safe, regardless of the setting of MBEDTLS_THREADING_C.
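The per-context versus system-level distinction can be illustrated with a small host-side model. This is a sketch under assumptions, not Mbed TLS code: EntropyContext is a made-up struct that only carries a mutex, and both function names are hypothetical. The first function shows that a second context can take its own lock while the first context's lock is held; the second shows that a single shared lock does exclude a second caller.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical model of an entropy context: all that matters here is that
// each context owns its OWN mutex.
struct EntropyContext {
    std::mutex mutex;
};

// While "session A" holds its context lock (as if it were in the middle of a
// hardware poll), can "session B" still take its own context lock?
bool other_context_can_enter_while_first_is_polling()
{
    EntropyContext a;
    EntropyContext b;
    std::lock_guard<std::mutex> hold_a(a.mutex);
    bool entered = false;
    std::thread session_b([&] {
        std::lock_guard<std::mutex> hold_b(b.mutex);  // independent lock, never blocks on a
        entered = true;
    });
    session_b.join();
    return entered;
}

// With one system-level lock around the hardware access, a second caller is
// refused for as long as the first caller holds the lock.
bool system_lock_excludes_second_caller()
{
    std::mutex system_lock;
    std::lock_guard<std::mutex> hold(system_lock);
    bool acquired = true;
    std::thread second_caller([&] {
        acquired = system_lock.try_lock();
        if (acquired) {
            system_lock.unlock();
        }
    });
    second_caller.join();
    return !acquired;
}
```

In this model both sessions end up "inside" at once under per-context locking, which is the situation where two callers could reach the same hardware poll function concurrently.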
I feel like I'm coming late to this, but as touched on above, whilst Mbed TLS supports multiple threads in its standalone form through MBEDTLS_THREADING_C, that is currently not supported in Mbed OS.
mbedtls_hardware_poll is not an API that mbed TLS provides, but a system/porting API that it uses.
As such a couple of components have "piggybacked" onto this, by reasoning that "if this API is being provided by the system for mbed TLS, then we can use it too".
That's correct, and it's really part of the porting layer. As such, it's on its own and has to provide its own resource management. Defining MBEDTLS_THREADING_C won't help anyone.
Given that there are "piggy-back" users aside from Mbed TLS itself already, I think the simplest thing to do is probably to ensure that mbedtls_hardware_poll uses a PlatformMutex to protect its access to trng_get_bytes. (And if there are any other alternative versions not using DEVICE_TRNG, they should also be thread-safe).
I think that's the best solution.
Still, MBEDTLS_THREADING_C may be worthwhile anyway - maybe there's other stuff that needs protection. But it wouldn't currently be an answer for the general "get entropy" issue, as there is no mbed TLS API to obtain random data from its entropy generators, so code does have to go around the library straight to mbedtls_hardware_poll.
That involves making MBEDTLS_THREADING_C supported in Mbed OS. We currently don't have time for that, and I don't think it's needed for this release or this bug (although it would be good to add it in the near future).
@0xc0170 @kjbracey-arm I tested it and our CI passes. Thanks!
@0xc0170 @kjbracey-arm Will the fix be added to release 5.11.3?
@adamelhakham 5.11.3 is about to be released. https://github.com/ARMmbed/mbed-os/pull/9532 will come in during 5.11.4
Description
I am part of the group that develops the Pelion Device Management Client. We use mbed-os with KVSTORE, and in our testing we do a lot of DRBG seeding, which calls the mbedtls mbedtls_entropy_func function, which in turn calls the mbedtls_hardware_poll function to generate entropy from the K64F hardware. In addition, the Pelion Device Management Client runs a thread that every few seconds tries to generate random data from the mbedtls_hardware_poll API. The problem is that the mbedtls_hardware_poll API is not thread safe, and we occasionally get the following hard fault:
After disassembling the code, the program counter points to the following instruction (the disasse:
From looking at the following file:
targets/TARGET_Freescale/TARGET_MCUXpresso_MCUS/TARGET_MCU_K64F/trng_api.c
I see that the trng_get_byte() API polls the TRNG status register and, when it is set, reads from the output register. Now, it is possible for two threads to both see that the status register has been set. One thread then reads the output register, and the other thread tries to read the output register after the status register has gone back to 0 (because of the first thread's read), which, according to the K64F reference manual, raises a hardware exception. I am not sure that this is in fact what is happening, but it is possible.
Shouldn't the trng_get_byte function protect against that situation?
Reproducing
Our issue can easily be reproduced by creating two threads that simply call mbedtls_hardware_poll(). The dump I added was from the ARMC compiler, but the same happens with GCC ARM.
Issue request type