Open gilles-peskine-arm opened 2 years ago
We should also prioritise using platform-specific zeroise functions, as done in #1760, as they come with more of a guarantee that the compiler won't elide them for optimisation (RtlSecureZeroMemory()
on Windows, explicit_bzero()
on BSD, and memset_s()
where available)
We haven't prioritized platform-specific functions because many platforms either don't have them, or ship an mbedtls config with MBEDTLS_PLATFORM_ZEROIZE_ALT
set appropriately. Annoyingly, it's difficult to detect the few platforms where memset_s
is available, and I don't think we have any on our CI.
I think the current strategy of hiding the fact that we're calling memset
works pretty well. It's far from ideal though, in particular because since the compiler doesn't know what's going on, it has to write the previous content back to memory (bad both for security and for performance), which can have a non-negligible effect when we want to zeroize a local variable that would otherwise be kept in a register.
Honestly the only way to implement zeroization properly is a compiler intrinsic that (1) writes to the data and (2) adds a fake use of the data (which is removed after optimization, but before register allocation).
the only way to implement zeroization properly is a compiler intrinsic
Indeed, unfortunately there's no way to do this in standard C. I have however been thinking of changing our current strategy of memset-in-disguise to an ordinary memset followed by a no-op-in-disguise. But changing this takes time to study the effect with various compilers/optimization settings/platforms/contexts, and of course it's better if we have effective testing first.
We test that
mbedtls_platform_zeroize
works by compiling a simple program that calls this function followed byexit
, setting a breakpoint between the two, and checking that the memory has been wiped at that point. This is done by theall.sh
componentcomponent_test_zeroize
, in several builds, with GCC and Clang with various optimization flags.This testing is completely ineffective because
mbedtls_platform_zeroize
is a function defined in a separate module. A compiler can't optimize it away when doing separate compilation: when the function is compiled, it must preserve its general semantics since it doesn't know how the function will be used; and where the function is used, the compiler must generate a call since it doesn't know what the function does.There is a risk that the zeroization is optimized away only when not compiling the source code modules separately. But we don't test that.
We should test with link-time optimization (
-flto
). There's no point in testing without.(Also there are other risks, in particular we're only validating that it works in one particular context, and of course we're only testing with particular compilers. But those are limitations inherent to the very principle of testing. What's wrong with what we're doing now is that it's guaranteed not to actually test anything.)
I confirm that if I build with
clang -O3 -flto
, the test passes with the current code, but fails if I remove thevolatile
annotation onmemset_func
inplatform_util.c
.