espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

[Feature Request] No way to tell if esp_random_fill() is generating true random numbers (IDFGH-7117) #8725

Open chipweinberger opened 2 years ago

chipweinberger commented 2 years ago

For the ESP32-S3, the technical docs say a lot about how random numbers work in the MCU:

21.4 Programming Procedure

When using the random number generator, make sure at least either the SAR ADC, high-speed ADC, or RTC20M_CLK is enabled. Otherwise, pseudo-random numbers will be returned.

• SAR ADC can be enabled by using the DIG ADC controller. For details, please refer to Chapter 5 On-Chip Sensors and Analog Signal Processing [to be added later].

• High-speed ADC is enabled automatically when the Wi-Fi or Bluetooth module is enabled.

• RTC20M_CLK is enabled by setting the RTC_CNTL_DIG_CLK20M_EN bit in the RTC_CNTL_CLK_CONF_REG register.

However, the API for esp_fill_random() does not reflect this at all. How can we know which mode esp_fill_random() is in?

I did a simple test:

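#include <stdint.h>
#include "esp_log.h"
#include "esp_random.h"

// called on PRO-cpu
void app_main(void) {
    uint8_t bufr[16];
    esp_fill_random(bufr, sizeof(bufr));            // fill the buffer from the hardware RNG
    ESP_LOG_BUFFER_HEX("rrrr", bufr, sizeof(bufr)); // dump it as hex
}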

And I was surprised to find that the results were different on every boot:

I (1554) rrrr: 6c c5 95 3f b1 36 77 2f a5 99 fb 2d bc 19 02 f2 // 1st boot
I (1554) rrrr: a4 5f 71 c4 b3 44 48 c9 6a ea 4d 94 91 0b 3b 88 // 2nd boot
I (1554) rrrr: 26 89 1b 81 3a df 5b 3a 6b 5d 6a e9 d4 2b 54 79 // 3rd boot
I (1554) rrrr: a4 bc 7b c1 c1 3b d1 e3 02 6a ec 30 16 84 3a b3 // 4th boot

  1. Are these true random numbers? Even without Wi-Fi enabled?
  2. If I call esp_random() too fast, what happens? Ideally it would block and still give me true random numbers.
  3. Which esp_wifi call enables the high-speed ADC for the RNG: esp_wifi_start() or esp_wifi_init()?

The docs and API should clarify this ambiguity.

Suggestions, if possible:

  1. Implement esp_random_true(): returns an error if no source of true random numbers is available, and always blocks if called too fast.
  2. Implement esp_random_source(): returns the current source of random numbers as an enum.

chipweinberger commented 2 years ago

The code for esp_random() has some more useful comments, but it doesn't seem to distinguish between the SAR ADC, the high-speed ADC, and RTC20M_CLK. I'm not sure how it works.

uint32_t IRAM_ATTR esp_random(void)
{
    /* The PRNG which implements WDEV_RANDOM register gets 2 bits
     * of extra entropy from a hardware randomness source every APB clock cycle
     * (provided WiFi or BT are enabled). To make sure entropy is not drained
     * faster than it is added, this function needs to wait for at least 16 APB
     * clock cycles after reading previous word. This implementation may actually
     * wait a bit longer due to extra time spent in arithmetic and branch statements.
     *
     * As a (probably unnecessary) precaution to avoid returning the
     * RNG state as-is, the result is XORed with additional
     * WDEV_RND_REG reads while waiting.
     */

    /* This code does not run in a critical section, so a CPU frequency switch may
     * happen while this code runs (this will not happen in the current
     * implementation, but is possible in the future). However if that happens,
     * the number of cycles spent on frequency switching will certainly be more
     * than the number of cycles we need to wait here.
     */
    uint32_t cpu_to_apb_freq_ratio = esp_clk_cpu_freq() / esp_clk_apb_freq();

    static uint32_t last_ccount = 0;
    uint32_t ccount;
    uint32_t result = 0;
    do {
        ccount = cpu_hal_get_cycle_count();
        result ^= REG_READ(WDEV_RND_REG);
    } while (ccount - last_ccount < cpu_to_apb_freq_ratio * APB_CYCLE_WAIT_NUM);
    last_ccount = ccount;
    return result ^ REG_READ(WDEV_RND_REG);
}
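
The exit condition `ccount - last_ccount < ...` depends on unsigned 32-bit arithmetic, so it should stay correct even when the cycle counter wraps around between two calls. Here is a minimal host-side sketch of just that comparison. `cycles_elapsed()` and `wait_is_over()` are hypothetical helper names made up for illustration; they are not ESP-IDF functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: cycles elapsed between two reads of a free-running
 * 32-bit cycle counter. Unsigned subtraction is modulo 2^32, so the result
 * is still correct if the counter wrapped once between the two reads. */
static uint32_t cycles_elapsed(uint32_t now, uint32_t last)
{
    return now - last;
}

/* Hypothetical helper mirroring the do/while exit condition above:
 * true once at least cpu_to_apb_ratio * wait_apb_cycles CPU cycles passed. */
static bool wait_is_over(uint32_t now, uint32_t last,
                         uint32_t cpu_to_apb_ratio, uint32_t wait_apb_cycles)
{
    return cycles_elapsed(now, last) >= cpu_to_apb_ratio * wait_apb_cycles;
}
```

For example, assuming a CPU-to-APB ratio of 3 (240 MHz CPU, 80 MHz APB) and a wait count of 1778, the loop would keep XORing register reads until 5334 CPU cycles have passed since the previous call.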

0xjakob commented 2 years ago

Hi @chipweinberger !

Please first note that APB_CYCLE_WAIT_NUM is defined just above esp_random():

#if defined CONFIG_IDF_TARGET_ESP32S3
#define APB_CYCLE_WAIT_NUM (1778) /* If APB clock is 80 MHz, maximum sampling frequency is around 45 KHz*/
                                  /* 45 KHz reading frequency is the maximum we have tested so far on S3 */
#else
#define APB_CYCLE_WAIT_NUM (16)
#endif

Hence, the wait count is tuned for the S3 based on our current test results.
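
The relation between the wait count and the sampling frequency mentioned in the code comment is a plain division. The sketch below shows the arithmetic; `max_reads_per_second()` is a hypothetical name used only here, and the 80 MHz APB clock is the value assumed in the comment above.

```c
#include <stdint.h>

/* Upper bound on 32-bit RNG reads per second implied by the wait:
 * at most one word per wait_apb_cycles APB clock cycles.
 * Integer division, so the result is rounded down. */
static uint32_t max_reads_per_second(uint32_t apb_hz, uint32_t wait_apb_cycles)
{
    return apb_hz / wait_apb_cycles;
}
```

With an 80 MHz APB clock, a wait count of 1778 gives 44994 reads per second (about 45 kHz), while a wait count of 16 gives 5000000 reads per second.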

General Notes

The easiest way to get good entropy is to enable Wi-Fi first. If you don't want to do that, you can use the SAR ADC instead, but it is not safe to use the ADC for anything else at the same time, because the ADC has to be configured in a specific way. If another API that uses the ADC is called at the same time, it will likely reconfigure the ADC and create conflicts.

You can use the bootloader_random_enable() API, which (as the name suggests) comes from the bootloader. Again, this only works if you don't use the ADC for anything else during that time.

We have tested the RNG under several conditions, including with Wi-Fi enabled and with Wi-Fi disabled and only the SAR ADC configured via bootloader_random_enable(). In both cases, we were able to verify with high probability that the numbers are truly random. We used the dieharder test suite and made sure that any test with suspicious results passed when re-run with a bigger sample size. The 32-bit random numbers were read at a frequency between 45 and 50 kHz, hence the 45 kHz limit in the code above.

Regarding the Feature Request

I also think it is necessary to have a configuration check function which at least checks that the RNG and the entropy sources are correctly configured. We will consider this and probably implement it in one way or another.

Answers

  1. Are these true random numbers? Even without wifi enabled?

Depends, see the comments in the general notes.

  2. If I call esp_random() too fast, what happens? Ideally it will block and still give me true random numbers

Since it will wait internally, it can't be called too fast. I.e., it will always wait for a safe amount of time.

  3. which esp_wifi() call enables the high freq RNG? esp_wifi_start()? or esp_wifi_init()?

Both together (until further notice). We have tested by setting up a fake station before querying random numbers. The WiFi init code for testing looks roughly like this:

    wifi_init_config_t cfg = WIFI_INIT_CONFIG_DEFAULT();
    ESP_ERROR_CHECK(esp_wifi_init(&cfg));

    wifi_config_t wifi_config = {
        .sta = {
            .ssid = "idontexist",
            .password = "doesnotneedtobereal",
        },
    };
    ESP_ERROR_CHECK(esp_wifi_set_mode(WIFI_MODE_STA));
    ESP_ERROR_CHECK(esp_wifi_set_config(ESP_IF_WIFI_STA, &wifi_config));
    ESP_ERROR_CHECK(esp_wifi_start());

chipweinberger commented 2 years ago

Many thanks for the detailed response. I'm always impressed by Espressif engineers and their support.

The easiest way to yield good entropy is to enable WiFi before. If you don't want to do that, there is a way via the SAR ADC, but it is not safe to use the ADC itself at the same time

Yes, it would simplify my code to not enable Wi-Fi. I just need to create a few random numbers on first boot, not many.

You can use the bootloader_random_enable() API from the (obviously) bootloader. Again, if you don't use the ADC in another way during that time.

From the docs, it looks like bootloader_random_disable() is called before my app executes, and "Disabling RNG early entropy source..." appears in the bootloader logs. Is modifying the main bootloader_support component (not the bootloader component) the only way to leave it enabled? Is there a way to enable it after my app starts? I only need it briefly.

WiFi init code for testing looks roughly like this:

Thanks. This is a useful reference.

Are these true random numbers? Even without wifi enabled? Depends, see the comments in the general notes.

In my simple test code (see first comment), I do not enable wifi, and do not enable the SAR ADC using bootloader_random_enable(). So I do not understand why my simple example code got different numbers after each boot. Do you know why? Ahh, maybe because the "RNG early entropy source" was enabled temporarily during the bootloader? I just need ~200 bits of randomness to generate a UUID so maybe I can get away with directly using esp_random() right after boot?

0xjakob commented 2 years ago

Is there a way to enable it after my app starts

Just call bootloader_random_enable() again. It's available from user code, too, via the bootloader_support component.
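
A minimal sketch of that sequence from app code could look like the following. It assumes the bootloader_random.h and esp_random.h headers of a recent ESP-IDF version. `fill_random_with_sar_adc()` is a hypothetical helper name, and the `#else` branch contains host-side stand-ins that exist only so the sketch compiles off-target; they do not produce entropy.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "bootloader_random.h" /* bootloader_random_enable(), bootloader_random_disable() */
#include "esp_random.h"        /* esp_fill_random() */
#else
/* Host-side stand-ins (assumption, for off-target builds only). They record
 * call order and write a fixed pattern; they are NOT a source of entropy. */
static int stub_entropy_enabled = 0;
static int stub_filled_while_enabled = 0;
static void bootloader_random_enable(void)  { stub_entropy_enabled = 1; }
static void bootloader_random_disable(void) { stub_entropy_enabled = 0; }
static void esp_fill_random(void *buf, size_t len)
{
    uint8_t *p = buf;
    for (size_t i = 0; i < len; i++) {
        p[i] = (uint8_t)i;
    }
    stub_filled_while_enabled = stub_entropy_enabled;
}
#endif

/* Hypothetical helper: fill buf while the SAR ADC entropy source is active,
 * then switch the entropy source off again so the ADC can be reused. */
static void fill_random_with_sar_adc(void *buf, size_t len)
{
    bootloader_random_enable();
    esp_fill_random(buf, len);
    bootloader_random_disable();
}
```

Keeping enable and disable in one helper keeps the window short, which matters because no other ADC user should run during that time. For a one-off need such as a UUID on first boot, calling it once with a 16-byte buffer early in app_main() would probably be enough.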

I do not understand why my simple example code got different numbers after each boot.

The RNG and its entropy sources operate separately. The RNG itself starts running as soon as it is enabled, independently of the entropy sources, and its state changes every APB clock cycle (usually 80 MHz). These numbers are not truly random, but reading RNG_DATA_REG (WDEV_RND_REG in the code above) taps into this constantly evolving state, so varying the access time by just a few clock cycles already changes the result. The numbers look random at first sight but fail any thorough randomness test badly. In fact, we measured essentially this case: a simple histogram test failed easily (unlike with the entropy source enabled), and dieharder failed most of its tests. We ran the latter to confirm a failure on known-bad data. With the entropy sources correctly enabled, the tests passed.
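
To give an idea of what a simple histogram test can look like, here is a sketch of a Pearson chi-square statistic over byte values. This is an assumption about the kind of test meant, not the test we actually ran, and `byte_histogram_chi_square()` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Pearson chi-square statistic of a byte histogram against a uniform
 * distribution over the 256 possible byte values.
 * Returns 0.0 for an empty buffer to avoid dividing by zero. */
static double byte_histogram_chi_square(const uint8_t *data, size_t len)
{
    if (len == 0) {
        return 0.0;
    }

    size_t counts[256] = {0};
    for (size_t i = 0; i < len; i++) {
        counts[data[i]]++;
    }

    /* Each byte value is expected len / 256 times under uniformity. */
    const double expected = (double)len / 256.0;
    double chi2 = 0.0;
    for (size_t v = 0; v < 256; v++) {
        const double diff = (double)counts[v] - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}
```

For uniformly distributed bytes and a large sample, the statistic has 255 degrees of freedom, so it should land near 255 with a standard deviation of roughly 22.6. Values far outside that range hint at a biased source. Passing a check like this proves very little on its own, which is why a suite such as dieharder is still needed.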

This is the reason why I also believe a configuration check function is a really, really good idea.

chipweinberger commented 2 years ago

Just by varying the access time a few clock cycles, the result will change already.

I see.

we did measure basically this case and some simple histogram test failed

Yes, that's concerning!

Just call bootloader_random_enable() again

Simple enough! I'll do this! I see the docs actually mentioned this!

void bootloader_random_enable(void) "Can also be used from app code early during operation" - docs

But I thought that meant the random numbers (from, say, rand()) can be used even after the app code starts. I didn't realize it meant that the function can still be called outside of the bootloader. Perhaps it could be clarified to say "This function can be called from app code too. It does not need to be called from the bootloader."

This is the reason why I also believe a configuration check function is a really, really good idea.

yes please =)

Thanks for all your help!

chipweinberger commented 2 years ago

Bump. @0xjakob

"Won't do". I assume we should close this?

0xjakob commented 2 years ago

@chipweinberger The ticket is still open. I changed it internally and then changed it back; apparently that led to the confusion. Sorry about that.

newpavlov commented 9 months ago

We use esp_fill_random in the getrandom library. Its goal is to be a cross-platform interface for retrieving high-quality entropy (i.e. suitable for cryptographic purposes) from (operating) system sources. IIUC, depending on the platform configuration, esp_fill_random may provide low-quality randomness, and according to the code in this comment you do not use any whitening, correct? This makes esp_fill_random completely unsuitable for our purposes. At the very least, we would like to get an error code if the platform was not configured in a way that allows generation of good entropy. Also, ideally, esp_fill_random would be backed by a proper CSPRNG seeded from hardware sources (e.g. you could use k12).

Could you please clarify the exact guarantees provided by esp_fill_random? Maybe it's worth introducing a separate function (e.g. esp_crypto_random_fill) with stricter guarantees?

newpavlov commented 7 months ago

It would be nice to get some replies to my previous comment. As things stand, we may resort to removing ESP IDF support from the getrandom crate, since falling back to an insecure pseudorandom generator is not acceptable behavior for it.

0xjakob commented 7 months ago

@newpavlov Apologies for the delay answering!

The random number generator works well only if a proper entropy source is enabled, which can be the ADC from the RF subsystem (high-speed ADC) or the "normal" ADC (SAR ADC). The latter can be activated by calling bootloader_random_enable(), if the RF module (WiFi, Bluetooth) can't or shouldn't be enabled.

Note that esp_fill_random() can be used to seed a software CSPRNG, which is what mbedtls does, as far as I know. In that sense, esp_fill_random() is similar to the libc getentropy() function.

esp_fill_random() currently returns nothing, and I think we can't simply change that, as it would likely be a breaking change. Adding an additional function, as you suggested, would probably be better.

0xjakob commented 7 months ago

Regarding the guarantees of esp_fill_random(): it guarantees that the hardware random number generator is not read too often. It does not guarantee that any entropy source is properly enabled. The user needs to take care of this by enabling one of the ADCs (see above). This is also explained in the Random number generator documentation.
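
Until such a function exists, an application can do its own bookkeeping and refuse to hand out random bytes before it has enabled an entropy source. The sketch below is only an illustration of that idea: every `app_` name is hypothetical, nothing here queries the hardware, and the `#else` branch is a host-side stand-in so the code compiles off-target.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "esp_random.h" /* esp_fill_random() */
#else
#include <string.h>
/* Host-side stand-in (assumption, off-target only): fixed pattern, NOT random. */
static void esp_fill_random(void *buf, size_t len)
{
    memset(buf, 0xA5, len);
}
#endif

/* Hypothetical bookkeeping: which entropy source the app itself enabled. */
typedef enum {
    APP_ENTROPY_NONE = 0,
    APP_ENTROPY_SAR_ADC, /* app called bootloader_random_enable() */
    APP_ENTROPY_RF,      /* app started Wi-Fi or Bluetooth */
} app_entropy_source_t;

static app_entropy_source_t s_app_entropy_source = APP_ENTROPY_NONE;

/* To be called by the app right after it enables or disables a source. */
static void app_entropy_set_source(app_entropy_source_t source)
{
    s_app_entropy_source = source;
}

/* Returns 0 on success. Returns -1 and leaves buf untouched if the app has
 * not recorded an active entropy source. */
static int app_fill_random_checked(void *buf, size_t len)
{
    if (s_app_entropy_source == APP_ENTROPY_NONE) {
        return -1;
    }
    esp_fill_random(buf, len);
    return 0;
}
```

The check is only as reliable as the calls to `app_entropy_set_source()`, so it catches a forgotten setup step but cannot detect an entropy source that was reconfigured behind the application's back.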

newpavlov commented 7 months ago

Thank you for your reply!

Assuming that a peripheral entropy source is enabled, what about the quality of the entropy produced by esp_fill_random()? The reference manual PDF is not sufficiently clear about it. It mentions that the generated random numbers pass statistical tests under certain conditions, but it also recommends limiting reads from RNG_DATA_REG to 500 kHz. What exactly happens in the "Random Number Generator" block? Does it perform proper whitening of consumed entropy? In some cases getrandom is used directly to generate cryptographic keys, so it's important to ensure the quality of the generated entropy.

I guess for now we should simply warn users in the getrandom docs and wait for introduction of the new function.