espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.38k stars 7.22k forks source link

xtensa-esp32-elf-gcc gives bad code despite -mfix-esp32-psram-cache-issue (IDFGH-9766) #11100

Open aksdfauytv opened 1 year ago

aksdfauytv commented 1 year ago

gcc version: (crosstool-NG esp-2021r2) 8.4.0 IDF: v4.3.2 build with heap poisoning enabled. compiling with flags: -mlongcalls -mfix-esp32-psram-cache-issue -mfix-esp32-psram-cache-strategy=memw -ggdb -O2 -std=gnu99 -ffunction-sections -fdata-sections -fstrict-volatile-bitfields -fstack-protector chip: ESP32-PICO-D4 (revision 1)

PlatformAllocateChunk is a wrapper for heap_caps_malloc with MALLOC_CAP_SPIRAM argument.

This source:

        self->currentPackage = (LogWriterPackage*) PlatformAllocateChunk(sizeof(LogWriterPackage));
        if(self->currentPackage == NULL) {
            PlatformLogError(LOGTAG, "Allocating package failed");
            return;
        }

        self->currentPackage->event.json = PlatformAllocateChunk(MAX(PACKAGE_MAX_SIZE, (input->length + 2)));
        char *tmp = self->currentPackage->event.json;
        __asm volatile("" ::: "a5"); // prevent tmp from being placed in a5, so we can read the value of a5 and make sure it was not changed by some other means. If you remove this line, the bug still happens, you just can't see where a10 was written to (see disassembly).
        if((int)tmp < 0) // check if tmp is 0xcececece
            __builtin_trap();

produces this code:

// a2 contains 'self'
// 'currentPackage' is at offset 92 in 'self'
// 'json' is at offset 4 in 'LogWriterPackage'
4012bef2:       0d1465          call8   40139038 <PlatformAllocateChunk>
4012bef5:       1762a2          s32i    a10, a2, 92
4012bef8:       0a5d            mov.n   a5, a10 // keep 'self->currentPackage' for later in a5
4012befa:       01fa56          bnez    a10, 4012bf1d <LogWriterAddEvent$isra$0+0x7d>

[...] // not interesting

4012bf1d:

[...] // not interesting, calculate the size argument for PlatformAllocateChunk

4012bf29:       0d10e5          call8   40139038 <PlatformAllocateChunk>
4012bf2c:       172262          l32i    a6, a2, 92 // get 'self->currentPackage', even though we already have it cached in a5
4012bf2f:       0165a2          s32i    a10, a5, 4 // set 'event.json' (using a5)
4012bf32:       1678            l32i.n  a7, a6, 4 // read 'event.json' (using a6)
4012bf34:       0027d6          bgez    a7, 4012bf3a <LogWriterAddEvent$isra$0+0x9a> // check, if the interrupt screwed up bus access
4012bf37:       0041f0          break   1, 15 // examine the regsiters
4012bf3a:

Notice that a5 and a6 both point to the same address, so s32i a10, a5, 4 and l32i.n a7, a6, 4 are violating errata issue 3.9. The result is that a10 is not written into memory. a7 contains the original value of memory, which was poisoned to 0xcececece by malloc.

A5      : 0x3f909a20
A6      : 0x3f909a20
A7      : 0xcececece
A10     : 0x3f90d604

So, my guess would be that the bug is probably cased by fix-esp32-psram-cache-issue workaround not noticing gcc used two registers to access the same address.

zinke-ct-video commented 1 year ago

I am using ESP32 Rev. 1 and I have now found a piece of code that can reliably reproduce an inexplainable wrong byte in external SPIRAM. To confirm my assumption I executed the exact same bytecode on an ESP32 Rev. 3.0 and the error disappeared.

So I can confirm that ESP-IDF commit dc016f59877d13e6e7d4fc193aa5aa764547f16d definitely contains a severe bug regarding the workarounds for errata 3.9 and / or 3.10.

My compiler version is: xtensa-esp32-elf-gcc-12.2.0.exe (crosstool-NG esp-12.2.0_20230208) 12.2.0

@igrr Could you tell us please: Does espressif intend to fix this bug or does espressif no longer support the ESP32 Rev. 1 ?

igrr commented 1 year ago

@zinke-ct-video if you have a reproducer code you could share, we will investigate the issue and try to find what causes the incorrect code to be generated.

(The code provided by the original issue reporter didn't help much, unfortunately, because it can't be compiled without the rest of the application.)

aksdfauytv commented 1 year ago

Hi @igrr, you should have said you need a better example. Here is a full code to reproduce the issue:

build with xtensa-esp32-elf-gcc test.c -c -mlongcalls -mfix-esp32-psram-cache-issue -mfix-esp32-psram-cache-strategy=memw -ggdb -O2 -std=gnu99 -ffunction-sections -fdata-sections -fstrict-volatile-bitfields -fstack-protector look for break 1, 15 in the disassembly, the faulty code is just before it

typedef struct
{
    int length;
    char *json;
} JSONEvent;

typedef struct
{
    JSONEvent event;
    unsigned short eventsCount;
    unsigned long long packageStartTime;
} LogWriterPackage;
typedef struct
{
    char pad[92];
    LogWriterPackage *currentPackage;
} self_t;
typedef struct
{
    int length;
} input_t;

void PlatformLogError(const char *fmt, ...);
void *PlatformAllocateChunk(int size);
int LogWriterAddPackage(self_t* self, LogWriterPackage* package);
int esp_get_free_heap_size();
unsigned long long PlatformGetRelativeTimeMS();
void free(void *p);
#define LOGTAG "bla"
#define MIN_FREE_HEAP 1048576
#define LOGWRITER_MAX_ELEMENTS 100
#define PACKAGE_MAX_SIZE 10000
#define PACKAGE_MAX_AGE 1000
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define NULL 0

void bla(self_t *self, input_t *input, const JSONEvent* event)
{
    if(self->currentPackage == NULL || (self->currentPackage->event.length + (event->length + 2)) >= PACKAGE_MAX_SIZE)
    {
        if(self->currentPackage != NULL) {
            LogWriterAddPackage(self, self->currentPackage);
            self->currentPackage = NULL;
        }

        if(esp_get_free_heap_size() < MIN_FREE_HEAP) {
                PlatformLogError(LOGTAG, "Not enough memory to allocate new package");
                return;
        }

        //@TODO: Check free heap space and break loop
        self->currentPackage = (LogWriterPackage*) PlatformAllocateChunk(sizeof(LogWriterPackage));
        if(self->currentPackage == NULL) {
            PlatformLogError(LOGTAG, "Allocating package failed");
            return;
        }

        self->currentPackage->event.json = PlatformAllocateChunk(MAX(PACKAGE_MAX_SIZE, (event->length + 2)));
        char *tmp = self->currentPackage->event.json;
        //__asm volatile("" ::: "a5"); // prevent tmp from being placed in a5, so we can read the value of a5 and make sure it was not changed by some other means. If you remove this line, the bug still happens, you just can't see where a10 was written to (see disassembly).
        if((int)tmp < 0) // check if tmp is 0xcececece
            __builtin_trap();
        if(tmp == NULL) {
            PlatformLogError(LOGTAG, "Allocating package data failed");
            free(self->currentPackage);
            self->currentPackage = NULL;
            return;
        }

        self->currentPackage->eventsCount = 0;
        self->currentPackage->event.length = 0;
        self->currentPackage->packageStartTime = PlatformGetRelativeTimeMS();
    }
}
zinke-ct-video commented 1 year ago

I tried to put my error reproducing code into a separate folder to share it with you. Unfortunately it is enough to change the paths to make the error disappear. Comparing the compiled binary code I can see that the folder paths contained in the binary have changed and I assume that this change causes the error not to appear.

@igrr Do you have any idea how I could make my project "portable" so I can share it with you for investigation?

igrr commented 1 year ago

You can try to enable the reproducible builds option in menuconfig (see https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/reproducible-builds.html#enabling-reproducible-builds-in-esp-idf), this should exclude all paths which may be added by IDF code. However if you use __FILE__ directly in your code, it will still introduce some dependency on the binary.

zinke-ct-video commented 1 year ago

Very good hint. This has helped. So here is my example project. This is the minimum code for which I could make sure that the error appears: example.zip

For faster orientation in my code:

Additional hints:

Log Output:

I (30) boot: ESP-IDF HEAD-HASH-NOTFOUND 2nd stage bootloader
I (30) boot: Multicore bootloader
I (31) boot: chip revision: v1.0
I (31) qio_mode: Enabling default flash chip QIO
I (31) boot.esp32: SPI Speed      : 80MHz
I (31) boot.esp32: SPI Mode       : QIO
I (31) boot.esp32: SPI Flash Size : 16MB
I (32) boot: Enabling RNG early entropy source...
I (32) boot: Partition Table:
I (32) boot: ## Label            Usage          Type ST Offset   Length
I (33) boot:  0 nvs              WiFi data        01 02 00009000 00004000
I (33) boot:  1 otadata          OTA data         01 00 0000d000 00002000
I (33) boot:  2 phy_init         RF data          01 01 0000f000 00001000
I (34) boot:  3 ota_0            OTA app          00 10 00010000 00400000
I (34) boot:  4 ota_1            OTA app          00 11 00410000 00400000
I (35) boot: End of partition table
I (35) boot: No factory image, trying OTA 0
I (35) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=184724h (1591076) map
I (473) esp_image: segment 1: paddr=0019474c vaddr=3ffb0000 size=03bd8h ( 15320) load
I (478) esp_image: segment 2: paddr=0019832c vaddr=40080000 size=07cech ( 31980) load
I (489) esp_image: segment 3: paddr=001a0020 vaddr=400d0020 size=1c804h (116740) map
I (521) esp_image: segment 4: paddr=001bc82c vaddr=40087cec size=08a48h ( 35400) load
I (541) boot: Loaded app from partition at offset 0x10000
I (560) boot: Set actual ota_seq=1 in otadata[0]
I (560) boot: Disabling RNG early entropy source...
I (560) cpu_start: Multicore app
I (561) quad_psram: This chip is ESP32-D0WD
I (561) esp_psram: Found 8MB PSRAM device
I (562) esp_psram: Speed: 80MHz
I (562) esp_psram: PSRAM initialized, cache is in low/high (2-core) mode.
W (562) esp_psram: Virtual address not enough for PSRAM, map as much as we can. 4MB is mapped
I (562) cpu_start: Pro cpu up.
I (562) cpu_start: Starting app cpu, entry point is 0x40081394
0x40081394: call_start_cpu1 at /COMPONENT_ESP_SYSTEM_DIR/port/cpu_start.c:154

I (0) cpu_start: App cpu up.
I (1074) esp_psram: SPI SRAM memory test OK
I (1082) cpu_start: Pro cpu start user code
I (1082) cpu_start: cpu freq: 240000000 Hz
I (1082) cpu_start: Application information:
I (1082) cpu_start: Project name:     CT-Webinterface
I (1082) cpu_start: App version:      3.4.2
I (1083) cpu_start: ELF file SHA256:  79a82c3216d78d95...
I (1083) cpu_start: ESP-IDF:          HEAD-HASH-NOTFOUND
I (1083) cpu_start: Min chip rev:     v1.0
I (1084) cpu_start: Max chip rev:     v3.99
I (1084) cpu_start: Chip rev:         v1.0
I (1084) heap_init: Initializing. RAM available for dynamic allocation:
I (1085) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (1085) heap_init: At 3FFB44D0 len 0002BB30 (174 KiB): DRAM
I (1085) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (1086) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (1087) heap_init: At 40090734 len 0000F8CC (62 KiB): IRAM
I (1370) esp_psram: Adding pool of 4096K of PSRAM memory to heap allocator
I (1371) spi_flash: detected chip: gd
I (1371) spi_flash: flash io: qio
I (1372) app_start: Starting scheduler on CPU0
I (1373) app_start: Starting scheduler on CPU1
I (1373) main_task: Started on CPU0
I (1383) esp_psram: Reserving pool of 24K of internal memory for DMA/internal allocations
I (1383) main_task: Calling app_main()
I (1383) APP: Reset reason: 1
E (1383) Filesystem / Factory Defaults: Restoring factory default content on SD card
I (1383) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f800888 (size 32768) in archive unpacker task:
I (1413) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f80889c (size 32768) in archive unpacker task:
I (1443) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f8108b0 (size 32768) in archive unpacker task:
I (1463) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f8188c4 (size 32768) in archive unpacker task:
I (1493) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f8208d8 (size 32768) in archive unpacker task:
I (1503) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f8288ec (size 32768) in archive unpacker task:
I (1513) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f830900 (size 32768) in archive unpacker task:
I (1533) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f838914 (size 32768) in archive unpacker task:
I (1543) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f80165c (size 32768) in archive unpacker task:
I (1553) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f840928 (size 32768) in archive unpacker task:
I (1563) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f809670 (size 32768) in archive unpacker task:
I (1573) Filesystem / Factory Defaults: Processing decompressed buffer 0x3f84893c (size 32768) in archive unpacker task:
E (1583) Filesystem / Factory Defaults: Canary bytes of buffer 0x3f822ad4 (size 3535) defective:
E (1583) Filesystem / Factory Defaults: 0x3ffc2418   00 56 ad ba                                       |.V..|

assert failed: archive_entry_unpacking_content deploy_factory_default_image.c:229 (check_canary_bytes_of_buffer(context->js_file_buffer, context->js_file_buffer_size))

Backtrace: 0x40081bc1:0x3ffc22d0 0x40089669:0x3ffc22f0 0x4008f0fe:0x3ffc2310 0x400d9731:0x3ffc2440 0x400da228:0x3ffc2460 0x400da5bb:0x3ffc2490 0x400d8ff9:0x3ffc24c0 0x4008c499:0x3ffc2740
0x40081bc1: panic_abort at /COMPONENT_ESP_SYSTEM_DIR/panic.c:452

0x40089669: esp_system_abort at /COMPONENT_ESP_SYSTEM_DIR/port/esp_system_chip.c:90

0x4008f0fe: __assert_func at /COMPONENT_NEWLIB_DIR/assert.c:81

0x400d9731: archive_entry_unpacking_content at /COMPONENT_MAIN_DIR/filesystem/deploy_factory_default_image.c:229 (discriminator 1)

0x400da228: archive_unpacker_step_fsm at /COMPONENT_MAIN_DIR/filesystem/archive.c:172

0x400da5bb: archive_unpacker_feed at /COMPONENT_MAIN_DIR/filesystem/archive.c:41 (discriminator 2)

0x400d8ff9: archive_unpacker_task_code at /COMPONENT_MAIN_DIR/filesystem/deploy_factory_default_image.c:141 (discriminator 13)

0x4008c499: vPortTaskWrapper at /COMPONENT_FREERTOS_DIR/FreeRTOS-Kernel/portable/xtensa/port.c:162

If you need any additional info, please contact me, I will do my best to help finding this bug.