SPI master reads overrun non 32bit aligned buffers (IDFGH-6714)

bill1davis commented 2 years ago

Development Kit: N/A custom PCB
Kit version (for WroverKit/PicoKit/DevKitC): [v1|v2|v3|v4] : N/A
Module or chip used: [ESP32-WROVER]
IDF version: v4.3.1
Build System: [idf.py]
Compiler version : xtensa-esp32-elf-gcc (crosstool-NG esp-2021r1) 8.4.0
Operating System: [UBUNTU 20.04LTS]
Using an IDE?: [No]
Power Supply: [external 3.3V w/Battery]

Problem Description

SPI reads were causing heap poisoning even when reading to buffers allocated using HEAP_CAPS_DMA. Instrumenting the problem I found zeroes written past the end of buffers up to the next 32 bit boundary. In other words a 9 byte transaction with a 9 byte tx and rx length would copy 12 bytes into the destination buffer.

Test Code

This is an excerpt from our usb chip (max3421) driver. It also happens with our external flash part, so I don't think the type of device you read from matters.

` // Here is the interface config HSPI, 22Mhz, hw chip select

spi_device_interface_config_t sp_cfg={
    .clock_speed_hz=22*1000*1000,        // Max3421 supports up to 26MHz
    .mode=0,                             //SPI mode 0
    .spics_io_num= MAX3421_SSN,          //CS pin
    .queue_size=1,                       //We want to be able to queue 1 transactions at a time
};

// This is our read function. int32_t max3421_rd_bytes_alt(uint8_t reg, uint8_t bytes, uint8_t cnt, uint8_t status) { spi_transaction_t t;

// first byte of transaction is register
cnt++;

// need buffer to be word aligned longer than read, or heap will be poisoned uint8_t cmd = (uint8_t)heap_caps_malloc(cnt + 4, MALLOC_CAP_DMA); if ( !cmd ) { return ESP_ERR_NO_MEM; } uint8_t rsp = (uint8_t)heap_caps_malloc(cnt + 4, MALLOC_CAP_DMA); if ( !rsp ) { free(cmd); return ESP_ERR_NO_MEM; }

// address with direction bit cleared (0 = read)
memset(cmd, 0, cnt);
cmd[0] = reg & ~0x02;

memset(&t, 0, sizeof(t));

t.length=cnt*8;
t.rxlength = t.length;
t.tx_buffer=cmd;
t.rx_buffer=rsp;
memset(rsp, 0xA5, cnt+4);

int32_t rval = spi_device_transmit(max3421.spi, t);

if (rsp[cnt] != 0xA5) {
    ESP_LOGE(TAG, "Rx Buffer Overrun cnt=%d, rsp[cnt] = 0x%02X %02X %02X %02X", cnt, rsp[cnt], rsp[cnt+1], rsp[cnt+2], rsp[cnt+3]);
}

// copy bytes except first, first byte is status,
if ( status ) {
    *status = rsp[0];
}
memcpy(bytes, rsp+1, cnt-1);
free(cmd);
free(rsp);

return rval;

} `

Expected Result

SPI driver reads cnt bytes into buffer or prints error that rx buffer needs to be word aligned, or heap_caps_malloc of DMA memory returns automatically word aligned buffers

Actual Result

For reads of size 9 bytes, the following prints out of our test code. E (24087) max3421: Rx Buffer Overrun cnt=9, rsp[cnt] = 0x00 00 00 A5

Steps to reproduce

Read a non word aligned number of bytes from a spi slave device using a deliberately longer buffer filled with a known pattern. Check the bytes immediately past the read length for the pattern.

ginkgm commented 2 years ago

Hi @bill1davis ,

This is expected behavior, see: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/peripherals/spi_master.html#write-and-read-phases

But maybe we should add checking or warning before starting transaction?...

bill1davis commented 2 years ago

Hi @ginkgm,

Thanks for your reply.

Don't you think the following text implies that if you don't use a 32 bit aligned buffer, it will still work, but go slower.

"If these requirements are not satisfied, the transaction efficiency will be affected due to the allocation and copying of temporary buffers.". That was my take anyway. In any case, warnings in the doc of the transactions themselves would be helpful.

ginkgm commented 2 years ago

em... it seems that the code behavior is slightly different from the documentation. The util for unaligned buffer only helps to allocate temp buffer when buffer address is not aligned, but does not take length into consideration. See code here: https://github.com/espressif/esp-idf/blob/master/components/driver/spi_master.c#L747

The DMA always write in words, so I didn't expect you will still need the <3 bytes after your receive data. As the rxlength sometimes means the transaction length, I think it should be allowed to be non-32-bit aligned. It means that, under that case, specifying rxlength %4 != 0 imples the remaining <3 bytes will be overwritten.

We will update the documentation about this. But same reason as the util above, this check may happen frequently, so there will be no runtime warning about this.

bill1davis commented 2 years ago

@ginkgm Sounds good. I agree that a runtime check and warning is a lot of overhead, although I think having malloc_cap_dma automatically allocate 32bit aligned buffers might be a good thing. I wonder if there are other places in the system where the dma will walk off of a unaligned buffer.

espressif / esp-idf