espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.49k stars 7.26k forks source link

SPI Master misses data on ESP32-S3 when max xfer size is more than 32KB (IDFGH-6125) #7804

Closed huming2207 closed 2 years ago

huming2207 commented 2 years ago

Environment

Problem Description

The SPI Master driver generates rubbish data on ESP32-S3 when the max transfer size is more than 32768 bytes.

We have a project that driving some display panels. The SPI Master's max_transfer_sz was set to 100KB and it was working fine on the classic ESP32. However it doesn't work when we move to ESP32-S3, even if CONFIG_SPIRAM_MALLOC_RESERVE_INTERNAL is set to 100KB (in case it causes some unexpected behaviour from the PSRAM).

Expected Behavior

Should just work as-is.

Actual Behavior

On every SPI transaction that SPI master driver has been set to max_transfer_sz > 32768 and transaction data length is > 32768, the driver seems only transfer the first 32KB data, and ignore the rest of them.

Steps to reproduce

  1. Set the max_transfer_sz at spi_bus_config_t in the SPI driver initialisation function to something bigger than 32768
  2. Transfer a big chunk of data
  3. Take a look for the output with a logic analyser.

Code to reproduce this issue

Code is not allowed to be disclosed.

Debug Logs

N/A

Other items if possible

N/A

huming2207 commented 2 years ago

Here are more info:

huming2207 commented 2 years ago

@Alvin1Zhang any updates? Do I need to provide more info?

Alvin1Zhang commented 2 years ago

@huming2207 Thanks for reporting, and sorry for slow turnaround, we will look into and provide feedback, thanks.

Icarus113 commented 2 years ago

Hi @huming2207 , your conservation is correct. Due to registers bit width limitation, ESP32S3 SPI Master is able to transfer 32KB per transaction.

Besides, you may use keep_cs_active feature. See this. In this way, cs is kept active between 2 transactions, although it still costs some time on preparing the hardware (you should see clock signal keeping idle for some time between 2 transactions)

huming2207 commented 2 years ago

@Icarus113 Thanks. In our case, we are controlling CS pins by ourselves anyway. I guess I should queue a few transfers and wait for complete in the end instead.

By the way, may I ask which register bit width has that limitation? Is there a way to mitigate it by implementing a workaround in the SPI driver?

Icarus113 commented 2 years ago

@huming2207 Hi, see this

We do have a hardware feature which is called Configure-Transfer(only under DMA mode), during configure, software can decide the length of the transaction, and need to prepare buffer in this length. Let's say SPI is configured to send 64KB, software should prepare this buffer, and use 2 DMA descriptors. Software only needs to trigger SPI once, and 3 transactions (1 configure transaction + 2 general transactions) will be started. During the configure transaction, you will only see clock signal keeping active level, and cs signal keeping inactive. Software doesn't support this feature yet.

huming2207 commented 2 years ago

@Icarus113 thanks. Then I will probably queue the SPI transfer and then wait for the result at the last one for now until Espressif implements that. Or maybe I will see if I can do it later on if I have time to do so, but it's not the highest priority for us for now.

Alvin1Zhang commented 2 years ago

Thanks for reporting, feel free to reopen.

huming2207 commented 2 years ago

@Alvin1Zhang I think this issue is not resolved yet. Please reopen it.

And indeed it is a bug, as the SPI master driver has different behaviour on transfer length between ESP32-S3 and the classic ESP32.

Icarus113 commented 2 years ago

Hi @huming2207 , ESP32 also has this limitation, although the length is higher than it is on ESP32S3. This is the hardware design instead of a driver bug. Indeed we can have a wrapper to combine few transactions, but you'll still see the CS line changing between active and inactive between each transaction.

By using keep_cs_active feature, it's true the CS will be kept active, but hardware still needs time to prepare its state. Besides, if the configure-transfer feature is needed, you may open another issue for a dedicated feature request (about the configure-transfer feature)

huming2207 commented 2 years ago

@Icarus113 Yep okay, I will open another feature request for this configure-transfer mode. But nevertheless, I believe this indeed IS also a bug as the same API behaves differently on ESP32 and ESP32-S3 without any warning or panic.

huming2207 commented 2 years ago

I woke up early today and I started revising this issue, as it is indeed necessary for improving the performance for driving some displays.

So far I couldn't find any info of config-transfer mode like you said on ESP32-S3's TRM, because so far (as per 3 Dec 2021 7:00am AEDT) in the manual, SPI is not yet being documented. But I found something similar on ESP32-S2's TRM at Section 24.4.7 "DMA Controlled Segmented-Configure-Transfer". Then I referred to the header at components/soc/esp32s3/include/soc/spi_struct.h and there's a dma_seg_trans_done in dma_int_ena, and some other control registers there as well.

Apparently, ESP32-S2's SPI peripheral is quite different from ESP32-S3. But before Espressif releases their documentation for ESP32-S3, I guess I can probably have a try on ESP32-S3 later following the ideas from ESP32-S2's TRM.

@Icarus113 Do you think I'm on the right track? Any further suggestions? Also, @Alvin1Zhang if there's no one currently working on this issue on your side, then I will consider starting my experiment this month and submit a PR if I manage to get it working.

huming2207 commented 2 years ago

For this table in ESP32-S2 TRM:

image

I guess the sequence for ESP32-S3 is different, so the 7th bit (offset 0x1c of SPI peripheral registers) is SPI_MS_DLEN_REG, which I need to set the ms_data_bitlen in the configuration buffer in the DMA linked list. Am I correct?

maplerian commented 2 years ago

I also encounter the same problem when using SPI LCD. My solution is to split a frame of data and queue it, which can solve the problem that a frame is truncated. Then I found the drawback of this approach (the screen was split). For this, I used TE to synchronize every frame, which solved the problem and made the screen look like there was no problem. Of course, there is a premise that the frequency of SPI needs to be at least 60MHZ (the screen is only tested at 240: 240). ` static esp_err_t panel_io_spi_tx_color(esp_lcd_panel_io_t io, int lcd_cmd, const void color, size_t color_size) { esp_err_t ret = ESP_OK; spi_transaction_t spi_trans = NULL; lcd_spi_trans_descriptor_t lcd_trans = NULL; esp_lcd_panel_io_spi_t *spi_panel_io = __containerof(io, esp_lcd_panel_io_spi_t, base);

// before issue a polling transaction, need to wait queued transactions finished
for (size_t i = 0; i < spi_panel_io->num_trans_inflight; i++) {
    ret = spi_device_get_trans_result(spi_panel_io->spi_dev, &spi_trans, portMAX_DELAY);
    ESP_GOTO_ON_ERROR(ret, err, TAG, "recycle spi transactions failed");
}
spi_panel_io->num_trans_inflight = 0;
lcd_trans = &spi_panel_io->trans_pool[0];
memset(lcd_trans, 0, sizeof(lcd_spi_trans_descriptor_t));
spi_lcd_prepare_cmd_buffer(spi_panel_io, &lcd_cmd);
lcd_trans->base.user = spi_panel_io;
lcd_trans->flags.dc_gpio_level = !spi_panel_io->flags.dc_data_level; // set D/C line to command mode
lcd_trans->base.length = spi_panel_io->lcd_cmd_bits;
lcd_trans->base.tx_buffer = &lcd_cmd;
if (spi_panel_io->flags.dc_as_cmd_phase) { // encoding DC value to SPI command phase when necessary
    lcd_trans->base.cmd = !spi_panel_io->flags.dc_data_level;
}
if (spi_panel_io->flags.octal_mode) {
    // use 8 lines for transmitting command, address and data
    lcd_trans->base.flags |= (SPI_TRANS_MULTILINE_CMD | SPI_TRANS_MULTILINE_ADDR | SPI_TRANS_MODE_OCT);
}
// command is short, using polling mode
ret = spi_device_polling_transmit(spi_panel_io->spi_dev, &lcd_trans->base);
ESP_GOTO_ON_ERROR(ret, err, TAG, "spi transmit (polling) command failed");

// sending LCD color data

ifdef CONFIG_IDF_TARGET_ESP32S3

uint32_t size = 0;
uint16_t index = 0;
do
{
    lcd_trans = &spi_panel_io->trans_pool[index];
    memset(lcd_trans, 0, sizeof(lcd_spi_trans_descriptor_t));
    lcd_trans->base.user = spi_panel_io;
    lcd_trans->flags.dc_gpio_level = spi_panel_io->flags.dc_data_level; // set D/C line to data mode
    if (spi_panel_io->flags.dc_as_cmd_phase) { // encoding DC value to SPI command phase when necessary
        lcd_trans->base.cmd = spi_panel_io->flags.dc_data_level;
    }
    lcd_trans->base.tx_buffer = &color[size];
    if (size + 32768 < color_size)
    {
        size += 32768;
        lcd_trans->base.length = 32768 * 8; // transaction length is in bits
    }
    else
    {
        lcd_trans->flags.trans_is_color = 1;
        lcd_trans->base.length = (color_size - size) * 8; // transaction length is in bits
        size = color_size;
    }
    // color data is usually large, using queue+blocking mode
    ret = spi_device_queue_trans(spi_panel_io->spi_dev, &lcd_trans->base, portMAX_DELAY);
    ESP_GOTO_ON_ERROR(ret, err, TAG, "spi transmit (queue) color failed");
    spi_panel_io->num_trans_inflight++;
    index++;
} while (size < color_size);

else

lcd_trans->flags.trans_is_color = 1;
lcd_trans->flags.dc_gpio_level = spi_panel_io->flags.dc_data_level; // set D/C line to data mode
lcd_trans->base.length = color_size * 8; // transaction length is in bits
lcd_trans->base.tx_buffer = color;
if (spi_panel_io->flags.dc_as_cmd_phase) { // encoding DC value to SPI command phase when necessary
    lcd_trans->base.cmd = spi_panel_io->flags.dc_data_level;
}
// color data is usually large, using queue+blocking mode
// ret = spi_device_polling_transmit(spi_panel_io->spi_dev, &lcd_trans->base);
ret = spi_device_queue_trans(spi_panel_io->spi_dev, &lcd_trans->base, portMAX_DELAY);
ESP_GOTO_ON_ERROR(ret, err, TAG, "spi transmit (queue) color failed");
spi_panel_io->num_trans_inflight++;

endif

err: return ret; } `

fschuetz commented 2 years ago

I encountered the same problem. I was trying to transfer 40960 bytes. spi_device_polling_transmit() returns ESP_OK even though it does not transmit all data as described above. i would expect an error code. Can you explain the behaviour?

On another note: since this issue exists a while: is there a workaround on driver level to transfer bigger loads or do we still need to segment at user level?