about SPI slave setup and post operation (IDFGH-8387)

weizzh commented 2 years ago

Answers checklist.

[X] I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
[X] I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
[X] I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

Hi. My SDK is ESP-IDF 4.4 and my hardware is an ESP32-S3.

I use an ESP32-S3 as an SPI slave to receive data from an MSP430 controller. The application is adapted from the spi_slave receiver example code. The MSP430 produces a pulse on the DATAREADY IO (see the picture below) every 3.4 ms. After detecting this signal, the ESP32-S3 pulls the HANDSHAKE line low. The MSP430 then starts an SPI transaction with a length of 196 bytes. The ESP32-S3 puts the data into buffers; when one of them is full, its contents are transmitted over Wi-Fi.
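For reference, the numbers in the description above imply a rough timing budget. The following is a minimal sketch in plain C (host-side, no ESP-IDF needed); the function names are hypothetical, and the 10 ms value is the timeout passed to spi_slave_transmit in the code further down. It ignores CS setup time and inter-byte gaps, so the real requirements are somewhat higher.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the issue text. */
#define FRAME_BYTES      196u    /* bytes per SPI transaction */
#define PERIOD_US        3400u   /* DATAREADY period: 3.4 ms */
#define SLAVE_TIMEOUT_US 10000u  /* 10 ms timeout given to spi_slave_transmit */

/* Payload throughput in bytes per second (rounded down). */
static uint32_t payload_bytes_per_sec(uint32_t frame_bytes, uint32_t period_us)
{
    assert(period_us > 0u);
    return (uint32_t)(((uint64_t)frame_bytes * 1000000u) / period_us);
}

/* Lowest SCLK frequency in Hz at which one frame fits into one period,
 * ignoring any gaps between bytes (rounded up). */
static uint32_t min_sclk_hz(uint32_t frame_bytes, uint32_t period_us)
{
    assert(period_us > 0u);
    uint64_t bits = (uint64_t)frame_bytes * 8u;
    return (uint32_t)((bits * 1000000u + period_us - 1u) / period_us);
}

/* Number of complete DATAREADY periods that fit inside the timeout. */
static uint32_t periods_within(uint32_t timeout_us, uint32_t period_us)
{
    assert(period_us > 0u);
    return timeout_us / period_us;
}
```

With the values above, the payload rate is about 57.6 kB/s, the SPI clock has to be at least roughly 461 kHz, and a single 10 ms timeout spans two full DATAREADY periods. The last point is only an observation from the arithmetic: a call that is still waiting can overlap later data-ready pulses.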

At first I only got timeout results. Then I tried controlling the handshake IO manually (see the code below); this time I got two timeout errors at the beginning, and the following transactions were OK. However, I have now found that there is still a bug: very occasionally (every several seconds or more), the first several bytes of one transaction are "overwritten" by the following transaction. With the help of a logic analyzer, I found some abnormal handshake signals that may have caused this problem.

This is one abnormal handshake signal, which lasts much longer than the normal ones: (image: abnormal_handshake)

Zooming in on these signals, the normal signals: (image: normal_handshake)

The abnormal handshake signals: (image: abnormal_handshake_zoom_in)

After some calculation, I am now quite sure that when an abnormal handshake signal occurs, the bytes transferred while the handshake line is low "overwrite" the beginning bytes of the previous transaction. My code:

//Called after a transaction is queued and ready for pickup by master. We use this to pull the handshake line low (W1TC clears the bit).
IRAM_ATTR void my_post_setup_cb(spi_slave_transaction_t *trans)
{
    WRITE_PERI_REG(GPIO_OUT_W1TC_REG, (1<<GPIO_HANDSHAKE));
}
/**
 * @description: when trans is over, set HANDSHAKE line back to high.
 * @param {spi_slave_transaction_t} *trans
 * @return {*}
 */
IRAM_ATTR void my_post_trans_cb(spi_slave_transaction_t *trans)
{
    WRITE_PERI_REG(GPIO_OUT_W1TS_REG, (1<<GPIO_HANDSHAKE));
}

void IRAM_ATTR gpio_data_ready_isr(void* arg)
{
    WRITE_PERI_REG(GPIO_OUT_W1TS_REG, (1<<GPIO_SCOPE0));

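    // Ignore edges that arrive less than 240000 CPU cycles after the last accepted one
    // (1 ms if the CPU runs at 240 MHz). Unsigned subtraction handles counter wraparound.
    // Note: this early return leaves GPIO_SCOPE0 high.
    static uint32_t lastdatareadytime;
    uint32_t currtime = esp_cpu_get_ccount();
    uint32_t diff = currtime-lastdatareadytime;
    if(diff<240000) return;
    lastdatareadytime = currtime;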
    BaseType_t high_task_awoken=pdFALSE;

    vTaskNotifyGiveFromISR(msp430_task_handle, &high_task_awoken);

    WRITE_PERI_REG(GPIO_OUT_W1TC_REG, BIT(GPIO_SCOPE0));
    // return (high_task_awoken==pdTRUE);    
    if(high_task_awoken)
    {
        portYIELD_FROM_ISR();
    }
}

void msp430_task(void * pvParameters)
{

    esp_err_t ret;
    spi_slave_transaction_t t;
    memset(&t, 0, sizeof(t));
    int n =0;
    t.length = 196*8;
    t.tx_buffer = sendbuf;
    t.rx_buffer = recvbuf;
    uint32_t bytes_read_from_msp430=0; // number of bytes read into buf
    uint32_t msp430_buf_cnt = 0;// msp430 buf index.
    while(1)
    {

        ulTaskNotifyTake(pdTRUE, portMAX_DELAY);
        WRITE_PERI_REG(GPIO_OUT_W1TS_REG, (1<<GPIO_SCOPE0));
        n++; 
        // printf("\n Received %d interrupt.\n", n);
        // memset(recvbuf, 0, sizeof(recvbuf));
        WRITE_PERI_REG(GPIO_OUT_W1TC_REG, (1<<GPIO_HANDSHAKE));
        ret = spi_slave_transmit(RCV_HOST, &t, 10/portTICK_RATE_MS);//timeout for 10ms
        // ret = spi_slave_transmit(RCV_HOST, &t, portMAX_DELAY);
        WRITE_PERI_REG(GPIO_OUT_W1TC_REG, BIT(GPIO_SCOPE0));
        if(ret != ESP_OK)
        {
            time(&now);
            localtime_r(&now, &timeinfo);
            strftime(strftime_buf, sizeof(strftime_buf), "%c", &timeinfo);
            printf("%s :spi slave recv timeout.\n", strftime_buf);
            WRITE_PERI_REG(GPIO_OUT_W1TS_REG, (1<<GPIO_HANDSHAKE));

            continue;
        }
        WRITE_PERI_REG(GPIO_OUT_W1TS_REG, (1<<GPIO_HANDSHAKE));
        /******** put data into the buffer; when the buffer is full, transmit it over Wi-Fi ********/
        xTaskNotify(publish_task_handle, msp430_buf_cnt, eSetValueWithoutOverwrite); // publish_task is responsible for transmitting the data.
    }
}
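The edge filter in gpio_data_ready_isr relies on unsigned 32-bit subtraction of cycle counts. A minimal host-side sketch of that check follows; accept_edge and MIN_GAP_CYCLES are hypothetical names, and the 240 MHz CPU clock is an assumption, since the post does not state it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_GAP_CYCLES 240000u /* 1 ms, assuming a 240 MHz CPU clock */

/* Returns true and records `now` if at least `min_gap` cycles have passed
 * since *last; returns false (leaving *last unchanged) otherwise. */
static bool accept_edge(uint32_t now, uint32_t *last, uint32_t min_gap)
{
    assert(last != NULL);
    /* Unsigned subtraction is modulo 2^32, so the difference stays correct
     * when the cycle counter wraps between the two edges. */
    uint32_t diff = now - *last;
    if (diff < min_gap) {
        return false;
    }
    *last = now;
    return true;
}
```

The check is only unambiguous while fewer than 2^32 cycles pass between accepted edges, which is about 17.9 s at the assumed 240 MHz. With a 3.4 ms data-ready period that limit should not matter.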

Some init config:

    BaseType_t xStatus;
    xStatus = xTaskCreate(msp430_task, "MSP430Task", 4096, NULL, 19, &msp430_task_handle);
    if(xStatus==pdPASS)
    {
        printf("Create MSP430Task OK. \n");
    }else
    {
        printf("Create MSP430Task failed.\n");
    }

    xStatus = xTaskCreate(publish_task, "publishTask", 4096, NULL, 6, &publish_task_handle);
    if(xStatus==pdPASS)
    {
        printf("Create publishTask OK. \n");
    }else
    {
        printf("Create publishTask failed.\n");
    }

    esp_err_t ret;
    spi_bus_config_t buscfg =
    {
        .mosi_io_num = GPIO_MOSI,
        .miso_io_num = GPIO_MISO,
        .sclk_io_num = GPIO_SCLK,
        .quadwp_io_num = -1,
        .quadhd_io_num = -1,
    };
    spi_slave_interface_config_t slvcfg =
    {
        .mode=1,
        .spics_io_num = GPIO_CS,
        .queue_size = 3,
        .flags = 0,
    //    .post_setup_cb = my_post_setup_cb, // NB: the callbacks do not work.
    //    .post_trans_cb = my_post_trans_cb
    };
    //Enable pull-ups on SPI lines so we don't detect rogue pulses when no master is connected.
    gpio_set_pull_mode(GPIO_MOSI, GPIO_PULLUP_ONLY);
    gpio_set_pull_mode(GPIO_SCLK, GPIO_PULLUP_ONLY);
    gpio_set_pull_mode(GPIO_CS, GPIO_PULLUP_ONLY);

    //Initialize SPI slave interface
    // ret=spi_slave_initialize(RCV_HOST, &buscfg, &slvcfg, SPI_DMA_CH_AUTO);
    ret=spi_slave_initialize(RCV_HOST, &buscfg, &slvcfg, SPI_DMA_CH_AUTO);
    if(ret==ESP_OK)
    {
        printf("spi slave initialized ok.\n");
    }else
    {
        ESP_LOGE("ESP32-S3","spi slave initialized failed.");
    }

    gpio_config_t handshake_io_conf = 
    {
        .intr_type = GPIO_INTR_DISABLE,
        .mode = GPIO_MODE_OUTPUT,
        .pin_bit_mask=(1ULL<<GPIO_HANDSHAKE)|(1ULL<<GPIO_SCOPE0), // pin_bit_mask is 64 bits wide
        .pull_up_en=1
    };
    gpio_config(&handshake_io_conf);
    //config DATA_READY interrupt io
    gpio_config_t data_ready_io_conf = 
    {
        .intr_type=GPIO_INTR_NEGEDGE,
        .mode=GPIO_MODE_INPUT,
        .pull_up_en=1,
        .pin_bit_mask=(1ULL<<GPIO_DATA_READY) // pin_bit_mask is 64 bits wide
    };
    gpio_config(&data_ready_io_conf);
    gpio_install_isr_service(ESP_INTR_FLAG_IRAM);
    gpio_set_intr_type(GPIO_DATA_READY, GPIO_INTR_NEGEDGE);
    gpio_isr_handler_add(GPIO_DATA_READY, gpio_data_ready_isr, NULL);
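One detail worth checking in the code above is the use of (1<<GPIO_x). The pin numbers are not shown in the post, so this may not apply, but on the ESP32-S3 the registers GPIO_OUT_W1TS_REG and GPIO_OUT_W1TC_REG cover only GPIO0 to GPIO31; higher-numbered pins are set and cleared through GPIO_OUT1_W1TS_REG and GPIO_OUT1_W1TC_REG, and pin_bit_mask in gpio_config_t is a 64-bit field. A minimal host-side sketch of the bit arithmetic, with hypothetical helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Which output register bank a pin lives in, and its bit within that bank. */
typedef struct {
    bool     high_bank; /* false: GPIO_OUT_* (pins 0..31), true: GPIO_OUT1_* (pins 32 and up) */
    uint32_t bit;       /* single-bit mask to write to the W1TS/W1TC register */
} gpio_out_sel_t;

/* 64-bit mask suitable for gpio_config_t.pin_bit_mask. */
static uint64_t pin_mask64(unsigned pin)
{
    assert(pin < 64u);
    return 1ULL << pin;
}

/* Bank and bit for direct W1TS/W1TC register writes. */
static gpio_out_sel_t gpio_out_select(unsigned pin)
{
    assert(pin < 64u);
    gpio_out_sel_t sel;
    sel.high_bank = (pin >= 32u);
    sel.bit = UINT32_C(1) << (pin % 32u);
    return sel;
}
```

If all pins involved are below 32, the original register writes already target the right bank, and only the 64-bit mask form matters for gpio_config.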

My questions:
1: Can you give some hints as to why the setup/post transaction callback functions fail?
2: Does spi_slave_transmit block or not? The handshake line is pulled high again after merely several µs, so it seems the function does not block?
3: Why do some handshake signals last much longer? My guess is that the task (I have set its priority to 19) is simply pre-empted by something between these two lines:

    WRITE_PERI_REG(GPIO_OUT_W1TC_REG, (1<<GPIO_HANDSHAKE));
    ret = spi_slave_transmit(RCV_HOST, &t, 10/portTICK_RATE_MS);//timeout for 10ms

4: Is this bug related to the SPI queues? How can I fix it if the setup/post callback functions still do not work?

wanckl commented 2 years ago

Hello, sorry, I can't reproduce your abnormal HANDSHAKE signals with your code, but I found a few questionable things:

For your question 1: the setup/post transaction callback functions are called before/after an actual transaction started by the master; at the very least this needs a falling and a rising edge on the CS line. So if the transmit simply times out on the slave, the post-transaction callback is not called, the HANDSHAKE line stays high, and there is no positive edge to trigger the MSP430 to start a transaction. If you control the handshake IO manually, that code still runs after a timeout.
Besides: if you initialize the data-ready IO on the MSP430 at a low level, it may give the slave a false ready signal when the MSP430 boots, although the MSP430 is not yet ready to start a transaction. In that case, a ready signal with no transaction following it drives the slave into a timeout (I reproduced your first issue this way).
Your question 2: spi_slave_transmit does block, but in your code it times out the first time because you produce the positive edge (which triggers the MSP430 to start the transfer) after spi_slave_transmit. From then on, each call to spi_slave_transmit actually returns the result of the previous transaction, so the function returns immediately.
Your question 3: you use only one buffer for both the SPI and the Wi-Fi transfers, so this is likely a concurrency issue in which the data is modified by SPI during the Wi-Fi transmission, especially when the CPU is under heavy load. You can use a ring buffer instead.

Good luck.
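The ring buffer suggestion can be sketched as follows. This is a minimal host-side illustration in plain C, not the code used in the project: frame_ring_t, ring_init, ring_push, ring_pop and the depth of 4 slots are all hypothetical, and the sketch is not thread-safe as written. On the ESP32-S3 the push and pop would need protection (for example a critical section), or ESP-IDF's own ring buffer from freertos/ringbuf.h could be used instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_BYTES 196u /* one SPI transaction, as in the issue */
#define RING_SLOTS  4u   /* hypothetical depth */

typedef struct {
    uint8_t  data[RING_SLOTS][FRAME_BYTES];
    unsigned head;  /* next slot to write (SPI side) */
    unsigned tail;  /* next slot to read (Wi-Fi side) */
    unsigned count; /* number of frames currently stored */
} frame_ring_t;

static void ring_init(frame_ring_t *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
}

/* Copies one received frame into the ring. Returns false when the ring is
 * full, so a slow consumer causes a dropped frame rather than an overwrite. */
static bool ring_push(frame_ring_t *r, const uint8_t *frame)
{
    assert(r != NULL && frame != NULL);
    if (r->count == RING_SLOTS) {
        return false;
    }
    memcpy(r->data[r->head], frame, FRAME_BYTES);
    r->head = (r->head + 1u) % RING_SLOTS;
    r->count++;
    return true;
}

/* Copies the oldest frame out of the ring. Returns false when it is empty. */
static bool ring_pop(frame_ring_t *r, uint8_t *out)
{
    assert(r != NULL && out != NULL);
    if (r->count == 0u) {
        return false;
    }
    memcpy(out, r->data[r->tail], FRAME_BYTES);
    r->tail = (r->tail + 1u) % RING_SLOTS;
    r->count--;
    return true;
}
```

The point of the design is that the SPI side only ever writes a slot the Wi-Fi side is not reading, so a frame that is being transmitted cannot be modified by the next transaction.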

weizzh commented 2 years ago

Thanks. While awaiting your response, I found a workaround that resolves these data errors: I put the AD-related IO ISR and tasks on core 1. I don't know why, but it worked. Now I'm facing another problem: this AD seems too "sensitive". When I added some print() calls in the related task, I observed lots of jitter; when other tasks go wrong, for example an SD card write error or Wi-Fi instability, huge jitter appears.

spi_slave_transmit does block, but in your code it times out the first time because you produce the positive edge (which triggers the MSP430 to start the transfer) after spi_slave_transmit. From then on, each call to spi_slave_transmit actually returns the result of the previous transaction, so the function returns immediately.

That is to say, although the data looks normal, the SPI transactions actually run in the wrong timing sequence. Could this make the SPI transactions easily affected by other things?

espressif / esp-idf

about SPI slave setup and post operation (IDFGH-8387) #9856
