espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.45k stars 7.25k forks source link

I2C master semaphore not released (IDFGH-13014) #13962

Closed combalafra01 closed 1 month ago

combalafra01 commented 3 months ago

Answers checklist.

IDF version.

5.2.2

Espressif SoC revision.

ESP32-S3

Operating System used.

Windows

How did you build your project?

Command line with Make

If you are using Windows, please specify command line type.

None

Development Kit.

ESP32-S3-DevKitC-1

Power Supply used.

USB

What is the expected behavior?

in s_i2c_send_commands release semaphore i2c_master->cmd_semphr when status is I2C_STATUS_ACK_ERROR

What is the actual behavior?

semaphore not released, looping error until reboot

Steps to reproduce.

look at the code, no need to reproduce

Debug Logs.

No response

More Information.

No response

combalafra01 commented 3 months ago

xSemaphoreGive should be called before return.

    if (i2c_master->status == I2C_STATUS_ACK_ERROR) {
      ESP_LOGE(TAG, "I2C hardware NACK detected");
      const i2c_ll_hw_cmd_t hw_stop_cmd = {
          .op_code = I2C_LL_CMD_STOP,
      };
      i2c_ll_master_write_cmd_reg(hal->dev, hw_stop_cmd, 0);
      i2c_hal_master_trans_start(hal);
      return;
    }
congyue commented 3 months ago

Not Espressif person, but won't the semaphore be released at later point? I don't see other exit point when I2C_STATUS_ACK_ERROR is hit.

congyue commented 3 months ago

OK looks like an issue that is fixed on main (after a refactor) but still exist on 5.2.2

congyue commented 3 months ago

Maybe you can try manually apply this one?

combalafra01 commented 3 months ago

OK so I have added the 'break' instead of the 'return' to the if for I2C_STATUS_ACK_ERROR and it does not solve my infinite loop problem but I think I have found the problem and knows the fix. In fact it is related to the issue #13627 I have created in April: whes slave shuts down or fails for whatever reason and is up again later, the master cannot detect it and enters an infinite loop of timeout.

The fix is to add a xSemaphoreGive(i2c_master->cmd_semphr); in the if related to the xSemaphoreTake because this xSemaphore take is in a while loop. So it if the semaphore is taken correctly on a first iteration of the loop and an error occurs with the slave, the next loop will timeout on the semaphore (because it was taken in the first loop and never released) and returns. And if we release the semaphore before returning, it seems to work.

The code that release the semaphore looks like a complex system between command senders and interupts or events called so I am not 100% sure but looks like with this semaphore release added the master now detects that the slave is back.

code to fix is the following

if (xSemaphoreTake(i2c_master->cmd_semphr, ticks_to_wait) != pdTRUE) { // Software timeout, clear the command link and finish this transaction. i2c_master->cmd_idx = 0; i2c_master->trans_idx = 0; atomic_store(&i2c_master->status, I2C_STATUS_TIMEOUT); xSemaphoreGive(i2c_master->cmd_semphr); return; }

AxelLin commented 2 months ago

OK looks like an issue that is fixed on main (after a refactor) but still exist on 5.2.2

v5.2 fix: 34abdaea46ef44c585d1c66772cd336eac618df1

AxelLin commented 2 months ago

code to fix is the following

if (xSemaphoreTake(i2c_master->cmd_semphr, ticks_to_wait) != pdTRUE) { // Software timeout, clear the command link and finish this transaction. i2c_master->cmd_idx = 0; i2c_master->trans_idx = 0; atomic_store(&i2c_master->status, I2C_STATUS_TIMEOUT); xSemaphoreGive(i2c_master->cmd_semphr); return; }

Fixed by: 4ce9b783f3c658d1752d836505c0fd46d67a3d52