espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.87k stars 7.32k forks source link

I2C Slave Driver: Predictable Read Response when TX FIFO Empty (IDFGH-12346) #13377

Open haydenroche5 opened 8 months ago

haydenroche5 commented 8 months ago

Is your feature request related to a problem?

There's a known problem with the I2C slave driver where it responds to master reads with garbage if the slave's TX FIFO is empty:

This is easily reproducible with the following application running on an ESP32-S3:

#include "driver/i2c_slave.h"

void app_main(void)
{
    i2c_slave_config_t slaveConfig = {
        .sda_io_num = 8,
        .scl_io_num = 9,
        .addr_bit_len = I2C_ADDR_BIT_LEN_7,
        .clk_source = I2C_CLK_SRC_DEFAULT,
        .i2c_port = 0,
        .send_buf_depth = 256,
        .intr_priority = ESP_INTR_FLAG_LOWMED,
        .slave_addr = 0x17,
    };

    i2c_slave_dev_handle_t slaveHandle;
    esp_err_t espErr = i2c_new_slave_device(&slaveConfig, &slaveHandle);
    if (espErr != ESP_OK) {
        printf("i2c_new_slave_device failed!\n");
        return;
    }

    printf("I2C ready\n");

    while (true){
        vTaskDelay(1000 / portTICK_PERIOD_MS);
    }
}

If I then connect another MCU to the I2C bus as the master and have it read 2 bytes from the slave, the I2C slave responds with garbage:

garbage_response

In this case, the bytes are 0x2F 0xFF, but they can be anything. It's not predictable (i.e. the response is garbage). For the protocol my team has built on top of I2C, this is not acceptable. For our case, if there's nothing available to read, the slave needs to respond with 2 0x00 bytes. Until recently, there was no way to do this with the IDF.

Describe the solution you'd like.

The user should be able to respond to this scenario in a predictable way.

Describe alternatives you've considered.

Now, there is a way, but it still requires modifying the IDF code itself. The slave driver now supports clock stretching and its associated interrupts. Perplexingly, this isn't supported for the ESP32-S3, even though there's a whole section about it in the technical reference manual. So the first thing I had to do was alter soc_caps.h:

diff --git a/components/soc/esp32s3/include/soc/soc_caps.h b/components/soc/esp32s3/include/soc/soc_caps.h
index ddc9edb9e8..f37e005ed8 100644
--- a/components/soc/esp32s3/include/soc/soc_caps.h
+++ b/components/soc/esp32s3/include/soc/soc_caps.h
@@ -209,6 +209,7 @@
 #define SOC_I2C_SUPPORT_10BIT_ADDR  (1)
 #define SOC_I2C_SLAVE_SUPPORT_BROADCAST    (1)
 #define SOC_I2C_SLAVE_SUPPORT_I2CRAM_ACCESS   (1)
+#define SOC_I2C_SLAVE_CAN_GET_STRETCH_CAUSE (1)

 /*-------------------------- I2S CAPS ----------------------------------------*/
 #define SOC_I2S_NUM                 (2U)

Now, the technical reference manual also states that there's an interrupt that should fire when the slave needs to respond to a master's read request, but the TX FIFO is empty. Quoting section 27.4.3:

RAM being empty: The slave is sending data, but its TX RAM is empty

That's exactly what I was looking for. However, if you install a clock stretching callback using i2c_slave_register_event_callbacks, the clock stretching reason register NEVER indicates this as the reason, even when the TX FIFO is in fact empty. The only reason you'll ever see in this case is I2C_SLAVE_STRETCH_CAUSE_ADDRESS_MATCH. There's never a I2C_SLAVE_STRETCH_CAUSE_TX_EMPTY.

However, we can work with this. Again quoting that section from the manual:

Address match: The address of the slave matches the address sent by the master via the SDA line, and the R/W bit is 1

When the R/W bit is 1, that means the master is doing a read. So we can use the clock stretching interrupt to detect when the master is reading, check if the TX FIFO is empty, and, if it is, stuff in 0 bytes, as required by our protocol. I hoped this would be achievable using the clock stretching callback, but it's not. I need access to i2c_ll_get_txfifo_len to check if the FIFO is empty and i2c_ll_write_txfifo to push 0s into the FIFO if it is. Neither of these are accessible via application code. So I had to modify s_i2c_handle_clock_stretch directly, like this:

diff --git a/components/driver/i2c/i2c_slave.c b/components/driver/i2c/i2c_slave.c
index e3bae30bc4..8f609823e3 100644
--- a/components/driver/i2c/i2c_slave.c
+++ b/components/driver/i2c/i2c_slave.c
@@ -40,16 +40,29 @@ static const char *TAG = "i2c.slave";

 static esp_err_t i2c_slave_bus_destroy(i2c_slave_dev_handle_t i2c_slave);

+DRAM_ATTR static const uint8_t zeros[2] = {0x00, 0x00};
+
 #if SOC_I2C_SLAVE_CAN_GET_STRETCH_CAUSE
 static IRAM_ATTR void s_i2c_handle_clock_stretch(i2c_slave_dev_handle_t i2c_slave)
 {
     i2c_hal_context_t *hal = &i2c_slave->base->hal;
+    i2c_slave_stretch_cause_t stretch_cause;
+    i2c_ll_slave_get_stretch_cause(hal->dev, &stretch_cause);
+
     if (i2c_slave->callbacks.on_stretch_occur) {
-        i2c_slave_stretch_event_data_t evt = { 0 };
-        i2c_hal_context_t *hal = &i2c_slave->base->hal;
-        i2c_ll_slave_get_stretch_cause(hal->dev, &evt.stretch_cause);
+        i2c_slave_stretch_event_data_t evt = { .stretch_cause = stretch_cause };
         i2c_slave->callbacks.on_stretch_occur(i2c_slave, &evt, i2c_slave->user_ctx);
     }
+
+    if (stretch_cause == I2C_SLAVE_STRETCH_CAUSE_ADDRESS_MATCH || stretch_cause == I2C_SLAVE_STRETCH_CAUSE_TX_EMPTY) {
+        uint32_t len = 0;
+        i2c_ll_get_txfifo_len(hal->dev, &len);
+        bool empty = (len == SOC_I2C_FIFO_LEN);
+        if (empty) {
+            i2c_ll_write_txfifo(hal->dev, zeros, sizeof(zeros));
+        }
+    }
+
     i2c_ll_slave_clear_stretch(hal->dev);
 }
 #endif

Now, the response to a master read when there's nothing to send is 0x00 0x00, as required by our protocol:

predictable_response

Additional context.

I don't know how you all want to handle this, but this is clearly something users have wanted over the years. My team's products all have to act as I2C slaves, and our SKUs that use ESP32 have been hamstrung by this deficiency in the IDF for a long time now. I would just ask that you consider making what I've done above possible in application/user code, without requiring any modifications to the IDF itself. Our application frequently reads and writes to external PSRAM, so we are subject to long delays where the I2C slave TX FIFO will inevitably become empty, as the code that feeds this FIFO is also in PSRAM.

zfields commented 7 months ago

@igrr Can you offer any advice here?

My apologies, I wouldn't normally @ you directly, but we are trying to make a decision about the future of a product. It would be nice to hear from someone on the Espressif side before we make any decisions.

mythbuster5 commented 7 months ago

I have already have a solution and it's now in reviewing internally. If you are urgent. I can provide a patch for you late this day. @haydenroche5

haydenroche5 commented 7 months ago

I have already have a solution and it's now in reviewing internally. If you are urgent. I can provide a patch for you late this day. @haydenroche5

Sure I'd be interested in the patch.

mythbuster5 commented 7 months ago

i2c_stretch_patch.txt Try is if it's meet your requirement or not. Or any suggestion.

haydenroche5 commented 7 months ago

Thanks @mythbuster5! I haven't had a chance to try the patch out yet, but just looking at the diff, it looks promising.

haydenroche5 commented 7 months ago

I noticed a potential problem in the patch, which is that in test_i2c_stretch.c, the stretch callback, test_i2c_stretch_callback, calls functions not in IRAM:

So this could cause a stall if these functions are in PSRAM, I think.

schveiguy commented 3 months ago

Am running into the same problem. Thanks for posting the code changes. I'm using the S2, but it too lists clock stretching as being supported. But I think I can port your changes to the S2 HAL.

However, in the S2, the clock stretch reason function returns void:

https://github.com/espressif/esp-idf/blob/d7ca8b94c852052e3bc33292287ef4dd62c9eeb1/components/hal/esp32s2/include/hal/i2c_ll.h#L310-L313

Unsure why that is. I'm going to at least try to interact with the hardware as specified...

Without clock stretching it's not a viable slave device.

schveiguy commented 3 months ago
i2c_ll_slave_clear_stretch(hal->dev);

Brand new to this chip, but does this clear the stretch? If so, you have to put in the bytes before you do this. Isn't it better to clear the stretch interrupt? Or am I misreading what this does?

EDIT: OK, I think this is already being done by the ISR. But yeah, I don't see why you clear the stretch condition here.

schveiguy commented 3 months ago

OK, now I see, that's what's in the code already. I think this is a mistake, at least for the TX buffer empty clock stretch. The correct way to clear this is to put in bytes to send.

I think I get the picture now of what needs to happen. If I can get this to work, I'll post back.

haydenroche5 commented 3 months ago

@schveiguy For what it's worth, I ended up giving up on trying to get the driver to work for our use case. I pulled in the Arduino ESP32 I2C slave driver code, and immediately recognized that the code addressed several of the bugs/problems in the IDF driver. The support for clock stretching is much better, too. It would be great to get the IDF driver on par with the Arduino one, but if time is of the essence, I recommend checking out the Arduino driver.

schveiguy commented 3 months ago

I think it depends on how the data to transmit/receive is populated. Since I think in IDF, you don't directly write to the hardware buffer, you write to a queue, which is then pulled into the buffer on a separate task, then you are going to miss the window.

Between the ACK of the read command and the clock for the first data byte, it's only one clock cycle. You have to stretch the clock until that hardware buffer is filled. I used to work on a device (STMicro 8-bit) that basically interrupted until you put in the next byte to send, and that is when it knew to stop stretching the clock.

TBH, the hardware spec is very complicated, but I think this might work. Going to try it now.

It is good to know there is another option that might work though, thanks for pointing it out @haydenroche5!

schveiguy commented 3 months ago

Ugh it just gets worse and worse -- so let's say I receive data on the slave. I pass in a buffer to fill, and a length. But on the callback it doesn't tell me how much data the master actually sent! it just gives me a buffer pointer.

I'm looking at the example on the espressif web site, and it creates a variable for the received data length but never uses it.

No serious person is using this driver for anything, is my assertion.

EDIT: Yep: #13954

schveiguy commented 3 months ago

So, I ended up writing my own custom driver. The driver in IDF is completely incorrect. In addition to the above mentioned issues:

  1. Interrupts are cleared before being handled
  2. Clock stretching is cleared before being handled
  3. Interrupts are only enabled when code has called read/write. i.e. if a master addresses your device before you call read, there is a chance you will miss the message
  4. When data is in the buffer, and you call read, you don't get data back until an interrupt occurs!
  5. To that end, stale data in the buffer is sitting there waiting to mess up your next transaction.

It looks like the driver was written by someone who doesn't understand how i2c slave events happen -- you need to handle events as the master addresses you, not with calls from outside the ISR.

My new driver is very simple:

  1. Interrupts when stop condition is detected, and calls a callback with the result of the transaction
  2. Interrupts when clock stretching occurs (yes, it works), and calls a callback requesting more data to send. It does not clear the clock stretch until more data is put in the fifo.
  3. If when the slave is addressed for sending (i.e. a slave read), if there is data that we sent, the callback for received data is called first.
  4. Interrupts are cleared after the handlers are all called (otherwise you get duplicate interrupts).

And that's it. All hardware and driver buffers are cleared after the callback for completion. If you want the data outside the ISR, it's on you to store it for later processing.

Works correctly, will be building the rest of my code on top of it. I don't know if I can contribute it back (is a project for a client), but of course, the entire slave API would have to change.

But if anyone has questions about how it works, happy to answer. The hardware spec is very cryptic and not a lot of examples that match interrupts to the protocol. It also has some unexpected things -- like there is no interrupt when the device is addressed (unless it's a slave read), and you get an interrupt for a stop condition even if your device wasn't addressed. I basically had to do a lot of trial and error.

VectorDominik commented 2 months ago

Hi Schveiguy I read this thead carefully and I face the same problems. I had a running I2C-Slave code on ESP-IDF V5.0 with an ESP32-C3. Then I updated the code using the new I2C slave driver from ESP-IDF v5.3. Is there any chance to share your custom, updated driver? So far I was not able to fix it myself. Problems:

  1. I2C Slave returns random value instead of NACK when I don't want to send anything but the I2C master addressed me with an I2C Master-Read Command.
  2. The I2C slave receive function with i2c_slave_receive and xQueueReceive sometimes work but not consistently. There is no function to empty the fifo anymore so the routine does not work smoothly anymore.

Thank you for your hints.

schveiguy commented 2 months ago

Is there any chance to share your custom, updated driver? So far I was not able to fix it myself.

I can ask if it's OK to share. This is part of a proprietary project, not under my control.

The stock esp-idf i2c slave driver is not usable, the design of it is fundamentally flawed (neither the 5.3 version or the previous design). If you want to use the device as a slave, you will need to write your own driver or find a driver code that is properly designed. There is mention up above that an arduino driver is available that might help, but I haven't looked at it.

VectorDominik commented 2 months ago

That would be great if you could ask. I agree with you that the I2C Slave always had some problems with ESP-IDF. I have used the I2C Slave with previous esp-idf versions with some hacks that it worked well but there have always been some cases where errors or NACK have not been handled correctly. There are several problems. I will look into the Arduino I2C slave driver in order to see if I can use it on my esp-idf project. I hope the driver will be fixed on later version of esp-idf. I already lost too much time on that part of the software.

apppie123 commented 2 months ago

Thanks for these posts, apparently it isn't me. I've been wasting two days on this.

I hope one day it'll work, since this is a stopper for me to use the esp32 in this project.

schveiguy commented 2 months ago

I got the OK to post the driver. I need to make sure it has appropriate copyright (will be Apache licensed), and make sure there's nothing specific to this project in there. It is very specific to the esp32s2 chip I'm using (Because the HAL functions for clock stretching are not correct). But you can probably rewrite it to your chip. I'll try to get it up this weekend.

VectorDominik commented 2 months ago

That's great. Thank you so much for your support. Does your updated driver only works correctly with clock stretching enabled?

schveiguy commented 2 months ago

Correct. A slave driver won’t work without stretching unless you have all the data to send already in the fifo before the master addresses you.

sorry I didn’t get the driver up this past weekend. I unfortunately am on a trip and my work laptop has the code. I’ll see if I can get it. But it may be next week before I can post it.

apppie123 commented 2 months ago

I tried this one this weekend and works for my application: https://github.com/leeebo/esp_i2c_slave

Its a modified version of the arduino one.

VectorDominik commented 2 months ago

Thanks again for your efforts. I have the following problem: The I2C master (which is not under my control) does not support clock stretching. So it just starts reading. The I2C Master continuosly "polls" with "Master Read" Commands and if the I2C Slave (ESP32-C3) wants something, it can send an ACK on the Master Read Request, otherwise a NACK. What happens now is that the ESP32-C3 slave also sends an ACK and returns a random byte when the txbuffer is empty. I was unable to avoid this. Earlier I tried to delete and re-init the complete I2C driver after each command. But it also caused problems as the tx buffer can only be filled when the I2C driver is already initiated. I am quite sure that the clock stretching method would solve the problem but in my case it might not work. Is there any other possibility to temporatily disable the I2C-Slave driver to force the return of a NACK until the buffer is filled and ready to be clocked out?

Here a I2C log image what happens after a Master read on Address 0x61. The ESP32-C3 immediately sends an ACK and returns the just received address byte (0x61 read = 0xC3):

image

I think this is an error in the ESP32-C3 I2C Slave as I would expect that a NACK is returned when the buffer was not filled.

schveiguy commented 2 months ago

Sorry for the delay, I did not get a chance to post the code while on the trip, I will post it soon.

The I2C master (which is not under my control) does not support clock stretching.

@VectorDominik I don't think this is possible. I2C is open drain with pullup signal, which means that any device on the bus can pull down the clock. It's impossible for the master to run the clock signal when someone else is pulling on it.

Your trace looks like the clock is being pulled down for the clock stretch (that long low period in the clock), but then it releases before the data is ready to be transmitted. This is consistent with the current released driver -- it clears the clock stretch before filling the FIFO with data (see here). You need to clear the the clock stretch only after you have data to send. As soon as the clock is released, the master will start driving the clock, expecting the slave is transmitting data.

This is always how the master/slave negotiation works! It's inherent in I2C.

schveiguy commented 2 months ago

I tried this one this weekend and works for my application

Yes, that driver has the key difference that it only clears the clock stretching after data is written. It's more complex than mine, mine does everything from the ISR, and it's on you to build a queue for the messages how you want to process them, or handle them fully in the interrupt. But it is similar in how it handles the clock stretching (good validation that I found the right mechanism). However, it is also clearing the interrupt masks before handling, so you will get repeat "ghost" interrupts.

schveiguy commented 2 months ago

Just published my driver code: https://github.com/schveiguy/espi2cslave/

Please focus any questions about it in that repo.

VectorDominik commented 2 months ago

Thank you very much. I will give it a try this week. Very appreciated that you shared your driver code!

zfields commented 2 months ago

@igrr Are you guys working on fix for the ESP-IDF any time soon?

Regardless, this is becoming a useful reference guide, and may help influence or guide the official implementation.