stickbreaker / arduino-esp32

Arduino core for the ESP32
38 stars 23 forks source link

Random Timeouts #38

Closed chinswain closed 6 years ago

chinswain commented 6 years ago

Board: LOLIN32 Core Installation/update date: Main + latest stickbreaker patched i2c files IDE name: Arduino IDE Flash Frequency: 40Mhz Upload Speed: 115200

Description:

Several times an hour I get the below error on debug - even with timeouts set to 2000ms. I'm using a pair of i2c extenders (P82B715 + 6 metres of double shielded cat 6a) which may be introducing some latency.

to

Is there anything that can be done software side? rise time is below 1 μs (0.400 μs) and the cable is earthed ESP32 end, signal looks clean and stable over time.

Sketch:


#include <Wire.h>
#include <Adafruit_ADS1015.h>

Adafruit_ADS1115 ads;

void setup(void) 
{
  Serial.begin(115200);

  ads.begin();
}

void loop(void) 
{
  int16_t adc0, adc1, adc2, adc3;

  adc0 = ads.readADC_SingleEnded(0);
  adc1 = ads.readADC_SingleEnded(1);
  adc2 = ads.readADC_SingleEnded(2);
  adc3 = ads.readADC_SingleEnded(3);
  Serial.print("AIN0: "); Serial.println(adc0);
  Serial.print("AIN1: "); Serial.println(adc1);
  Serial.print("AIN2: "); Serial.println(adc2);
  Serial.print("AIN3: "); Serial.println(adc3);
  Serial.println(" ");

  delay(1000);
}

Debug Messages:

[D][esp32-hal-i2c.c:1166] i2cProcQueue(): used TimeoutRecovery: expected=0ms, actual=3ms, configured=50ms 
[E][esp32-hal-i2c.c:630] i2cDumpI2c(): i2c=0x3ffc1030
[E][esp32-hal-i2c.c:641] i2cDumpI2c(): dev=0x60013000 date=0x16042000 level=DEBUG
[E][esp32-hal-i2c.c:643] i2cDumpI2c(): lock=0x3ffc5cc8
[E][esp32-hal-i2c.c:645] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:646] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:647] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:648] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:649] i2cDumpI2c(): event=0x3ffc5e10 bits=10
[E][esp32-hal-i2c.c:650] i2cDumpI2c(): intr_handle=0x3ffc5e40
[E][esp32-hal-i2c.c:651] i2cDumpI2c(): dq=0x3ffc5dec
[E][esp32-hal-i2c.c:652] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:653] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:654] i2cDumpI2c(): byteCnt=3
[E][esp32-hal-i2c.c:604] i2cDumpDqData(): [0] 93 R STOP buf@=0x3ffc2304, len=2, pos=2, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:620] i2cDumpDqData(): 0x0000: a.                               61 c1 
[E][esp32-hal-i2c.c:968] i2cDumpInts(): 0 row  count   INTR    TX     RX
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [01] 0x0001 0x0002 0x0001 0x0000 0x00057c73
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x00057c73
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [03] 0x0003 0x0040 0x0000 0x0000 0x00057c73
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [04] 0x0001 0x0080 0x0000 0x0002 0x00057c76
stickbreaker commented 6 years ago

@chinswain This message of timeout recovery is only expressed because you have the core debug level set to DEBUG, if you reduce the Debug Level to ERROR it will not be displayed. timeout recovery debug message code:

#if ARDUHAL_LOG_LEVEL >= ARDUHAL_LOG_LEVEL_DEBUG
  uint32_t expected =(totalBytes*10*1000)/i2cGetFrequency(i2c);
  if((tAfter-tBefore)>(expected+1)) { //used some of the timeout Period
    // expected can be zero due to small packets
    log_d("used TimeoutRecovery: expected=%ums, actual=%ums, configured=%ums ",expected,(tAfter-tBefore),timeOutMillis);
    i2cDumpI2c(i2c);
    i2cDumpInts(i2c->num);
    }
#endif

There are a couple of things you could do; Delete that section of code, or edit the code to change the comparison log level from ARDUHAL_LOG_LEVEL_DEBUG to ARDUHAL_LOG_LEVEL_INFO, or reduce your Core Debug Level to Warning or Error.

In the next released version I plan on removing this message, it has served its purpose. I now have proof that my timeout recovery code works!

I created this message to verify that my driver would correctly hand SCL stretching. Most device don't use this coordination function.

Chuck.

chinswain commented 6 years ago

I see - I've been swapping pull ups and replacing cable for weeks trying to eliminate it! :)

The error is not an issue and as you say only occurs on debug, I just assumed there was a stability issue with my complex i2c setup.

Is it possible to determine why it needed to recover from a timeout?

stickbreaker commented 6 years ago

@chinswain Dissecting the relevant parts of this message:

[E][esp32-hal-i2c.c:604] i2cDumpDqData(): [0] 93 R STOP buf@=0x3ffc2304, len=2, pos=2, eventH=0x0 bits=0

You were doing a read operation on a device at 0x49 of 2 bytes.

[D][esp32-hal-i2c.c:1166] i2cProcQueue(): used TimeoutRecovery: expected=0ms, actual=3ms, configured=50ms

Based on your buss clock rate, the calculated transmission time was less than 1ms (0ms). It actually took 3ms to complete, you have the maximum timeout set to 50ms(gross timeout error limit).

[E][esp32-hal-i2c.c:968] i2cDumpInts(): 0 row count INTR TX RX [E][esp32-hal-i2c.c:971] i2cDumpInts(): [01] 0x0001 0x0002 0x0001 0x0000 0x00057c73 [E][esp32-hal-i2c.c:971] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x00057c73 [E][esp32-hal-i2c.c:971] i2cDumpInts(): [03] 0x0003 0x0040 0x0000 0x0000 0x00057c73 [E][esp32-hal-i2c.c:971] i2cDumpInts(): [04] 0x0001 0x0080 0x0000 0x0002 0x00057c76

This is the sequence of interrupt events: [01] one TX FiFo empty interrupt[0x0002] occurred at millis()=0x00057c73, the TX Fifo was loaded with 1 byte [02] one START interrupt [0x0200] occurred at millis()=0x00057c73 [03] three consecutive byte transfer interrupts [0x0040] occurred, the last happened at millis()=0x00057c73 one outbound byte, the Device Address, Two inbound bytes of received data. [04] the STOP interrupt [0x0080] occurred at millis()=0x00057c76 This is the surprising one, I don't know how the STOP signal was delayed? The STOP is generated after the read termination NAK is sent to the Slave. When the Slave device(0x49) receives the NAK, instead of sending the next byte, should immediately respond with the STOP signal. Based on this debug information, the SLAVE device held SCL low for 3ms after it received NAK, then signaled the STOP. OR: the STOP signal was generated, but the interrupt was not dispatched until 3ms later. The only reasonable explanation I can envision is some Higher Priority interrupt? Hardware interrupts are high priority, I don't know what would delay it for 3ms.

Chuck.

chinswain commented 6 years ago

I thought I would try connecting the ADS1115 directly without any of the buffering hardware. If I connect the ESP32 and any I2c slave directly over 6 metres of the shielded cat6a cable there are no timeout issues, as soon as I interface a pair of P82B715 (increases the bus current 10x) or a TCA9517 (bus extender with logic level shift, so using 5v slave side) I get the above timeouts.

Changing clock speed has no impact on the frequency of timeouts (50k to 400k) Both methods introduce some latency, could this be the cause?

stickbreaker commented 6 years ago

@chinswain I don't see how either of those devices could cause a 3ms delayed STOP signal. Neither one has any logic, they are both analog drivers. The TCA9517 spec's its delay at 250ns maximum. tca9517 delay 3ms is 3,000,000ns. 3ms is 12,000 times the max propagation delay.

I'd have to see your circuit and a oscilloscope waveform to verify where it is a hardware issue on the bus or a software priority problem.

Do you have a 4 channel 'scope? if you do, you could insert the drivers, and physically arrange the layout so that both ends can be connected to the 'scope. That would allow you to measure the delay.

As long are you are not seeing GROSS timeout errors, the i2c driver will transparently handle any of these short delays. As long as the occasional 3ms code hang doesn't cause your app problems you can ignore them.

Chuck.

chinswain commented 6 years ago

Sorry for the delay in responding Chuck.

Only a 2 channel scope at the moment, I don't think it's device related as it happens on both a ADS1115 and BME280, I've stripped the project down completely and replaced the breadboard and MCU (New LOLIN D32) and installed the latest code from the main repository (plus your modified files). I've also dropped the P82B715 for now as there are better options once I have this stable.

Using a minimal sketch to query any device is fine on the breadboard as expected, the change made that causes the regular timeouts is enabling WiFi - Which would normally lead me to think it's a power issue but I've added a 3A external source to the MCU and no change. This may be due to a recent change in the main repository (I didn't test before updating).

Are these timeouts the same as previously?

Debug Messages:

E (105) spiram: SPI RAM enabled but initialization failed. Bailing out.

[D][esp32-hal-i2c.c:1166] i2cProcQueue(): used TimeoutRecovery: expected=0ms, actual=2ms, configured=50ms 
[E][esp32-hal-i2c.c:630] i2cDumpI2c(): i2c=0x3ffc1114
[E][esp32-hal-i2c.c:641] i2cDumpI2c(): dev=0x60013000 date=0x16042000 level=DEBUG
[E][esp32-hal-i2c.c:643] i2cDumpI2c(): lock=0x3ffb239c
[E][esp32-hal-i2c.c:645] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:646] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:647] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:648] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:649] i2cDumpI2c(): event=0x3ffb9a4c bits=10
[E][esp32-hal-i2c.c:650] i2cDumpI2c(): intr_handle=0x3ffb4ad0
[E][esp32-hal-i2c.c:651] i2cDumpI2c(): dq=0x3ffb9694
[E][esp32-hal-i2c.c:652] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:653] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:654] i2cDumpI2c(): byteCnt=2
[E][esp32-hal-i2c.c:604] i2cDumpDqData(): [0] 92 W STOP buf@=0x3ffc3d22, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:620] i2cDumpDqData(): 0x0000: .                                01 
[E][esp32-hal-i2c.c:968] i2cDumpInts(): 0 row  count   INTR    TX     RX
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x00005973
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x00005973
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [03] 0x0002 0x0040 0x0000 0x0000 0x00005974
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [04] 0x0001 0x0080 0x0000 0x0000 0x00005974

[D][esp32-hal-i2c.c:1166] i2cProcQueue(): used TimeoutRecovery: expected=0ms, actual=2ms, configured=50ms 
[E][esp32-hal-i2c.c:630] i2cDumpI2c(): i2c=0x3ffc1114
[E][esp32-hal-i2c.c:641] i2cDumpI2c(): dev=0x60013000 date=0x16042000 level=DEBUG
[E][esp32-hal-i2c.c:643] i2cDumpI2c(): lock=0x3ffb239c
[E][esp32-hal-i2c.c:645] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:646] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:647] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:648] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:649] i2cDumpI2c(): event=0x3ffb9a4c bits=10
[E][esp32-hal-i2c.c:650] i2cDumpI2c(): intr_handle=0x3ffb4ad0
[E][esp32-hal-i2c.c:651] i2cDumpI2c(): dq=0x3ffb9694
[E][esp32-hal-i2c.c:652] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:653] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:654] i2cDumpI2c(): byteCnt=2
[E][esp32-hal-i2c.c:604] i2cDumpDqData(): [0] 92 W STOP buf@=0x3ffc3d22, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:620] i2cDumpDqData(): 0x0000: .                                01 
[E][esp32-hal-i2c.c:968] i2cDumpInts(): 0 row  count   INTR    TX     RX
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x0000aca6
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x0000aca6
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [03] 0x0002 0x0040 0x0000 0x0000 0x0000aca7
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [04] 0x0001 0x0080 0x0000 0x0000 0x0000aca7

[D][esp32-hal-i2c.c:1166] i2cProcQueue(): used TimeoutRecovery: expected=0ms, actual=2ms, configured=50ms 
[E][esp32-hal-i2c.c:630] i2cDumpI2c(): i2c=0x3ffc1114
[E][esp32-hal-i2c.c:641] i2cDumpI2c(): dev=0x60013000 date=0x16042000 level=DEBUG
[E][esp32-hal-i2c.c:643] i2cDumpI2c(): lock=0x3ffb239c
[E][esp32-hal-i2c.c:645] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:646] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:647] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:648] i2cDumpI2c(): error=1
[E][esp32-hal-i2c.c:649] i2cDumpI2c(): event=0x3ffb9a4c bits=10
[E][esp32-hal-i2c.c:650] i2cDumpI2c(): intr_handle=0x3ffb4ad0
[E][esp32-hal-i2c.c:651] i2cDumpI2c(): dq=0x3ffb9694
[E][esp32-hal-i2c.c:652] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:653] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:654] i2cDumpI2c(): byteCnt=2
[E][esp32-hal-i2c.c:604] i2cDumpDqData(): [0] 92 W STOP buf@=0x3ffc3d22, len=1, pos=1, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:620] i2cDumpDqData(): 0x0000: .                                01 
[E][esp32-hal-i2c.c:968] i2cDumpInts(): 0 row  count   INTR    TX     RX
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [01] 0x0001 0x0002 0x0002 0x0000 0x0000d4a6
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x0000d4a6
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [03] 0x0002 0x0040 0x0000 0x0000 0x0000d4a7
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [04] 0x0001 0x0080 0x0000 0x0000 0x0000d4a7

Sketch (Using https://github.com/adafruit/Adafruit_BME280_Library):


#include <Wire.h>
#include <Adafruit_ADS1015.h>
#include <WiFi.h>
#include <WiFiClient.h>

Adafruit_ADS1015 ads;

void setup(void)
{
  Serial.begin(115200);
  ads.begin();

  WiFi.begin("******", "******");

  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }

  delay(100);
}

void loop(void)
{
  Serial.print("AIN0: "); Serial.println(ads.readADC_SingleEnded(0));
  delay(1000);
}

Sketch (Using https://github.com/finitespace/BME280):


#include <BME280I2C.h>
#include <Wire.h>
#include <WiFi.h>
#include <WiFiClient.h>
#define SERIAL_BAUD 115200

BME280I2C bme;

void setup()
{
  Serial.begin(SERIAL_BAUD);

  while (!Serial) {}

  Wire.begin();

  while (!bme.begin())
  {
    Serial.println("Could not find BME280 sensor!");
    delay(1000);
  }

  WiFi.begin("********", "********");

  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }
  delay(200);

}

void loop()
{

  float temp(NAN), hum(NAN), pres(NAN);
  BME280::TempUnit tempUnit(BME280::TempUnit_Celsius);
  BME280::PresUnit presUnit(BME280::PresUnit_Pa);

  bme.read(pres, temp, hum, tempUnit, presUnit);

  Serial.print("Temp: ");
  Serial.print(temp);
  Serial.print("°" + String(tempUnit == BME280::TempUnit_Celsius ? 'C' : 'F'));
  Serial.print("\t\tHumidity: ");
  Serial.print(hum);
  Serial.print("% RH");
  Serial.print("\t\tPressure: ");
  Serial.print(pres);
  Serial.println("Pa");
  delay(100);
}
stickbreaker commented 6 years ago

@chinswain The main branch of Espressif-Arduino now contains my i2c fix,, there is no need to overwrite the main branch files with my files.

stickbreaker commented 6 years ago

@chinswain Those messages are just showing up because you have the Core Debug Level at DEBUG. They are showing my files are correctly handling a SCL stretching event. It is not an 'error' just debug messages proving my i2c files can correctly handle a SCL stretching event.

[D][esp32-hal-i2c.c:1166] i2cProcQueue(): used TimeoutRecovery: expected=0ms, actual=2ms, configured=50ms [E][esp32-hal-i2c.c:630] i2cDumpI2c(): i2c=0x3ffc1114 [E][esp32-hal-i2c.c:641] i2cDumpI2c(): dev=0x60013000 date=0x16042000 level=DEBUG

The [E][esp32-hal-i2c.c] show up because the Core debug level is at DEBUG and the logging message are displayed if the Core Debug level is great that or Equal to their level.

from cores/esp32/esp32-hal-log.h:


#define ARDUHAL_LOG_LEVEL_NONE       (0)
#define ARDUHAL_LOG_LEVEL_ERROR      (1)
#define ARDUHAL_LOG_LEVEL_WARN       (2)
#define ARDUHAL_LOG_LEVEL_INFO       (3)
#define ARDUHAL_LOG_LEVEL_DEBUG      (4)
#define ARDUHAL_LOG_LEVEL_VERBOSE    (5)

Since your current Core Debug Level is DEBUG, all ERROR, WARNING, INFO, and DEBUG will appear.

I would recommend you reduce the Core Debug level to ERROR. That way only detected problems will be displayed.

Also, use the Main Espressif-Arduino without my files. They are no longer necessary, because they have been incorporated into the Main repo.

Chuck.

chinswain commented 6 years ago

Great! - will new changes be done here first or all over there now?

stickbreaker commented 6 years ago

@chinswain you should use espressif-arduino. When I get SLAVEMODE functioning, I would post hit here. So, this fork is now effectively dead.

Chuck.

chinswain commented 6 years ago

Me again...

I've disabled debug, but I'm getting issues with timings with more than one i2c device on the bus. Enabling debug with all devices attached is giving 2ms timeouts consistently every 5 seconds or so per device which I guess is hammering your recovery routine. With WiFi off, no issue at all, no time outs and devices respond with no delay.

With this old fork it works perfectly with all devices on the bus, regardless of WiFi being enabled or not. (I've wiped the hardware folder and installed your fork, replacing esptool.exe with version 2.0-dev as the new 2.1 is not comparable with your fork).

So looks like a recent change in the main repository is causing an issue. I've tested a few ESP32 devices and the issue (and resolution) are constant.

The ADS1115 does not perform clock-stretching according to the datasheet, erroneous error from some WiFi code change affecting I2C timing?

stickbreaker commented 6 years ago

@chinswain Are you seeing 'timeout Recovery' or 'gross timeout'? 'timeout Recovery' should be transparent, (behind the scenes). You would only get notification if you have 'DEBUG' level selected.

What are your timing 'issues'?

If WiFi is delaying the i2c ISR interrupts, as long as the any delay is less than the hardware timeout (13.2ms) or the Tx,Rx fifo's don't underFlow, overFlow it should not cause a problem.

What problems are you seeing?

Chuck.

chinswain commented 6 years ago

@stickbreaker I'm using the commit from the main repo made on the 6th of April (1cf4270) which works perfectly so I can stick with that if no one else is experiencing any issues.

I've downloaded and setup every commit going back a few months (patching with your latest i2c files), everything after commit 69f72ec causes the timeouts, everything before is fine.

I can suppress the timeouts by changing from debug as you say but I'm noticing hourly wifi disconnects and missed pulse reads on a MCP23017.

Debug Messages:

[E][esp32-hal-i2c.c:641] i2cDumpI2c(): dev=0x60013000 date=0x16042000 level=DEBUG
[E][esp32-hal-i2c.c:643] i2cDumpI2c(): lock=0x3ffda670
[E][esp32-hal-i2c.c:645] i2cDumpI2c(): num=0
[E][esp32-hal-i2c.c:646] i2cDumpI2c(): mode=1
[E][esp32-hal-i2c.c:647] i2cDumpI2c(): stage=3
[E][esp32-hal-i2c.c:648] i2cDumpI2c(): error=5
[E][esp32-hal-i2c.c:649] i2cDumpI2c(): event=0x3ffda6d0 bits=112
[E][esp32-hal-i2c.c:650] i2cDumpI2c(): intr_handle=0x3ffda208
[E][esp32-hal-i2c.c:651] i2cDumpI2c(): dq=0x3ffda2d4
[E][esp32-hal-i2c.c:652] i2cDumpI2c(): queueCount=1
[E][esp32-hal-i2c.c:653] i2cDumpI2c(): queuePos=0
[E][esp32-hal-i2c.c:654] i2cDumpI2c(): byteCnt=0
[E][esp32-hal-i2c.c:604] i2cDumpDqData(): [0] 92 W STOP buf@=0x3ffc3402, len=3, pos=3, eventH=0x0 bits=0
[E][esp32-hal-i2c.c:620] i2cDumpDqData(): 0x0000: ...                              01 c1 83 
[E][esp32-hal-i2c.c:968] i2cDumpInts(): 0 row  count   INTR    TX     RX
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [01] 0x0001 0x0002 0x0004 0x0000 0x0009ac6f
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x0009ac6f
[E][esp32-hal-i2c.c:971] i2cDumpInts(): [03] 0x0001 0x0020 0x0000 0x0000 0x0009ac6f

[I][esp32-hal-i2c.c:1002] i2cProcQueue(): Bus busy, reinit

Unfortunately the commit was quite large, including an IDF and epstool update, I can't see anything obvious that would cause issues to I2c when WiFi is enabled.

https://github.com/espressif/arduino-esp32/commit/69f72eca849a44688e2da38028eb9aa707e91f08

stickbreaker commented 6 years ago

@chinswain This error message is showing an arbitration loss

[E][esp32-hal-i2c.c:971] i2cDumpInts(): [03] 0x0001 0x0020 0x0000 0x0000 0x0009ac6f

Do you have a complex i2c bus with multiple masters? If not, the i2c peripheral detected a mismatch between what it drove SDA and what is readback. An Arbitration failure is detected when the SDA is LOW when it is should be HIGH. This could be cause by a weak pullup or some type of induced glitch.

What is your electrical circuit? (pullups, bus length, organization(star, bus), logic level converters) What gpio's are you using? What is the bus speed? (default 100kHz)

Chuck.

stickbreaker commented 6 years ago

@chinswain with a closer look, the Arbitration loss happened during the Device Id output. you were doing a 3 byte write to device 0x49 of (0x01 0xC1 0x83).

[E][esp32-hal-i2c.c:968] i2cDumpInts(): 0 row count INTR TX RX [E][esp32-hal-i2c.c:971] i2cDumpInts(): [01] 0x0001 0x0002 0x0004 0x0000 0x0009ac6f [E][esp32-hal-i2c.c:971] i2cDumpInts(): [02] 0x0001 0x0200 0x0000 0x0000 0x0009ac6f [E][esp32-hal-i2c.c:971] i2cDumpInts(): [03] 0x0001 0x0020 0x0000 0x0000 0x0009ac6f

[I][esp32-hal-i2c.c:1002] i2cProcQueue(): Bus busy, reinit

This shows 4 bytes loaded[01] for output(1 id + 3 data), as soon as the peripheral was started [02], the peripheral detected an arbitration fault [03]. game over from there.

After the arbitration failure the i2c bus was still in a busy state, the next Wire() call saw the bus failure and went through a repair cycle. I can't see if the repair succeeded, (log was cutoff)

Chuck.

chinswain commented 6 years ago

Hi Chuck, Not now - it's super simple for testing.

To simplify things for testing this issue I have BME280 and ADS1115 breakout boards (MCP23017 removed) with pull ups removed, connected to the default pins (I tried others) 21 & 22 on a LOLIN D32 board. This is on a breadboard with short jumpers and 2.4K pull-ups.

It's running at 100kHz but I've tried other speeds, 400kHz seems to reduce the frequency of issues (especially WiFi disconnects) and a low speed of 10000Hz has the same affect.

No converters and a simple bus. I've left it overnight with https://github.com/espressif/arduino-esp32/commit/1cf42702dd9fb53664844e1019cd1a25c23ca757 and no timeouts or any WiFi disconnection.

I don't think it's worth spending much time on unless others report issues.