espressif / arduino-esp32

Arduino core for the ESP32
GNU Lesser General Public License v2.1
13.32k stars 7.36k forks source link

[ESP32] I2C Interrupt problem since uprade to Espressif 1.1.0 #1588

Closed cyberman54 closed 5 years ago

cyberman54 commented 6 years ago

I'm using u8x8 on ESP32 boards, programmed with Arduino Espressif32 core. I2C display handling using driver U8X8_SSD1306_128X64_NONAME_HW_I2C

This worked perfectly until last days the Espressif32 core was updated from v1.0.2 to v1.1.0.

Since then i get this error during runtime: [E][esp32-hal-i2c.c:594] i2c_isr_handler_default(): eject int=0x0, ena=0x0

And the display write speed is significantly reduced.

I'm not sure if this is a problem with u8x8 or the Espressif32 core. I'm not using any other I2C communication on the board.

Cross reference to u8x8: https://github.com/olikraus/u8g2/issues/646#issuecomment-402372457

cyberman54 commented 6 years ago

Looks like this issue for me is indeed a side effect from versioning in platformio:

https://github.com/platformio/platform-espressif32/commit/a64488026cca484ec2f2210a2476cfd24d4a9772

The arduino version was updated from 1.5.3 to 1.6.0 on July2, 2018. So this could be the reason, why i ran in that i2c speed issue with u8g2.

Now we need to find the root cause of that speed issue.

stickbreaker commented 6 years ago

@cyberman54 @bartoszbielawski Definitely a clock speed issue.

stickbreaker commented 6 years ago

@cyberman54 @bartoszbielawski I changed the pattern displayed on my screen, I fill the screen with the count, once the screen is no longer changing it does not generate the error.


void displayThread(void*)
{
    Wire.begin(OLED_SDA,OLED_SCL,100000);
    Adafruit_SSD1306 display(OLED_RST);
    display.begin(SSD1306_SWITCHCAPVCC, 0x3C); // initialize with the I2C addr 0x3D (for the 128x64)
    display.setTextColor(WHITE);
    display.setTextSize(1);
    display.setFont();
    int x = 0;
    uint32_t speed=100000, newSpeed=0, oldSpeed;
    oldSpeed = speed;
    display.clearDisplay();
    display.setCursor(0, 0);
    while (true)
    {
//        display.clearDisplay();
//        display.setCursor(50, 50);
        display.printf("%d ", x++);
        display.display();
        if((x & 0x0f)==0)Serial.printf("%d\n", x);
        while(Serial.available()){// adjust speed
          char ch = Serial.read();
          if (ch=='+') newSpeed = oldSpeed + 25000;
          if (ch=='-') newSpeed = oldSpeed - 25000;
          if((ch>='0')&&(ch<='9')) {
            newSpeed *=10;
            newSpeed += ch-48;
          }
          if(ch =='\n'){
            speed=newSpeed;
            newSpeed= 0;
            }
          if(speed != oldSpeed){
            Wire.setClock(speed);
            Serial.printf("speed=%d ",speed);
            oldSpeed = speed;
            display.clearDisplay();
            display.setCursor(0, 0);

          }
        }
        delay(2);
    }
}

void setup() 
{
    Serial.begin(115200);
    xTaskCreate(displayThread, "DT", 8192, NULL, 5, NULL);    
}

void loop() {
    delay(1);
}

This could be a POWER issue.

I have to setup a breadboard and plugin my scope.

Try this code. you can change the clock rate from the serial monitor.

Chuck.

cyberman54 commented 6 years ago

Allright, for my app it is no problem to go back to 100khz. I need to find out how can i do it with u8g2.

I never knew that it speeded up with the lastet Espressif32 upgrade. What an crazy side effect...

stickbreaker commented 6 years ago

@cyberman54 Well, I seem to have hosed this display, my viewport is only the first 8 rows, I'm going to dig in to the spec sheet and go through all of the init commands. The other seven rows of the screen had a corrupted image, then I powered it off for 5 minutes now it just has a random dot pattern.

It is doing some kind of horizontal scroll instead of incrementing to next line.

I can still make it create the 'eject' error so it is still usable for this problem.

Chuck.

stickbreaker commented 6 years ago

@cyberman54 Recovered the display, the AdaFruit library does not completely configure it. I have reduced the occurrence of 'eject' by increasing the fifo thresholds with increased clock rate.
The board is unstable with clock>400k

Tomorrow I'll look at the electrical signals, I'm thinking we are also fighting reflections and weak pullups.

I just posted a pr #1625 that contains my changes so far. It is a work in progress

Chuck.

stickbreaker commented 6 years ago

@cyberman54 @bartoszbielawski Here is my modified test code, It is a hack. HelTec test

Chuck.

stickbreaker commented 6 years ago

@helmut64 me-no-dev merged in my interrupt code #1539 Jun 27.

This problem is overactive debug message. The underlying spurious interrupts can be handled by ignoring them. I am working on adjusting the driver so that it does not create circumstances where spurious interrupts are generated.

There is one major difference between my code and the original code: Any ReSTART operation will be queued until the next STOP operation:

Wire.beginTransmission(i2caddr);
Wire.write(highbyte); // setup read addres for EEPROM
Wire.write(lowbyte);
uint8_t err = Wire.endTransmission(false);  //ReSTART
if(err!=0 ){
   Serial.printf(\"Error setting read addess in EEPROM =%d (%s)\n", Wire.lastError(),Wire.getErrorText(Wire.lastError()));
// err will always return 7 in this circumstance to identify this operation is queued (I2C_ERROR_CONTINUE)
} else {
  Wire.requestFrom(i2caddr,10); 
  while(Wire.available()){
    Serial.printf(\"0x02x \",Wire.read());
    }
  Serial.println();
  }

So, this code will not correctly display the EEPROM contents the (err != 0) needs to be (err !=7). Any ReSTART operation will return 7, so any code that uses ReSTART needs to be adjusted.

Chuck.

stickbreaker commented 6 years ago

Wow, @helmut64 your comments disappeared?

Chuck.

stickbreaker commented 6 years ago

This is getting weirder and Weirder! The number of 'eject' occurrences is varies with clockrate, the Pullups.

With the Bare HelTec WiFi Lora 32, the OLED does not function correctly with clock rates above 325khz, if I add 3.3k pullups, I can get reliable communications at 425khz. Above that the the display fritz's out.

When the display starts fritzin, I start seeing the 'eject' errors. I can't tell which one is comes first.

With and optimized ssd1306 library, I can get 40 frames per second update to the screen.

I think these interrupts are caused by glitches on the bus. I think the OLED is doing something. I haven't figured out how to catch these glitches, In my latest run at 430khz with 3.3k pullups, I had an error at refresh 9888, it is currently at 34800 with not errors between. Roughly 26.155MB of data without any problems.

Chuck.

cyberman54 commented 6 years ago

@stickbreaker i don't think it's the OLED, because i see the same effect on other boards (TTGO) which have similar OLED, but from different manufacturer.

helmut64 commented 6 years ago

@stickbreaker I have not done anything, I don't know why my comment is gone, grade? Maybe I am not allowed to have our radio shuttle link here, strange. Anyways: Of course I know the ESP32 Heltec LoRa very well, I had no display problems so far. We can talk about this off-topic when you like. I will investigate into the I2C problem by simply switching back the ESP32 Arduino releases until it works again. I reviewed the checkins but cannot find possible errors. I keep you updated. Helmut

cyberman54 commented 6 years ago

@stickbreaker I can confirm: after setting the i2c speed to 100kHz and updating espressif32 core 1.1.x to commit 28a410d which includes pr #1625 i don't get any more interrupt injections. But the OLED speed now is signifant slower as it was before, with espressif32 core 1.0.2 @100 kHz.

In verbose log level mode the only log message i get from the i2c subsystem is

[V][esp32-hal-i2c.c:1362] i2cSetFrequency(): threshold=3

What's the meaning of this?

cyberman54 commented 6 years ago

@stickbreaker i checked the specs of a SSD1306 display, it says 2,5us is minimum i2c clock cycle the display can handle. So, 400khz is maximum for SSD1306 operated with i2c interface.

cyberman54 commented 6 years ago

@stickbreaker comparing my application code running on espressif32 core v1.0.2 with latest v1.1.x including your pr, i studied that the arduino main loop task seems to run much slower with v1.1.x than 1.0.2. The main loop of my app is a state machine, which besides other tasks blinks a LED. When the code is compiled with v1.0.2 the LED flashes quickly in a short cycle as i programmed the timer. When i compile it with v1.1.x LED flashing is significantly slowed down, maybe by 5 to 10 times.

It looks like for some reason the code processing speed went down. I assume that this is a starving effect caused by another RTos task which consumes much more cpu time under v1.1.x that v1.0.2.

Maybe the same root cause affects i2c timing. Perhaps the i2c issues discussed here are a side effect?

stickbreaker commented 6 years ago

@cyberman54 the threshold is the trip points for tx and rx fifo interrupts. I added code to adjust the fifo full and fifo empty points. Prior it was fixed. Inside i2cSetFrequency() it calculates where to set trip points. I was noticing problems while debugging. (Too many log messages) causing tx glitches) I had tx threshold set to 2.

I am working on reducing interrupt calls, maybe my i2c is part of the preformance hit.

Do any of you use AdaFruit's SSD1306 library?

I made a modified version that increases the display rate. I'll add it to my repo if anyone is interested.

Chuck.

cyberman54 commented 6 years ago

@stickbreaker currently i'm using u8g2 because of it's lightweight footprint of u8x8 class, since my app does not need graphics, just plain text.

bartoszbielawski commented 6 years ago

@stickbreaker Using your copy of the library at 100 kHz works fine - but it was fine before as well. Going to 400 kHz makes the eject message to appear sometimes but not very often (< 1/s). I should probably test it with the original test program instead of my code...

CelliesProjects commented 6 years ago

@stickbreaker Hi Chuck, I am interested in that modified SSD1306 lib.

stickbreaker commented 6 years ago

@CelliesProjects it's up. at SSD1306 library repo it is just a replacement for AdaFruit_SSD1306, it needs AdaFruit_GFX to function.

Chuck.

CelliesProjects commented 6 years ago

@stickbreaker Thanks. I will test the lib somewhere this week. Will keep you posted.

stickbreaker commented 6 years ago

@cyberman54 @bartoszbielawski I think I fixed it. I changed the i2c interrupt driver to no longer need an interrupt for each byte move. This has reduced the interrupt load dramatically.

Now I only need to service START, STOP, FIFO Fills, and errors.

I have my modified SSD1306 library updating the screen in a blur, I am seeing 259f/s at 1,000,000hz. EDIT: I just realized my quoted f/s is not completely accurate. I modifed SSD1306 to only send the modified bits to the screen, so it is not completely updating the screen at every update, only the changed rectangle.

This is with an 18" i2c bus with 6 24LCxx EEPROMS (with 3.4k pullups) and the SSD1306 OLED(10k pullups).

The signal is an ugly sawtooth waveform, with crosstalk.

i2c1mhz

But, it works, On this test sequence I have ran 389,000 screen updates at 1MHz with NO 'eject' errors, I only saw 4 NAK on a frame update.

I need to clean up my debug output. Should have this update as a pull request by the week end! Yea!

Chuck.

stickbreaker commented 6 years ago

Decreasing my debug output caused the screen to wig out, at 1MHz, the I2C subsystem is not reporting any errors, but, the display is now spazing out at 900kHz. They only rate it at 400kHz so, anything above that is a bonus!

Chuck.

me-no-dev commented 6 years ago

@stickbreaker ping me on Gitter please :)

cyberman54 commented 6 years ago

@stickbreaker thanks for your effort! Sounds like next week i can migrate my code to Espressif32 core v1.1.x 😀

helmut64 commented 6 years ago

@stickbreaker Hi Chuck, I investigated into the problem that I2C does not work anymore for my devices (e.g.: Adafruit Si7021, and Rodan DS3231). This are very basic devices, I tried to change the wire speed to 100 kHz, no luck.

Up to the 24th of June (Git 7abd586) it works great. Starting with the IDF update 27-June (Git a59eafb) it fails.

You did there a lot of I2C changes, unfortunately my gdb ESP32 setup does not work jet on my Mac, how I can help to fix this?

Regards Helmut

stickbreaker commented 6 years ago

@helmut64 there is one change that may be cause you problems, look through your code for a ReSTART operation:

Wire.endTransmission(false);
// or
Wire.requestFrom(id,len,false);

The ESP32 cannot support a naked ReSTART, the hardware has to have a complete i2c transaction or it goes into a timeout cascade. Complete i2c transactions are bounded by START and STOP. ReSTART operations exist inside. START -> ReSTART -> STOP

This requires that any Wire() operations that specifies a ReSTART has to be queued until an operation with a STOP is issued. This causes problem with Wire.endTransmission(false) returning an accurate error code, since the operation is queue for later execution it cannot return success (0) as its result. So, to indicate that the operation is queued, Wire.endTransmission(false) always returns 7(I2C_ERROR_CONTINUE). to indicate the operation is pending.

standard Arduino code for using ReStart is similar to this:

Wire.beginTransmission(id);
Wire.write(registerAddress);
uint8_t err =Wire.endTransmission(false); // ReSTART
if(err != 0){// endTransmission failed
  // bad stuff
}
else { //success
  err= Wire.requestFrom(id,10);
  while(Wire.available()){
    Serial.printf("0x%02 ",Wire.read());
  }
Serial.println();
}

to be compatible with the ESP32, the if() if(err != 0) need to be changed to if(err !=7) because of the ReSTART

Chuck.

stickbreaker commented 6 years ago

@cyberman54 @bartoszbielawski Just posted my updated i2c.c as a pull request #1665. Grab it and replace esp32-hal-i2c.c in any of the rc's RC1,RC2, or RC3.

Chuck.

helmut64 commented 6 years ago

Hi Chuck, I have seen in the Adafruit_Si7021 multiple Wire.endTransmission(false), without a reason for the false. I removed the false parameter and the I2C works again.

I will continue testing with your patch.

helmut64 commented 6 years ago

Update: The new Pull request #1665 does not work with the Adafruit_Si7021 I2C driver (no Wire.endTransmission false used). Without this patch it works (without the Wire.endTransmission false parameter).

stickbreaker commented 6 years ago

@helmut64 are you seeing any error messages? Post a link to the library you are using. I'll look through it.

Chuck.

helmut64 commented 6 years ago

The Adafruit_Si7021 is very simple, the code is here: https://github.com/adafruit/Adafruit_Si7021 PS: When I switch your code to Debug-Lebel Verbose there are problems in the log_e code and it does not compile. Helmut

stickbreaker commented 6 years ago

@helmut64 Post the compile errors, I can't fix what I can't see.

Looking through that si7021 library, I can see why it doesn't work. Original AdaFruit code

float Adafruit_Si7021::readHumidity(void) {
  Wire.beginTransmission(_i2caddr);
  Wire.write((uint8_t)SI7021_MEASRH_NOHOLD_CMD);
  Wire.endTransmission(false);
  delay(25);

  Wire.requestFrom(_i2caddr, 3);
  uint16_t hum = Wire.read();
  hum <<= 8;
  hum |= Wire.read();
  uint8_t chxsum = Wire.read();

  float humidity = hum;
  humidity *= 125;
  humidity /= 65536;
  humidity -= 6;

  return humidity;
}

This code will not react as assumed. The i2c subsystem will queue the Wire.endTransmission() delay for 25ms then do the Wire.requestFrom() which will return a NAK, without any data being transferred.

  Wire.endTransmission(false);
  bool done = false;
  uint32_t timeout = millis();
  while((millis()-timeout<1000)&&( !done)){ // 1sec timeout
    done = (Wire.requestFrom(_i2caddr, 3) ==3);
    delay(2); // give other tasks a bit of time while sensor samples
  }
  if (done){
    uint16_t hum = Wire.read();
    hum <<= 8;
    hum |= Wire.read();
    uint8_t chxsum = Wire.read();

    float humidity = hum;
    humidity *= 125;
    humidity /= 65536;
    humidity -= 6;

    return humidity;
  }
  else { // error
    return -1.0; // error    
  }
}

This is only one problem, there are more, when I get a chance I'll go through the rest of the library.

Chuck.

helmut64 commented 6 years ago

The Compiler problem you see only when you switch within the Arduino IDE the menu the "Core Debug-Level" level to Verbose and click on compile.

stickbreaker commented 6 years ago

@helmut64 It compiles without any errors on my system using Arduino 1.8.5, Core Debug Level NONE -> VERBOSE .

I'm not seeing any problems, capture the Arduino's compile window. Without it I can't do anything.

Chuck.

stickbreaker commented 6 years ago

@helmut64 I modified Adafruit's si7021 library here. It should now work with esp32. It compiles, but I don't have one of these sensors, so it may not actually function. Try it an report any problems.

Chuck.

stickbreaker commented 6 years ago

@cyberman54 If this fix works for you, close this issue. me-no-dev just merged it into the main repo. So, lets see if any new errors crop up.

Chuck.

cyberman54 commented 6 years ago

@stickbreaker meanwhile i did some tests with my app, comparing running it unter Espressif 1.0.2 and Espressif 1.1.2 including latest pull requests, also yours. Result is:

So, result is that i will continue sticking to Espressif core 1.0.2, until i get rid this performance hit :-(

cyberman54 commented 6 years ago

@stickbreaker i did another test and took out the display driver from my app, so that no i2c communication is used. Result is, that the performance issues disappear, i.e. LED blinking is back to normal. This may be a hint that there is still a problem in the i2c subsystem, maybe some sort of interrupt cascade. But i am far away from being sure, because i have only very limited debug options yet.

stickbreaker commented 6 years ago

@cyberman54 can you post your display driver code? I need something to test.

ssd1306_128x64_i2c.zip SSD1306.zip

The SSD1306.zip is a modified AdaFruit_SSD1306 that only sends changed bits to the display. The ssd1306_128x64_i2c.zip has some performance monitoring in it. It needs Adafruit_GFX, and AdaFruit_SSD1306. You can change the comments to switch to my SSD1306 to see the difference.

One of the tests "testStar()" takes 3610ms using Adafruit_SSD1306 only 865ms with SSD1306.

Have you went through your U8X8_SSD1306_128X64_NONAME_HW_I2C verifying that it handles ReSTARTS correctly? The esp32 hardware handles ReSTARTS differently than AVR (standard Arduino). Possibly the library is having issues.

Chuck.

cyberman54 commented 6 years ago

@stickbreaker i'm using u8x8.cpp from the u8g2 library. Did not write any display driver code on my own.

stickbreaker commented 6 years ago

@cyberman54 can you give me a basic sketch that uses this library? Looking through this code is a nightmare. I need a place to start.

Chuck.

helmut64 commented 6 years ago

The Updated SI7021 Sensor library works, I understand your while loop around the Wire.requestFrom(). However I believe it still has problems:

Thank you for helping out here. Helmut

stickbreaker commented 6 years ago

@helmut64 my timing was taken from here: si7021_timing And the NAK protocol from here: si7021_nak

I don't know what you meant here:

The read sensor moden and revision function is gone. (API incompatibility).

I don't have one of these sensors, so I am basing my code on my understanding of the datasheet. I guess, if my code doesn't fit your purpose you'll have to write better code :grinning:

If you have a library that produces accurate values, just modify it to work with the esp32.

Chuck.

helmut64 commented 6 years ago

Chuck, you do a great job, and I love to see better I2C support, thank you for your support.

I was talking about the missing functions:

Adafruit_Si7021::readRevision and the variable “model”.

I will do further testing and integrate your changes.

You need a set of our ESP32 ECO Power boards with the sensor, please contact me (via the radioshuttle.de website) with your contact details. Helmut

stickbreaker commented 6 years ago

@helmut64 just tried the contact link from your website

This message was created automatically by the mail system (ecelerity).

A message that you sent could not be delivered to one or more of its recipients. This is a permanent error. The following address(es) failed:

XXXXXXXX@radioshuttle.de (after RCPT TO): 550 5.1.1 XXXXXXXX@radioshuttle.de: Recipient address rejected: >User unknown in virtual alias table

substituted "XXXXXXXX" to protect you address

Chuck.

helmut64 commented 6 years ago

@stickbreaker lora@radioshuttle.de should work.

helmut64 commented 6 years ago

Hi Chuck, I have another problem with I2C using the Rodan DS3231 RTC driver. The driver is not great, no C++, however it works with D21 MCUs and well with AVRs, it was working with the ESP32 until your patch. https://github.com/helmut64/ds3231/tree/ESP_ECO_POWER_RTC

There is no endTransmission(false) in this, I plan to use a scope to trace what is going on. Maybe you see something and can point to it. Regards Helmut

stickbreaker commented 6 years ago

@helmut64 that is the email i used. I'll try it again.

On the Rtc, I'll look at it tomorrow or saturday. Today was 20+ hrs I'm blitzed.

Chuck

stickbreaker commented 6 years ago

@helmut64 got it.