adafruit / circuitpython

CircuitPython - a Python implementation for teaching coding with microcontrollers
https://circuitpython.org
Other
4.08k stars 1.2k forks source link

busio.I2C issues on ESP32-S3: LC709203F #6311

Closed kattni closed 11 months ago

kattni commented 2 years ago

CircuitPython version

Adafruit CircuitPython 7.3.0-beta.1 on 2022-04-07; Adafruit Feather ESP32S2 with ESP32S2

Code/REPL

import time
import board
import bitbangio
from adafruit_lc709203f import LC709203F, PackSize

i2c = bitbangio.I2C(board.SCL, board.SDA, frequency=250000, timeout=1000000)
battery_monitor = LC709203F(i2c)

battery_monitor.pack_size = PackSize.MAH400

while True:
    print("Battery Percent: {:.2f} %".format(battery_monitor.cell_percent))
    print("Battery Voltage: {:.2f} V".format(battery_monitor.cell_voltage))
    time.sleep(2)

Behavior

Fails on line 9 with the following:

code.py output:
Traceback (most recent call last):
  File "code.py", line 9, in <module>
  File "adafruit_lc709203f.py", line 119, in __init__
  File "adafruit_lc709203f.py", line 164, in power_mode
  File "adafruit_lc709203f.py", line 247, in _write_word
OSError: [Errno 5] Input/output error

Description

frequency and timeout have no effect on the behavior. It was already crashing consistently after a short period of time with busio and board.I2C(), or straight up not seeing the chip (even though I2C scan sees it). So we wanted to try bitbang I2C. bitbangio crashed more quickly than busio did.

After chatting with @dhalbert about it, as per request, I am filing a bug here.

Additional information

No response

mrdalgaard commented 2 years ago

Could this be related? https://github.com/adafruit/Adafruit_CircuitPython_LC709203F/issues/3 Perhaps try bitbangio with a very low frequency to see if the result is the same as in above bug

dhalbert commented 2 years ago

I cannot reproduce this with busio.I2C() on a Feather ESP32-S2, either with 7.3.0-beta.1 or the pending https://github.com/adafruit/circuitpython/pull/6366 I2C fix. busio works fine. It's true that bitbangio gives the error above. I have been running for several minutes during each trial. I tested with only USB, with USB and an attached battery, and with a battery only, sending output to the default UART.

import time
import bitbangio
import board
import busio
from adafruit_lc709203f import LC709203F, PackSize

uart = board.UART()
battery_monitor = LC709203F(board.I2C())

battery_monitor.pack_size = PackSize.MAH400

while True:
    print("Battery Percent: {:.2f} %".format(battery_monitor.cell_percent))
    print("Battery Voltage: {:.2f} V".format(battery_monitor.cell_voltage))

    uart.write(b"Battery Percent: {:.2f} %\r\n".format(battery_monitor.cell_percent))
    uart.write(b"Battery Voltage: {:.2f} V\r\n".format(battery_monitor.cell_voltage))

    time.sleep(2)
dhalbert commented 2 years ago

Tested further. These all work with #6366. Did not try with 7.3.0-beta.1

ESP32-S3 Feather works sometimes for a while at 10kHz I2C clock. At 100kHz, seems to fail immediately. Saleae trace with working reads, and then last is failure: Feather-ESP32-S3-LC709203F-10kHz.sal.zip

dhalbert commented 2 years ago

We are also seeing issues with MCP960x, which does clock stretching and repeated start. Before #6366, I see Input/output error, as above. After #6366, I am more likely to see the sensor not even appear in a scan.

EDIT: MCP960x problem is a sensor problem, not related to this. Fixed for MCP960x by https://github.com/adafruit/Adafruit_CircuitPython_MCP9600/pull/20.

dhalbert commented 2 years ago

For LC709203F, I am seeing bad handling of a clock stretch. The top trace is ESP-S3, the bottom trace is ESP-S2. The first four I2C transactions are the same. Then there is a clock stretch, which works on the S2, but not the S3. Note that SDA goes high after a relatively short time in the top (S3, bad) trace.

image

dhalbert commented 2 years ago

I am going to ignore the S2 bitbangio issue for now. The LC709203F does clock stretching for several ms at a time, which seems to be an issue for the S3. I cannot reproduce the problem using https://github.com/adafruit/Adafruit_LC709203F/tree/master/examples/LC709203F_demo and removing the delay in loop(). I see clock stretches in that case, but they are handled properly. So there's something different about CircuitPython, perhaps again the multiple tasks we are running.

dhalbert commented 2 years ago

I exercised this problem in a different way (causes a crash) in a simple ESP-IDF program on several S3 boards, and reported it comprehensively in https://github.com/espressif/esp-idf/issues/8894#issuecomment-1125564756.

dhalbert commented 2 years ago

Pushing this to 7.x.x because it seems like an ESP-IDF issue.

dhalbert commented 2 years ago

I tested with the tip of https://github.com/espressif/esp-idf/tree/release/v4.4, which is later than 4.4.2, released two weeks ago, and I am still seeing this problem, unfortunately.

dhalbert commented 1 year ago

Tested again after #7023, which updates ESP-IDF to a very recent release/v4.4 commit. Still doesn't work. :frowning_face:

Gingertrout commented 1 year ago

Just throwing my 2 cents in, I have tried two STEMMA-QT boards on my new Feather ESP32-S3 and both report the same issue as well (output for my BNO055 9DoF)

import board
import adafruit_bno055

i2c = board.STEMMA_I2C()
sensor = adafruit_bno055.BNO055_I2C(i2c)
0;🐍Wi-Fi: off | code.py | 8.0.0-beta.1Traceback (most recent call last):
  File "code.py", line 4, in <module>
RuntimeError: No pull up found on SDA or SCL; check your wiring

Occasionally, after a reset of the Feather, it will get past board.STEMMA_I2C and fail at the adafruit_bno055:

0;🐍Wi-Fi: off | code.py | 8.0.0-beta.1
<I2C>
Traceback (most recent call last):
  File "code.py", line 5, in <module>
  File "adafruit_bno055.py", line 791, in __init__
  File "adafruit_bno055.py", line 224, in __init__
  File "adafruit_bno055.py", line 802, in _read_register
OSError: [Errno 116] ETIMEDOUT
BeatArnet commented 1 year ago

Recently I found a hack on a website (sorry - I forgot the name) which works fine in my environment: (EDIT by @dhalbert: formatted code)

# Der following code to scan the I2C-Adresses is necessary
i2c = board.I2C()
while not i2c.try_lock():
    pass
running = True
try:
    while running:
        print(
            "I2C addresses found:",
            [hex(device_address) for device_address in i2c.scan()],
        )
        # time.sleep(2)
        running = False

finally:  # unlock the i2c bus when ctrl-c'ing out of the loop
    i2c.unlock()

# Create sensor object, using the board's default I2C bus.
battery_monitor = LC709203F(board.I2C())

# Update to match the mAh of your battery for more accurate readings.
# Can be MAH100, MAH200, MAH400, MAH500, MAH1000, MAH2000, MAH3000.
# Choose the closest match. Include "PackSize." before it, as shown.
battery_monitor.pack_size = PackSize.MAH2000

# print("LC709203F thermistor test")
# print("Make sure a thermistor is connected to the board!")
# print("Make sure LiPoly battery is plugged into the board!")
# check your NTC thermistor datasheet for the appropriate B-Constant
battery_monitor.thermistor_bconstant = 3950
battery_monitor.thermistor_enable = True
# print("IC version:", hex(battery_monitor.ic_version))
# print("Battery Percent: {:.2f} %".format(battery_monitor.cell_percent))
# print("Battery Voltage: {:.2f} V".format(battery_monitor.cell_voltage))
mopore commented 1 year ago

Sali @BeatArnet and many thanks! This worked for me :clap:

I also formatted your example a bit:

# See: https://github.com/adafruit/Adafruit_CircuitPython_LC709203F
# Ensure to have "adafruit_lc709203f" installed (circup install adafruit_lc709203f)
import time
import board
from adafruit_lc709203f import LC709203F, PackSize

BATTERY_MON_ADDRESS = 0x0B  # Address for ESP32-s3 tft Feather

def hack_for_i2c_issue():
    i2c = board.I2C()
    while not i2c.try_lock():
        pass
    running = True
    try:
        while running:
            print("I2C addresses found:", 
                [hex(device_address) for device_address in i2c.scan()]
            )
            # time.sleep(2)
            running = False
        return i2c
    finally:  # unlock the i2c bus when ctrl-c'ing out of the loop
        i2c.unlock()

def main() -> None:

    try:
        print("Searching for battery monitor...")

        # There is a known issue with the ESP32-S3: 
        # https://github.com/adafruit/circuitpython/issues/6311
        # Therefore this will not work -> board.I2C()
        i2c = hack_for_i2c_issue()
        battery_monitor = LC709203F(i2c)

        # Update to match the mAh of your battery for more accurate readings.
        # Can be MAH100, MAH200, MAH400, MAH500, MAH1000, MAH2000, MAH3000.
        # Choose the closest match. Include "PackSize." before it, as shown.
        battery_monitor.pack_size = PackSize.MAH100  # type: ignore

        while True:
            print(f"Battery Percent: {battery_monitor.cell_percent:.2f} %")
            print(f"Battery Voltage: {battery_monitor.cell_voltage:.2f} V")
            time.sleep(2)

    except Exception as e:
        print(f"Error reading battery: {e}")
        print("All done.")

if __name__ == "__main__":
    main()
BeatArnet commented 1 year ago

Thanks a lot for the formatting!

dhalbert commented 1 year ago

I have reproduced this, and am working on some changes to the https://github.com/adafruit/Adafruit_CircuitPython_LC709203F library to do retries and delays in various places to minimize the throwing of errors. It was interesting that the Arduino library seemed to work OK, but upon further inspection, it appears that is due to the Arduino library and example program largely ignoring many errors that occur intermittently (!). Those errors cause intermittent bad data values but the Arduino code simply plows ahead.

I think are deifnitely issues with S3 I2C, but I think we can work around them with library changes

ThomasAtBBTF commented 1 year ago

Just for your information, I also had my share of issues with the build in I2C library. Since I switched to bitbangio.I2C all works fine... btw: I did not observe my problems with the LC709203F lib but with the INA219 and the AW9523 GPIO expander libs.

I am happy to help testing....

ThomasAtBBTF commented 1 year ago

Actually this test-setup works fine with the build in board.I2C() image

dhalbert commented 1 year ago

Is that an ESP32-S3 board?

ThomasAtBBTF commented 1 year ago

With this code: image

ThomasAtBBTF commented 1 year ago

Is that an ESP32-S3 board?

Yes, a Feather S3 from Unexpected Maker.

dhalbert commented 1 year ago

Please paste code as text if you can, thanks.

ThomasAtBBTF commented 1 year ago

(EDIT: added code formatting (three backticks))

print("hello battgauge")
import sys
import time
import board
import bitbangio
import supervisor
from adafruit_lc709203f import LC709203F

# i2c = bitbangio.I2C(board.SCL, board.SDA)
i2c = board.I2C()
if not i2c:
    print("init I2C went wrong!")
else:
    print("init I2C ok")
    while not i2c.try_lock():
        pass
    print("I2C locked!")
    # Find the first I2C device available.
    devices = i2c.scan()
    i2c.unlock()
    print("I2C unlocked!")
    for device in devices:  
        print(hex(device))
    print(devices)
    sensor = LC709203F(i2c)
    print("IC version:", hex(sensor.ic_version))
    i = 0
    while True:
        i += 1
        print(f"{i:5} Battery: {sensor.cell_voltage:0.3f} Volts / {sensor.cell_percent:0.1f}%")
        time.sleep(1)
ThomasAtBBTF commented 1 year ago

image

And it actually worked 5333 seconds without a error!

ThomasAtBBTF commented 1 year ago

image

This is the CP-Version I am using.

dhalbert commented 1 year ago

See https://github.com/adafruit/Adafruit_CircuitPython_LC709203F/pull/21 for a proposed workaround for this problem.

dhalbert commented 1 year ago

Pushing this forward to 8.x.x because I don't see an upstream fix in sight.

dhalbert commented 1 year ago

A kind user in another issue pointed me to this S3 I2C fix: https://github.com/espressif/esp-idf/issues/9777#issuecomment-1359582249. I'll test this soon.

dhalbert commented 1 year ago

Moved to 8.0.0 to test the patch mentioned in the previous comment.

BeatArnet commented 1 year ago

Dear Dan Perfect, now it works as expected: A wake up and a measurement every 2 minutes. Thanks a lot for your help! Beat

dhalbert commented 1 year ago

@BeatArnet I haven't yet put the C patch into the 8.0.0 (main) line. Could you describe what is now working for you? Thanks. Are you using 8.0.0-beta.6 with the latest version of the LC709203F library, and that is working well? I had added some retry logic to the library.

BeatArnet commented 1 year ago

I only installed the latest 8.0.0-beta.6 without changes to my code and without the latest LC709203F library. I shall test that later.

dhalbert commented 1 year ago

I tried the https://github.com/espressif/esp-idf/issues/9777#issuecomment-1359582249 patch and it did not make the S3 act like the S2, which does not need delays or retries. :frowning_face:

BeatArnet commented 1 year ago

After installing the latest version of adafruit_lc709203f I got sporadic errors (about every 4th to 8th time). Therefore, I continue to work with the hack, which works wonderfully for days.

dhalbert commented 1 year ago

After installing the latest version of adafruit_lc709203f I got sporadic errors (about every 4th to 8th time). Therefore, I continue to work with the hack, which works wonderfully for days.

Hi - The library change was meant to imitate the hack you mentioned. It scans for the device multiple times to wake it up. Could you post the code you are using that is running for days, and the code you tried with the latest library that is _not_working consistently? I am interested in what is different. Thanks.

BeatArnet commented 1 year ago

Working fine:

from adafruit_lc709203f import LC709203F, PackSize
BATTERY_MON_ADDRESS = 0x0B  # Address for ESP32-s3 tft Feather

def hack_for_i2c_issue():
    i2c = board.I2C()
    while not i2c.try_lock():
        pass
    running = True
    try:
        while running:
            print("I2C addresses found:", 
                [hex(device_address) for device_address in i2c.scan()]
            )
            # time.sleep(2)
            running = False
        return i2c
    finally:  # unlock the i2c bus when ctrl-c'ing out of the loop
        i2c.unlock()

try:
    print("Searching for battery monitor...")
    i2c = hack_for_i2c_issue()
    battery_monitor = LC709203F(i2c)   
    battery_monitor.pack_size = PackSize.MAH2000  # type: ignore
    print(f"Battery Percent: {battery_monitor.cell_percent:.2f} %")
    print(f"Battery Voltage: {battery_monitor.cell_voltage:.2f} V")
    time.sleep(1)

Produces sporadic errors:

from adafruit_lc709203f import LC709203F
i2c = board.I2C()  
battery_monitor = LC709203F(i2c)
battery_monitor.pack_size = PackSize.MAH2000  # type: ignore
dhalbert commented 1 year ago

@BeatArnet Thanks -- these are single-reading examples, but I'd assume you're reading in a loop? Could you post the whole program(s)? Uploading in a .zip is fine if large.

dhalbert commented 1 year ago

btw, if you upgraded the library, make sure the only copy is in lib/. A copy at the top-level in CIRCUITPY will override what is in lib/, and a .py version in /lib will override a .mpy version.

BeatArnet commented 1 year ago

I used the library "adafruit-circuitpython-bundle-8.x-mpy-20230105.zip" The code is in the attached file 3Sensors_MQTT with deepsleep.zip

dhalbert commented 1 year ago

Recently I found a hack on a website (sorry - I forgot the name) which works fine in my environment:

@BeatArnet I would like to track this down to read about this technique in context. Do you remember if the author did the original coding in CircuitPython, MicroPython, regular Python (e.g., on RPi), or Arduino, or something else? Thanks.

BeatArnet commented 1 year ago

Sorry, I couldn't find the author of the code. But I did not even have to adapt the code.

dhalbert commented 1 year ago

Are you saying that the code you took is exactly as you wrote in https://github.com/adafruit/circuitpython/issues/6311#issuecomment-1295812087 ? If so, then I will do a websearch for that.

BeatArnet commented 1 year ago

Same code, but I did not find on the Web…

VictorioBerra commented 1 year ago

Came here from Google, brand new ESP32-S3 here, just hit this problem.

Should I just wrap the call in a try/catch? Would that work?

cherrydev commented 1 year ago

I would suggest to people that they attempt to add an additional pull-up resistor in order to solve this issue. I've used my analog scope to monitor the I2C bus in the default configuration and the rise time for the I2C signals are very much out of spec (rise time is 2.5us & spec max is 1us). While the documentation for the FeatherS3 indicates a 5k pull-up, looking at the schematic seems to show that it's actually 10k (though I could be reading it wrong). In any case, the I2C bus behaves much more reliably if you add a 2k resistor between the 3.3V pin and each of the I2C pins.

dhalbert commented 1 year ago

@cherrydev Interesting. The LC709203F datasheet shows 10k as an example, so theoretically it should be OK. It works fine on ESP32-S2, but perhaps there is something hw or sw-related on S3 that causes the slow rise time. Have you looked at the rise time on an S2 board?

cherrydev commented 1 year ago

@dhalbert I do not have an S2 board to test it on. In the past, though, I've found a lot of I2C problems to go away with stronger pull-up, so it might be worth a try. While 10k might theoretically be fine, it also depends on details of the host's I2C circuit, including capacitance, which I only know how to measure with the board unpowered, which might not give an accurate measurement due to the ESP's GPIO MUX. I can test an original ESP32 using a 10k pullup and see if I get similarly slow rise times.

DJDevon3 commented 1 year ago

Got an error on the S3 using the LC709203F, might be related?

image0

Error in my script is currently line 407 in this script

As suggested by anecdata I've now added into my S3 weather station script a try/except around it.

i2c = board.I2C()
battery_monitor = LC709203F(board.I2C())
battery_monitor.thermistor_bconstant = 3950
battery_monitor.thermistor_enable = True

try:
        vbat_label.text = f"{battery_monitor.cell_voltage:.2f}"
    except OSError as e:
        print("OSError:", e)

It generally takes about 10-12 hours for the read failure to occur while polling every 15 minutes. I'm not using any power mode switching code, just straight to DisplayIO TFT. Will let this run for a while and report back.

dhalbert commented 1 year ago

@DJDevon3 Yes, this seems like the same issue.

DJDevon3 commented 1 year ago

Ran it with the try/except for 24 hours successfully. The device rebooted when I unplugged my multi-card reader. Windows must have gone around and touched every other USB device because the S3 weather station rebooted. That reset the clock back to 0 for me in looking into this issue. 24 hours without a single error is definitely an improvement as before it was about 10-12 hours reliably failing two days in a row. Granted I've only had the feather weather station up and running for 2 days since having it shutoff for months.

There is nothing in REPL showing the OSError print. Seems like a simple try/except might be a good workaround or I simply didn't run into it for the past 24 hours?

===============================
Failed to get data, retrying
 Sending request failed
USB Sense:  True
Attempting to GET Weather!
===============================

The only error I got over the past 24 hours was a wifi failure which failed gracefully, recovered, and reconnected. Not a single OSError so I don't think it even triggered. If it shows up it'll be directly below USB Sense: True. Not a single entry in my logs.

Because I run this every single day right in front of my main monitor, polling every 15 minutes, I'll definitely pickup any errors. Keeping Mu open to track the serial output. Will keep an eye on it this week and report back if I have more data to contribute.

DJDevon3 commented 1 year ago

Running 5 days straight without a single error using try/except.

try:
        vbat_label.text = f"{battery_monitor.cell_voltage:.2f}"
    except OSError as e:
        print("OSError:", e)

That's all that is required now as far as I can tell.