PiSupply / PiJuice

Resources for PiJuice HAT for Raspberry Pi - use your Pi Anywhere
https://uk.pi-supply.com/collections/pijuice/products/pijuice-portable-power-raspberry-pi
GNU General Public License v3.0

I2C transfer timed out issue #634

Open rene54321 opened 3 years ago

rene54321 commented 3 years ago

Hi,

I am using a freshly installed RPi4B with your PiJuice and get the following failures in dmesg:

[17565.248606] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[17570.368665] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[17584.288837] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[30585.249280] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[30590.369359] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[30603.489521] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[42248.352864] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[42253.472928] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[42267.393102] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[42276.513222] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[58842.957253] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[58848.077322] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[58861.997473] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[58886.157754] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[58891.277827] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[58905.198042] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[69088.764142] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[69093.884238] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[69107.804378] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[69116.924509] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[69276.686472] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[69281.806547] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[69295.726700] i2c-bcm2835 fe804000.i2c: i2c transfer timed out

Is this a known bug? Do you have any idea how to fix it?

best regards René
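Editor's sketch: the timeouts in the dmesg output above arrive in short bursts separated by long quiet periods. The following Python snippet is one possible way to measure that, assuming the log has been saved as text. The function names, the 60-second burst threshold, and the `SAMPLE` excerpt (the first seven lines of the log above) are illustrative choices, not part of any PiJuice or kernel tool.

```python
import re

# Matches kernel lines such as:
# [17565.248606] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
TIMEOUT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+i2c-bcm2835 \S+ i2c transfer timed out"
)


def timeout_times(dmesg_text):
    """Return the kernel timestamps (seconds since boot) of all I2C timeouts."""
    return [float(m.group(1)) for m in TIMEOUT_RE.finditer(dmesg_text)]


def group_bursts(times, max_gap=60.0):
    """Group time-ordered timestamps into bursts.

    A timestamp joins the current burst when it is at most max_gap seconds
    after the previous one (60 s is an arbitrary assumption).
    """
    bursts = []
    for t in times:
        if bursts and t - bursts[-1][-1] <= max_gap:
            bursts[-1].append(t)
        else:
            bursts.append([t])
    return bursts


SAMPLE = """\
[17565.248606] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[17570.368665] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[17584.288837] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[30585.249280] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[30590.369359] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[30603.489521] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[42248.352864] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
"""

bursts = group_bursts(timeout_times(SAMPLE))
for previous, current in zip(bursts, bursts[1:]):
    # Interval between the starts of consecutive bursts.
    print(f"{current[0] - previous[0]:.0f} s between bursts")
```

Feeding it the full output of dmesg instead of `SAMPLE` gives the spacing between all bursts, which helps when correlating failures with whatever the system was doing at the time.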

rene54321 commented 3 years ago

i2cdetect runs very fast and gives the following result:

openhabian@openHABian-RPi4:~ $ i2cdetect -y 1
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:          -- -- -- -- -- -- -- -- -- -- -- -- --
10: -- -- -- -- 14 -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- UU -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --
openhabian@openHABian-RPi4:~ $

tvoverbeek commented 3 years ago

What is the firmware version in your PiJuice? 1.3 has issues with the RPi4B. The I2C code in the firmware has been replaced in 1.4 and should work fine with the RPi4B.

rene54321 commented 3 years ago

Hi,

I am using the latest FW 1.4B for the PiJuice and get these failures.

shawaj commented 3 years ago

@rene54321 there isn't a 1.4B firmware?

Guessing you are using openhabian from the output above?

From googling, it seems that there are lots of I2C issues with this.

Please can you try with standard Raspberry Pi OS?

shawaj commented 3 years ago

@rene54321 also are you using some other devices connected to the Pi as well as the PiJuice?

rene54321 commented 3 years ago

Hi,

Sorry, I am using the 1.4 firmware. In my case I need to use openhabian, because the main smart home system is openHAB, and I use the wonderful PiJuice to keep it running during a power outage.

best regards René

rene54321 commented 3 years ago

Hi,

I am using a display and a PIR sensor; neither of them uses an I2C connection.

best regards René

shawaj commented 3 years ago

@rene54321 please provide further details of the display and the PIR and how they are connected. There could be a clash.

Also, the reason I suggested trying Raspberry Pi OS is for testing purposes: if it works OK there, the issue is with openhabian and they will need to fix it on their side.

rene54321 commented 3 years ago

Hi sure:

Display connection:

Display | Raspberry
LED | GPIO23
SCK | GPIO11
SDI | GPIO10
D/C | GPIO24
RESET | GPIO25
CS | GPIO08
GND | Ground
VCC | 3.3v

PIR: GPIO 18

FAN: 3.3V

In addition to the I2C timeout, I receive the following errors:

[95411.328773] i2c-bcm2835 fe804000.i2c: i2c transfer timed out
[95411.328800] rtc-ds1307 1-0068: write error -110
[95413.023261] rtc-ds1307 1-0068: read error -121
[95413.329177] rtc-ds1307 1-0068: write error -5
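Editor's sketch: the negative numbers in the rtc-ds1307 lines are Linux kernel error codes. On Linux, 5 is EIO, 110 is ETIMEDOUT and 121 is EREMOTEIO. The small Python decoder below hard-codes just those three values, so it does not depend on the platform it runs on; the table and function name are illustrative only.

```python
import re

# Linux errno values seen in this thread (asm-generic numbering).
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    110: ("ETIMEDOUT", "Connection timed out"),
    121: ("EREMOTEIO", "Remote I/O error"),
}

ERROR_RE = re.compile(r"(read|write) error (-?\d+)")


def decode(line):
    """Return (operation, errno name, description) for an RTC error line.

    Returns None when the line contains no "read error" or "write error".
    """
    match = ERROR_RE.search(line)
    if match is None:
        return None
    code = abs(int(match.group(2)))
    name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    return match.group(1), name, text


for entry in (
    "[95411.328800] rtc-ds1307 1-0068: write error -110",
    "[95413.023261] rtc-ds1307 1-0068: read error -121",
    "[95413.329177] rtc-ds1307 1-0068: write error -5",
):
    print(decode(entry))
```

Read this way, the first RTC write timed out together with the bus transfer, and the following accesses failed with I/O errors.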

tvoverbeek commented 3 years ago

@rene54321 Your i2cdetect output shows only the PiJuice connected. The display uses SPI (GPIOs 10 and 11).

Do you specify the ds1307 RTC in /boot/config.txt? If so, you should set the PiJuice ID-EEPROM address to 0x52; otherwise you get a conflict with the ds1339 RTC specified in the device-tree fragment of the PiJuice EEPROM. Also, which kernel is openhabian using (output of uname -a)?

rene54321 commented 3 years ago

Hi,

Yes, this is part of config.txt:

# load RTC module
dtoverlay=i2c-rtc,ds1339

I use the following kernel: Linux openHABian-RPi4 5.4.83-v7l+ #1379 SMP Mon Dec 14 13:11:54 GMT 2020 armv7l GNU/Linux

tvoverbeek commented 3 years ago

OK. That is also the most recent kernel for Raspberry Pi OS. You are probably aware that the RPi lowers its clocks when inactive and raises them when the CPUs are loaded. This also happens to the I2C clock, and these I2C clock variations may cause I2C errors. You can try adding the line force_turbo=1 to /boot/config.txt. This ensures that all clocks run at a constant (maximum) frequency. See if that decreases the number of I2C errors.

pelwell commented 3 years ago

Locking the core clock to a fixed value will have the same effect, but with lower power consumption and less increase in CPU temperature than force_turbo. For Pi 4:

core_freq=500
core_freq_min=500

UPDATE: While typing this, the Pi 4 with a PiJuice under test just got a read error from the RTC, followed by i2c transfer timed out. This was with core_freq=500 and core_freq_min=500.
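Editor's sketch: when trying several of these settings in turn, it is easy to lose track of which ones are currently active. The Python helper below lists the clock-related keys mentioned in this thread (force_turbo, core_freq, core_freq_min) and any dtoverlay lines found in the text of a config.txt. The parsing is deliberately simplified and is an assumption, not the firmware's real parser: it drops everything after a `#`, skips section filters such as `[pi4]`, and ignores include files.

```python
CLOCK_KEYS = ("force_turbo", "core_freq", "core_freq_min")


def clock_settings(config_text):
    """Return (clock settings dict, list of dtoverlay values).

    Simplified parsing: everything after '#' is dropped, section filters
    in square brackets are skipped, and later duplicates overwrite
    earlier ones.
    """
    found = {}
    overlays = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in CLOCK_KEYS:
            found[key] = value
        elif key == "dtoverlay":
            overlays.append(value)
    return found, overlays


# Hypothetical config.txt excerpt combining settings quoted in this thread.
SAMPLE_CONFIG = """\
# load RTC module
dtoverlay=i2c-rtc,ds1339
core_freq=500
core_freq_min=500
#force_turbo=1
"""

settings, overlays = clock_settings(SAMPLE_CONFIG)
print(settings)
print(overlays)
```

On a real system the text would come from reading /boot/config.txt; the commented-out force_turbo line in the sample is intentionally not reported.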

pelwell commented 3 years ago

My test script is:

while true; do
    sudo hwclock -r
    vcgencmd measure_clock core
    sleep 0.5
done

(the idea being to boost the number of I2C accesses in the hope of hitting the problem sooner)
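Editor's sketch: a Python variant of the same idea that also counts failed reads, which may make it easier to compare settings. It assumes that `hwclock -r` exits with a non-zero status when the RTC read fails; the function names and defaults are illustrative, and the reader function is injectable so the loop can be tried without hardware.

```python
import subprocess
import time


def read_rtc(cmd=("sudo", "hwclock", "-r"), timeout=10):
    """Read the RTC once; return True on success.

    Assumption: hwclock signals a failed read with a non-zero exit status.
    A command that hangs longer than `timeout` seconds counts as a failure.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def stress(read_once=read_rtc, iterations=100, delay=0.5):
    """Call read_once repeatedly and return the number of failed reads."""
    failures = 0
    for _ in range(iterations):
        if not read_once():
            failures += 1
        time.sleep(delay)
    return failures


# Example on a Pi with an RTC attached (not executed here):
#     print(stress(iterations=1000, delay=0.5), "failed reads")
```

Passing a stub such as `lambda: True` for `read_once` exercises the counting logic on a machine without an RTC.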

rene54321 commented 3 years ago

Hi, I have changed the EEPROM address to 0x52, but I still get the errors:

[20275.755407] i2c-bcm2835 fe804000.i2c: i2c transfer timed out

pelwell commented 3 years ago

I've just caught a failure in a logic analyser.

tvoverbeek commented 3 years ago

@pelwell if you have the cli or gui open on the RTC tab there might be a conflict between the pijuice service accessing the RTC and the OS accessing the RTC simultaneously.

pelwell commented 3 years ago

I don't have the CLI or GUI open - FYI, I only did a CLI install.

pelwell commented 3 years ago

This is the start of a normal read of the RTC: pijuice_i2c_good

This is it failing: pijuice_i2c_bad_1

The SDA1 went low way back here: pijuice_i2c_bad_2

The delay between SDA1 and SCL1 going low is 555ms: pijuice_i2c_bad_3

What on earth is happening between those two events?
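Editor's sketch: the measurement described above, the time from SDA going low (an I2C START condition) to SCL first going low, can be automated once a capture is available as a list of samples. The Python snippet below assumes a hypothetical sample format of (time in seconds, SDA level, SCL level); it is not the Saleae export format, and the synthetic data only mimics the 555 ms gap reported here.

```python
def start_to_clock_delays(samples):
    """Return the delay from each START condition to the first SCL fall.

    samples: time-ordered iterable of (time_s, sda, scl) with levels 0 or 1.
    A START is SDA falling while SCL stays high. If SDA rises again while
    SCL is still high (a STOP with no clocking), the START is discarded.
    """
    delays = []
    prev_sda, prev_scl = 1, 1  # an idle I2C bus has both lines high
    start_time = None
    for t, sda, scl in samples:
        if start_time is None:
            if prev_sda == 1 and sda == 0 and prev_scl == 1 and scl == 1:
                start_time = t
        elif prev_scl == 1 and scl == 0:
            delays.append(t - start_time)
            start_time = None
        elif prev_sda == 0 and sda == 1 and scl == 1:
            start_time = None
        prev_sda, prev_scl = sda, scl
    return delays


# Synthetic capture: one normal transfer start, then one with a long gap.
CAPTURE = [
    (0.0, 1, 1),       # idle
    (0.001, 0, 1),     # START
    (0.001005, 0, 0),  # SCL falls 5 microseconds later (normal)
    (0.002, 0, 1),     # SCL released
    (0.0021, 1, 1),    # STOP
    (0.5, 0, 1),       # START
    (1.055, 0, 0),     # SCL falls 555 ms later (the failure case)
]

delays = start_to_clock_delays(CAPTURE)
slow = [d for d in delays if d > 0.001]  # 1 ms threshold is arbitrary
print(f"{len(slow)} of {len(delays)} STARTs were followed by a long gap")
```

The sketch only looks at the first clock edge after a START; it does not decode addresses or detect clock stretching in the middle of a transfer.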

tvoverbeek commented 3 years ago

@pelwell Overlapping transactions to 0x14 and 0x68 ???? Both addresses are handled by the same STM32

pelwell commented 3 years ago

Something like that. This is the RTC access n-2: pijuice_i2c_bad_4

And the following 0x14 access: pijuice_i2c_bad_5

Now RTC n-1: pijuice_i2c_bad_6

Notice how it seems to stop mid-way through a read.

What looked like the next 0x14 access also includes the end of the failed 0x68 read: pijuice_i2c_bad_7

The wider view shows that the n-1 block of 0x68 accesses is short: pijuice_i2c_bad_8

pelwell commented 3 years ago

You can download the capture in Saleae's .sal format (192KB) here: https://drive.google.com/file/d/1HMekPX5dxFwg7r6jOnzsNvnuyXWJQm7H/view?usp=sharing

pelwell commented 3 years ago

This trace, with the pijuice service stopped, is clearer: pijuice_i2c_bad_2_1

Is this massive clock stretching, or is the I2C clock being stopped for some other reason?: pijuice_i2c_bad_2_2

.sal file here: https://drive.google.com/file/d/1FyWw-P0WvChzVlzpo1bL2Yq4rrCV-3Dx/view?usp=sharing

pelwell commented 3 years ago

Replacing the PiJuice with a DS3231 shows the same problem, so it isn't down to the PiJuice hardware. It's also looking as though the system has to be busy in some way for it to fail (playing YouTube videos in Chromium works as a trigger), but it's possible that I just need to be more patient.

rene54321 commented 3 years ago

Hi,

I have now tested the RPi4 with the parameter force_turbo=1 in config.txt. I haven't received any failures since this change. From my point of view this is not a good solution, because it wastes energy and raises the temperature of the RPi. Is there any way for the PiJuice team to improve this behaviour?

best regards René

pelwell commented 3 years ago

I can still make it fail with force_turbo=1, and I don't think this is a fault with the PiJuice.

pelwell commented 3 years ago

@rene54321 What is active on your system when it fails?

rene54321 commented 3 years ago

Good question. The failure does not occur all the time, only approximately every 10000 s. How can I check this?

pelwell commented 3 years ago

I meant what function is the Pi performing? My killer app is YouTube, but if yours is largely idle then it will be difficult to work out what was running at that moment.

Have you tried adding dtoverlay=i2c-bcm2708 to config.txt? This switches to the old, downstream I2C driver.

pelwell commented 3 years ago

I've not had a failure since enabling the old i2c-bcm2708 driver. If you want to provoke errors more quickly, run while true; do sudo hwclock -r; done in the background to boost I2C traffic.

pelwell commented 3 years ago

I've pushed a patch to rpi-5.10.y that modifies the interrupt handling in a way that should make it more tolerant to interrupts from the other I2C bus (they all share an IRQ line), delayed interrupts etc. In my testing it's made a massive difference, to the extent that I've not seen a failure yet despite an artificially high I2C load.

rene54321 commented 3 years ago

Hi,

That did not work for me. After the update and changing the settings back to the ondemand CPU governor, the failure appeared again. So I went back to force_turbo=1 for now.

best regards René

f18m commented 6 months ago

Hi @pelwell, @rene54321, I'm having a similar problem: I have a Raspberry Pi 1 Model B+ connected over I2C to an IO-expander HAT (https://sequentmicrosystems.com/products/16-universal-inputs-card-for-raspberry-pi). I poll the IO expander chip at 1 Hz. After some hours of operation my software fails and I start to see a bunch of:

Mar 02 23:35:04 raspberrypi kernel: i2c-bcm2835 20804000.i2c: i2c transfer timed out
Mar 02 23:35:05 raspberrypi kernel: i2c-bcm2835 20804000.i2c: i2c transfer timed out
Mar 02 23:35:06 raspberrypi kernel: i2c-bcm2835 20804000.i2c: i2c transfer timed out
Mar 02 23:35:07 raspberrypi kernel: i2c-bcm2835 20804000.i2c: i2c transfer timed out

Running "i2cdetect -y 1" takes about 2 seconds to output each cell, and the I2C bus seems hung (forever?). I need to power cycle the board to get it working again (a simple reboot didn't fix it). Is it possible that my issue is this same issue?

> I've pushed a patch to rpi-5.10.y that modifies the interrupt handling in a way that should make it more tolerant to interrupts from the other I2C bus (they all share an IRQ line), delayed interrupts etc. In my testing it's made a massive difference, to the extent that I've not seen a failure yet despite an artificially high I2C load.

@pelwell how do I know if that patch made its way into my Raspbian OS? I'm running kernel

Linux raspberrypi 6.1.0-rpi7-rpi-v6 #1 Raspbian 1:6.1.63-1+rpt1 (2023-11-24) armv6l GNU/Linux

thanks!

pelwell commented 6 months ago

The patch is not included - it was even reverted in 5.10 because it caused a regression with another device.

f18m commented 6 months ago

> The patch is not included - it was even reverted in 5.10 because it caused a regression with another device.

Ouch, thanks for the update. What's your suggested workaround for the I2C bus issues, then? force_turbo=1?