raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/
Other
11.13k stars 4.98k forks source link

Issue with i2c-bcm2835 module with I2C devices? #3064

Open shawaj opened 5 years ago

shawaj commented 5 years ago

When trying the PiJuice on Pi 4B using Raspbian Buster out of the box we get a COMMUNICATION ERROR in the PiJuice software although i2cdetect reported the usual addresses (0x14 and 0x68).

Turned out we had to use the older i2c-bcm2708 module instead of the default i2c-bcm2835 module.

With 'dtoverlay=i2c-bcm2708' in /boot/config.txt both the PiJuice HAT and Zero work.

On the Pi 3B+ the PiJuice works fine with the default i2c-bcm2835 module.

Do not know what the differences are in the new BCM2711 SoC w.r.t. I2C hardware.

Is there any known issues that might be causing this?

pelwell commented 5 years ago

We're not aware of any major changes in the I2C block, although it has been mentioned that clock stretching should work now, however yours isn't the first report suggesting there may be a problem.

Which devices does PiJuice use? Do you know if clock-stretching is involved? Can you get more detailed failure state than "COMMUNICATION ERROR"?

shawaj commented 5 years ago

copying in @tvoverbeek and @mmilann

The communication error is in lines 94 and 110 of this script - https://github.com/PiSupply/PiJuice/blob/f571b4384aea02c75073204db5213538a9c02fbf/Software/Source/pijuice.py#L94

The I2C initialisation is here - https://github.com/PiSupply/PiJuice/blob/f571b4384aea02c75073204db5213538a9c02fbf/Software/Source/pijuice.py#L17

We are performing more testing on our side and will report back any further findings. Copying @ryanteck and @ChristopherRush also

pelwell commented 5 years ago

Could you retest with the latest rpi-update kernel + firmware (and without your overlay workaround)? There have been some clocking changes which may have solved the problem. If not, I'll need to get hold of a PiJuice.

shawaj commented 5 years ago

@pelwell what's your address? Raspberry Pi office?

tvoverbeek commented 5 years ago

All this on Pi4B with i2c_bcm2835 I turned on debugging of i2c_bcm2835 (debug=3). This is the output of a failing transaction with the official kernel (Linux raspberrypi 4.19.50-v7l+ #895 SMP Thu Jun 20 16:03:42 BST 2019 armv7l GNU/Linux);

Jul 12 14:03:04 raspberrypi kernel: [  113.355057] bcm2835_i2c_xfer: msg(1/2) write addr=0x14, len=1 flags= [i2c1]
Jul 12 14:03:04 raspberrypi kernel: [  113.355073] bcm2835_i2c_xfer: msg(2/2) read addr=0x14, len=2 flags= [i2c1]
Jul 12 14:03:04 raspberrypi kernel: [  113.360123] start_transfer: msg(1/2) write addr=0x14, len=1 flags= [i2c1]
Jul 12 14:03:04 raspberrypi kernel: [  113.360137] isr: remain=1, status=0x10000055 : TA TXW TXD TXE  [i2c1]
Jul 12 14:03:04 raspberrypi kernel: [  113.360150] start_transfer: msg(2/2) read addr=0x14, len=2 flags= [i2c1]
Jul 12 14:03:04 raspberrypi kernel: [  113.360162] isr: remain=2, status=0x232 : DONE TXD RXD CLKT  [i2c1]
Jul 12 14:03:04 raspberrypi kernel: [  113.360179] i2c-bcm2835 fe804000.i2c: i2c transfer failed: 200

After rpi-update (Linux raspberrypi 4.19.57-v7l+ #1244 SMP Thu Jul 4 18:48:07 BST 2019 armv7l GNU/Linux) transactions still failing but different error code (ERR added on read transaction):

Jul 12 14:15:26 raspberrypi kernel: [  108.806829] bcm2835_i2c_xfer: msg(1/2) write addr=0x14, len=1 flags= [i2c1]
Jul 12 14:15:26 raspberrypi kernel: [  108.806844] bcm2835_i2c_xfer: msg(2/2) read addr=0x14, len=2 flags= [i2c1]
Jul 12 14:15:26 raspberrypi kernel: [  108.809903] start_transfer: msg(1/2) write addr=0x14, len=1 flags= [i2c1]
Jul 12 14:15:26 raspberrypi kernel: [  108.809919] isr: remain=1, status=0x10000055 : TA TXW TXD TXE  [i2c1]
Jul 12 14:15:26 raspberrypi kernel: [  108.809932] start_transfer: msg(2/2) read addr=0x14, len=2 flags= [i2c1]
Jul 12 14:15:26 raspberrypi kernel: [  108.809944] isr: remain=2, status=0x332 : DONE TXD RXD ERR CLKT  [i2c1]
Jul 12 14:15:26 raspberrypi kernel: [  108.809961] i2c-bcm2835 fe804000.i2c: i2c transfer failed: 300

The PiJuice has a ST Micro STM32F030CCT6 MCU which handles the I2C traffic. In the meantime we also have a changed firmware for the PiJuice with changed I2C timing. Will test that too and report later.

pelwell commented 5 years ago

@shawaj Yes, the new office on Cowley Road.

pelwell commented 5 years ago

@tvoverbeek Do you have an equivalent trace from a Pi3 showing what success looks like?

tvoverbeek commented 5 years ago

Yes, will post that before testing the changed firmware

tvoverbeek commented 5 years ago

Trace with successful transaction from a Pi3B, kernel (Linux rpi-pijuice-ref 4.19.57-v7+ #1244 SMP Thu Jul 4 18:45:25 BST 2019 armv7l GNU/Linux) Note: same kernel as after rpi-update on Pi4B, but this was after a normal apt-get update/upgrade yesterday.

Jul 12 13:39:35 rpi-pijuice-ref kernel: [  217.867250] bcm2835_i2c_xfer: msg(1/2) write addr=0x14, len=1 flags= [i2c1]
Jul 12 13:39:35 rpi-pijuice-ref kernel: [  217.867266] bcm2835_i2c_xfer: msg(2/2) read addr=0x14, len=2 flags= [i2c1]
Jul 12 13:39:35 rpi-pijuice-ref kernel: [  217.868248] start_transfer: msg(1/2) write addr=0x14, len=1 flags= [i2c1]
Jul 12 13:39:35 rpi-pijuice-ref kernel: [  217.868259] isr: remain=1, status=0x10000055 : TA TXW TXD TXE  [i2c1]
Jul 12 13:39:35 rpi-pijuice-ref kernel: [  217.868269] start_transfer: msg(2/2) read addr=0x14, len=2 flags= [i2c1]
Jul 12 13:39:35 rpi-pijuice-ref kernel: [  217.868278] isr: remain=2, status=0x32 : DONE TXD RXD  [i2c1]
tvoverbeek commented 5 years ago

On Pi4B after apt-get update/upgrade now with kernel Linux raspberrypi 4.19.57-v7l+ #1244 SMP Thu Jul 4 18:48:07 BST 2019 armv7l GNU/Linux and i2c_bcm2835 and updated PiJuice firmware PiJuice-V1.4_2019_07_11.elf.binary success:

Jul 12 15:16:34 raspberrypi kernel: [  144.777487] bcm2835_i2c_xfer: msg(1/2) write addr=0x14, len=1 flags= [i2c1]
Jul 12 15:16:34 raspberrypi kernel: [  144.777503] bcm2835_i2c_xfer: msg(2/2) read addr=0x14, len=2 flags= [i2c1]
Jul 12 15:16:34 raspberrypi kernel: [  144.778686] start_transfer: msg(1/2) write addr=0x14, len=1 flags= [i2c1]
Jul 12 15:16:34 raspberrypi kernel: [  144.778700] isr: remain=1, status=0x10000055 : TA TXW TXD TXE  [i2c1]
Jul 12 15:16:34 raspberrypi kernel: [  144.778713] start_transfer: msg(2/2) read addr=0x14, len=2 flags= [i2c1]
Jul 12 15:16:34 raspberrypi kernel: [  144.778725] isr: remain=2, status=0x32 : DONE TXD RXD  [i2c1]

So it seems we can solve it on our side. Still curious why it worked on Pi3B and not on a Pi4B with the i2c_bcm2835 module. Timing issue?

pelwell commented 5 years ago

The CLKT status flag indicates a clock-stretch timeout - the slave has held SCL low for too long. A mythical clock-stretch timeout was fixed for 2711, but apparently without requiring any driver changes.

Another factor pointing towards the timeout is the fact that the old bcm2708 driver (which initialises the TOUT register to a value representing 35ms) works, while the bcm2835 driver (which doesn't change the value from the default of 40 clock cycles) fails. 40 clocks on a 1MHz bus is only 40us - what clock speed do you run at for PiJuice, and what is the maximum clock-stretch you might expect?

At the very least the bcm2835 driver should also initialise the TOUT register to 35ms - I have no idea why it doesn't already.

pelwell commented 5 years ago

Ah - there's a minor bug in the old driver:

    clk_tout = 35/1000*baud; //35ms timeout as per SMBus specs.
    if (clk_tout > 0xffff)
        clk_tout = 0xffff;

Can you spot it?

Hint: the result is to disable timeouts completely.

tvoverbeek commented 5 years ago

The clk_tout = 35/1000*baud; //35ms timeout as per SMBus specs. will result in a clk_tout of 0 (integer division)

pelwell commented 5 years ago

There's a pull request here (https://github.com/raspberrypi/linux/pull/3068), in case any kernel builders have a spare moment test it with their devices before we merge it.

lategoodbye commented 5 years ago

There has been an attempt to upstream a similiar fix: https://www.spinics.net/lists/linux-i2c/msg33661.html

Can anyone who is affected also test this patch?

pelwell commented 5 years ago

The PiJuice arrived - thanks - and I've installed the pijuice-gui package. Running on Pi4 with the latest rpi-update kernel, all seems to be well.

tvoverbeek commented 5 years ago

@pelwell Phil. is that the 4.19.58 kernel from yesterday? It was definitely not OK on 4.19.57

pelwell commented 5 years ago

Yes - you want the 4.19.58 kernel, because the clock-stretch timeout patch was in the same update.

tvoverbeek commented 5 years ago

Just ran rpi-update. So with the 4.19.58 kernel and the PiJuice 1.3 firmware I get no errors in the kernel log, but the received data from the PiJucie is corrupted. (There is a checksum as the last byte in each message) Earlier I build your fixed i2c-bcm2835 module myself with the 4.19.57 kernel and got the same corruption. So something is still rotten somewhere. I am wondering if this bit of code in i2c-bcm2835.c has something to do with it:

/*
 * This HW was reported to have problems with clock stretching:
 * http://www.advamation.com/knowhow/raspberrypi/rpi-i2c-bug.html
 * https://www.raspberrypi.org/forums/viewtopic.php?p=146272
 */
static const struct i2c_adapter_quirks bcm2835_i2c_quirks = {
        .flags = I2C_AQ_NO_CLK_STRETCH,
};

I understand the new SoC has the clock stretch bug fixed. With the i2c-bcm2708 module on the 4.19.58 kernel everything is fine.

Did you check the PiJuice GUI or only the debug output of the i2c-bcm2835 module? The debug output shows no errors with the 35ms timeout change.

pelwell commented 5 years ago

I used the GUI enough to convince me it was working - I could see the battery charge increasing, for example. However, on the first attempt on a Pi 4 I could see that all the responses from the device were 0xFF, at if the bus was stuck high. It was only after trying on a 3B+ and switching back to the 4 that it worked, so I don't know what changed.

tvoverbeek commented 5 years ago

I happen to have 3 RPi4B (1, 2 and 4GB, Thanks PiSupply) and tried with all of them. All 3 when using the 1.3 firmware on the 4.19.58 kernel have failed transactions (DATA CORRUPTED) on the GUI. Some of the transactions however are successful. When looking with a protocol analyzer the problem seems to be when using read_i2c_block_data. This is done in 2 transactions: First: address, command, Second: address, get data from slave. The Ack on the address of the second transaction fails, data remains high and clock stops. After a pause the clock resumes and data words are read with data high (-> 0xFF). @pelwell What version of the firmware is on your PiJuice? My PiJuice is the original KickStarter one. I assume you got one of the new batch from PiSupply.

shawaj commented 5 years ago

@tvoverbeek yes I sent @pelwell one of the units that arrived this week

pelwell commented 5 years ago

I'm on firmware 1.3 and I'm not seeing any problems with a recent kernel. As I mentioned earlier it did fail the first time I tried, but since then it's been fine.

ryanteck commented 5 years ago

Just to add to this: I've been working on a different product of ours that also uses I2C with an ATMEGA328 and arduino language / libraries etc as the slave.

As the host device can be either a micro:bit, Arduino or Raspberry Pi, and usually reads 4 bytes X 3 in a row. On the micro:bit and arduino I'm not experiencing any issues at all.

However on the Raspberry Pi I experience issues with the normal driver (and ran rpi-update just now which did improve it but not perfect). I have a python program that loops every second to perform two read_i2c_block_data in a row. With the older driver I've had it not fail at all (Over 500 repetitions of the loop). However with the new driver this can be as little as 10 repetitions for it to crash (assuming it's even worked to begin with) , the best I've had after the rpi-update is around 100. Sometimes it'll work but the data responded from the two read commands will be the same despite requesting different registers.

This is on a Model 2.

lategoodbye commented 5 years ago

@tvoverbeek Regarding to the I2C_AQ_NO_CLK_STRETCH quirk, i submitted a patch to linux-i2c, which will hopefully be merged for Linux 5.4. But i don't expect any real changes because i'm not aware of any I2C driver which cares about this quirk.

asahajit commented 4 years ago

I think my issue is related to this. got issue in RPI 3b+
it was working fine for the last few days. tried to restart no luck What I am doing: (Reading Arduino(Slave) data via I2C to PI(Master) ) Aug 13 14:14:18 piproject python3[1425]: ERROR : [Errno 121] Remote I/O error Aug 13 14:14:20 piproject python3[1425]: ERROR : [Errno 121] Remote I/O error Aug 13 14:14:21 piproject python3[1425]: ERROR : [Errno 121] Remote I/O error Aug 13 14:14:11 piproject python3[1425]: ERROR : [Errno 121] Remote I/O error Aug 13 14:14:22 piproject python3[1425]: ERROR : [Errno 121] Remote I/O error Aug 13 14:14:22 piproject python3[1425]: ERROR : [Errno 121] Remote I/O error Aug 13 14:14:23 piproject python3[1425]: ERROR : [Errno 121] Remote I/O error kernel: [173138.503253] i2c-bcm2835 3f804000.i2c: i2c transfer timed out

Update: After I reloaded the same program to Arduino it worked fine.

rene54321 commented 3 years ago

Hi,

I am using an fresh installed RPi4B with your PiJuice and get the following failures in dmesg: [17565.248606] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [17570.368665] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [17584.288837] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [30585.249280] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [30590.369359] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [30603.489521] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [42248.352864] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [42253.472928] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [42267.393102] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [42276.513222] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [58842.957253] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [58848.077322] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [58861.997473] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [58886.157754] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [58891.277827] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [58905.198042] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [69088.764142] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [69093.884238] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [69107.804378] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [69116.924509] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [69276.686472] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [69281.806547] i2c-bcm2835 fe804000.i2c: i2c transfer timed out [69295.726700] i2c-bcm2835 fe804000.i2c: i2c transfer timed out

Is this an known bug? Do you have an idea to fix this?

best regards René

6by9 commented 3 years ago

[17565.248606] i2c-bcm2835 fe804000.i2c: i2c transfer timed out would normally imply excess clock stretching due to SCL being tied to ground.

Does sudo i2cdetect -y 1 run sensibly, ie complete in about a second, and report some but not all addresses as being active? You may need to install i2c-tools, and/or load i2c-dev in order to be able to run i2cdetect.

rene54321 commented 3 years ago

i2cdetect runs very fast and gives the following result:

openhabian@openHABian-RPi4:~ $ i2cdetect -y 1 0 1 2 3 4 5 6 7 8 9 a b c d e f 00: -- -- -- -- -- -- -- -- -- -- -- -- -- 10: -- -- -- -- 14 -- -- -- -- -- -- -- -- -- -- -- 20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 60: -- -- -- -- -- -- -- -- UU -- -- -- -- -- -- -- 70: -- -- -- -- -- -- -- -- openhabian@openHABian-RPi4:~ $

shawaj commented 3 years ago

I am using an fresh installed RPi4B with your PiJuice and get the following failures in dmesg:

For PiJuice enquiries you can also go to the PiJuice github https://github.com/PiSupply/PiJuice/issues

shawaj commented 3 years ago

@rene54321 ah I just saw you also opened an issue at https://github.com/PiSupply/PiJuice/issues/634 already

rene54321 commented 3 years ago

Yes, hopefully someone has an solution.

pelwell commented 3 years ago

I've pushed a patch to rpi-5.10.y that modifies the interrupt handling in a way that should make it more tolerant to interrupts from the other I2C bus (they all share an IRQ line), delayed interrupts etc. In my testing it's made a massive difference, to the extent that I've not seen a failure yet despite an artificially high I2C load.

popcornmix commented 3 years ago

rpi-update contains the potential fix for this. Please test and report.

rene54321 commented 3 years ago

Hi,

what is needed to get this update only sudo rpi-update?

best regards René

pelwell commented 3 years ago

Yes. Eventually it will be available using "apt upgrade", but not yet.

rene54321 commented 3 years ago

Hi,

was not working for me. After the update and changing the settings back to ondemand cpu governor the failure appeared again. So I went back to "force_turbo = 1" for now.

best regards René

tvoverbeek commented 3 years ago

@pelwell Phil, did some testing. Pi4B V1.1 2GB with PiJuice HAT. No other peripherals.

Ran your i2c load script, pijuice service active with decreased polling time (to increase i2c load) and watching a 14 minutes Raspberry pi pico video on youtube via VNC (Pi4B has no display connected).

With kernel 5.10.11 I get plenty of i2c connection timed-out errors for the RTC. With the 5.10.14 kernel (after rpi-update) no i2c connection time-outs. So for me it seems your patch solved at least this problem (i2c time-out under load) @rene54321 Are you sure you were running with kernel 5.10.14 which contains the patch?

rene54321 commented 3 years ago

Hi,

I am running the following version:

Linux openHABian-RPi4 5.10.14-v7l+

best regards René