raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

I2C i2c_smbus_xfer Kernel stack is corrupted. Kernel panic. #510

Closed slavazbox closed 8 years ago

slavazbox commented 10 years ago

I'm working with SMBus via python-smbus. From time to time my Raspberry Pi Model B crashes:

rpi1 login: [85590.341926] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: c02ff248
[85590.341926] 
[85590.356866] CPU: 0 PID: 16004 Comm: python Not tainted 3.10.25+ #622
[85590.364989] [<c0013a18>] (unwind_backtrace+0x0/0xf0) from [<c0010d7c>] (show_stack+0x10/0x14)
[85590.377048] [<c0010d7c>] (show_stack+0x10/0x14) from [<c03fc1bc>] (panic+0x94/0x1e4)
[85590.388330] [<c03fc1bc>] (panic+0x94/0x1e4) from [<c001e2c8>] (__stack_chk_fail+0x10/0x14)
[85590.400234] [<c001e2c8>] (__stack_chk_fail+0x10/0x14) from [<c02ff248>] (i2c_smbus_xfer+0x590/0x5a4)
[85590.413098] [<c02ff248>] (i2c_smbus_xfer+0x590/0x5a4) from [<ffffffff>] (0xffffffff)
PANIC: stack-protector: Kernel stack is corrupted in: c02ff248

I'm using the latest released version of Raspbian. I also tried updating the firmware with rpi-update, but it made no difference.

popcornmix commented 10 years ago

Can you confirm whether this happens with the latest 3.12.23 kernel?

chrisb2 commented 10 years ago

I am having a very similar problem on the 3.12.22 kernel (the latest?). See the attached screenshot of the console when this occurs.

[Screenshot: i2c-kdb]

P33M commented 10 years ago

Can you remove the parameter kgdboc=ttyAMA0,115200 from /boot/cmdline.txt and repeat the test? This should make the full stack trace appear.
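A minimal sketch for checking and stripping a kernel parameter, assuming the command line is a single line of space-separated `key=value` tokens as in /boot/cmdline.txt. The function names and the example string are hypothetical and only illustrate the idea; writing the result back to the file is left out on purpose.

```python
def has_kernel_param(cmdline, name):
    """Return True if a parameter with key `name` is on the command line."""
    # Each token is either "key" or "key=value"; compare only the key part.
    return any(tok.split("=", 1)[0] == name for tok in cmdline.split())


def remove_kernel_param(cmdline, name):
    """Return the command line with every `name` / `name=...` token dropped."""
    kept = [tok for tok in cmdline.split() if tok.split("=", 1)[0] != name]
    return " ".join(kept)


if __name__ == "__main__":
    # Hypothetical command line, used only to exercise the helpers.
    example = "console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 rootwait"
    print(has_kernel_param(example, "kgdboc"))
    print(remove_kernel_param(example, "kgdboc"))
```

On a real system the input would be the contents of /boot/cmdline.txt, and the file should be backed up before it is edited.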

chrisb2 commented 10 years ago

Will do. Where can I find the output I see on the screen after I reboot? I have looked in various files in /var/log but could not find it.

chrisb2 commented 10 years ago

My cmdline.txt does not contain that param:

dwc_otg.lpm_enable=0 console=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline rootwait

slavazbox commented 10 years ago

This still happens with the 3.12.22+ kernel.

rpi1 login: [83002.545319] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: c0317eec
[83002.545319] 
[83002.559603] CPU: 0 PID: 2632 Comm: python Not tainted 3.12.22+ #691
[83002.567375] [<c0013ec0>] (unwind_backtrace+0x0/0xf0) from [<c0011284>] (show_stack+0x10/0x14)
[83002.578926] [<c0011284>] (show_stack+0x10/0x14) from [<c041c8f0>] (panic+0x94/0x1e4)
[83002.589672] [<c041c8f0>] (panic+0x94/0x1e4) from [<c001eac0>] (__stack_chk_fail+0x10/0x14)
[83002.601075] [<c001eac0>] (__stack_chk_fail+0x10/0x14) from [<c0317eec>] (i2c_smbus_xfer+0x58c/0x5a0)
[83002.613452] [<c0317eec>] (i2c_smbus_xfer+0x58c/0x5a0) from [<ffffffff>] (0xffffffff)
PANIC: stack-protector: Kernel stack is corrupted in: c0317eec

chrisb2 commented 10 years ago

I created a new Raspbian SD card and fully updated and upgraded it (3.12.22+ #691). My Python program, which uses smbus, still fails but does not freeze the RPi. The following appears in syslog:

Jul 25 07:44:45 raspberrypi kernel: [ 1069.029025] Unable to handle kernel paging request at virtual address bf0f5cb0
Jul 25 07:44:45 raspberrypi kernel: [ 1069.054634] pgd = d20fc000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.058826] [bf0f5cb0] *pgd=1732f811, *pte=00000000, *ppte=00000000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.076008] Internal error: Oops: 7 [#1] PREEMPT ARM
Jul 25 07:44:46 raspberrypi kernel: [ 1069.082321] Modules linked in: i2c_dev snd_bcm2835 snd_soc_wm8804 snd_soc_pcm512x snd_soc_bcm2708_i2s regmap_mmio snd_soc_core regmap_spi snd_pcm_dmaengine snd_pcm snd_page_alloc evdev regmap_i2c snd_compress joydev snd_seq snd_timer snd_seq_device leds_gpio led_class snd spi_bcm2708 [last unloaded: i2c_bcm2708]
Jul 25 07:44:46 raspberrypi kernel: [ 1069.115754] CPU: 0 PID: 2750 Comm: python Not tainted 3.12.22+ #691
Jul 25 07:44:46 raspberrypi kernel: [ 1069.123452] task: d73acb00 ti: d7274000 task.ti: d7274000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.130313] PC is at i2c_smbus_xfer+0x14/0x5a0
Jul 25 07:44:46 raspberrypi kernel: [ 1069.136206] LR is at 0xbf0f5cac
Jul 25 07:44:46 raspberrypi kernel: [ 1069.140756] pc : [<c0317974>]    lr : [<bf0f5cac>]    psr: 80000013
Jul 25 07:44:46 raspberrypi kernel: [ 1069.140756] sp : d7275e00  ip : 00000001  fp : 01691880
Jul 25 07:44:46 raspberrypi kernel: [ 1069.155086] r10: bea4b208  r9 : d7274000  r8 : d7275ed2
Jul 25 07:44:46 raspberrypi kernel: [ 1069.161763] r7 : 00000001  r6 : 00009014  r5 : c05ca008  r4 : d200a400
Jul 25 07:44:46 raspberrypi kernel: [ 1069.169754] r3 : 00000001  r2 : 00000000  r1 : 00000030  r0 : d7325800
Jul 25 07:44:46 raspberrypi kernel: [ 1069.177762] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Jul 25 07:44:46 raspberrypi kernel: [ 1069.186417] Control: 00c5387d  Table: 120fc008  DAC: 00000015
Jul 25 07:44:46 raspberrypi kernel: [ 1069.193709] Process python (pid: 2750, stack limit = 0xd72741b8)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.201231] Stack: (0xd7275e00 to 0xd7276000)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.207067] 5e00: 59580c7d 0000002b 000186a0 00000000 d7275e48 00000000 d7274000 00000000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.218272] 5e20: 00000000 00000000 d7275e48 c00ef124 d7275e48 d7275e48 d7275e48 d7275e48
Jul 25 07:44:46 raspberrypi kernel: [ 1069.229565] 5e40: d7275e48 d7275e48 00989680 00000000 d080ee80 000000f8 00000001 00000000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.240875] 5e60: 00980080 00000000 0000000b d7275ea8 0000042c 00000000 ffffffff 00000000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.252245] 5e80: c05e1070 c990a646 000000f8 d200a400 c05ca008 d7274000 00000001 d7275ed2
Jul 25 07:44:46 raspberrypi kernel: [ 1069.263718] 5ea0: d7274000 bea4b208 01691880 bf0f05d8 00000000 00000001 d7275ed2 00000003
Jul 25 07:44:46 raspberrypi kernel: [ 1069.275338] 5ec0: 00000004 b6c70001 00000001 bea4b1e4 00ff0000 00000003 00000000 c003d8dc
Jul 25 07:44:46 raspberrypi kernel: [ 1069.287103] 5ee0: 00000000 00000000 cfe86b30 000000f8 00000001 c990a646 bea4b208 d200a400
Jul 25 07:44:46 raspberrypi kernel: [ 1069.299009] 5f00: 00000000 bea4b208 d73d79a8 bf0f0758 3fd3a862 00000000 ffffffff 00000720
Jul 25 07:44:46 raspberrypi kernel: [ 1069.310995] 5f20: d72ea0a0 c00ed874 d7274000 bea4b2a8 00000001 bea4b2a8 00000008 00000000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.323061] 5f40: 00000000 c00ee5c4 00000000 00000000 00000000 00000000 bea4b2a8 d7275f78
Jul 25 07:44:46 raspberrypi kernel: [ 1069.335194] 5f60: 00000000 00000000 00000720 00000004 00000000 bea4b208 d72ea0a0 d7274000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.347321] 5f80: 00000000 c00eddfc 01691880 00000000 b6c73518 b6ada000 00000000 00000036
Jul 25 07:44:46 raspberrypi kernel: [ 1069.359442] 5fa0: c000e328 c000e180 b6c73518 b6ada000 00000004 00000720 bea4b208 bea4b1e4
Jul 25 07:44:46 raspberrypi kernel: [ 1069.371562] 5fc0: b6c73518 b6ada000 00000000 00000036 016919c8 01630050 b6c7853c 01691880
Jul 25 07:44:46 raspberrypi kernel: [ 1069.383687] 5fe0: b6c7a738 bea4b1e0 b6acffd4 b6e00f0c 60000010 00000004 00000000 00000000
Jul 25 07:44:46 raspberrypi kernel: [ 1069.395850] [<c0317974>] (i2c_smbus_xfer+0x14/0x5a0) from [<bf0f05d8>] (i2cdev_ioctl_smbus+0x168/0x238 [i2c_dev])
Jul 25 07:44:46 raspberrypi kernel: [ 1069.410116] [<bf0f05d8>] (i2cdev_ioctl_smbus+0x168/0x238 [i2c_dev]) from [<bf0f0758>] (i2cdev_ioctl+0xb0/0x1f0 [i2c_dev])
Jul 25 07:44:46 raspberrypi kernel: [ 1069.425067] [<bf0f0758>] (i2cdev_ioctl+0xb0/0x1f0 [i2c_dev]) from [<c00ed874>] (do_vfs_ioctl+0x7c/0x5cc)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.438508] [<c00ed874>] (do_vfs_ioctl+0x7c/0x5cc) from [<c00eddfc>] (SyS_ioctl+0x38/0x60)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.450707] [<c00eddfc>] (SyS_ioctl+0x38/0x60) from [<c000e180>] (ret_fast_syscall+0x0/0x30)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.463050] Code: e24dd08c e590e008 e59f5578 e59f6578 (e59ee004) 
Jul 25 07:44:46 raspberrypi kernel: [ 1069.590262] bcm2708_i2c_init_pinmode(0,0)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.597531] bcm2708_i2c_init_pinmode(0,1)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.609757] bcm2708_i2c bcm2708_i2c.0: BSC0 Controller at 0x20205000 (irq 79) (baudrate 3814)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.626862] bcm2708_i2c_init_pinmode(1,2)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.632824] bcm2708_i2c_init_pinmode(1,3)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.648562] pcm512x 1-004c: Failed to reset device: -5
Jul 25 07:44:46 raspberrypi kernel: [ 1069.662390] pcm512x: probe of 1-004c failed with error -5
Jul 25 07:44:46 raspberrypi kernel: [ 1069.684614] bcm2708_i2c bcm2708_i2c.1: BSC1 Controller at 0x20804000 (irq 79) (baudrate 3814)
Jul 25 07:44:46 raspberrypi kernel: [ 1069.733211] ---[ end trace c30e666eb14c5813 ]---

chrisb2 commented 10 years ago

I have upgraded to the latest kernel (3.12.25+ #700) using sudo rpi-update, but the problem is still present.

P33M commented 10 years ago

There are two things going on here.

Given that they can be provoked by the same usage (i2c transfer) it points to just being different manifestations of the same bug.

@chrisb2 - you may need to add kdb=off to /boot/cmdline.txt
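Both reports can be tied to the same function mechanically. The following is a minimal sketch, assuming log text in the format pasted earlier in this thread. The helper names are illustrative and are not part of any kernel tooling.

```python
import re

# "[<c0317eec>] (i2c_smbus_xfer+0x58c/0x5a0)" -> ("c0317eec", "i2c_smbus_xfer")
FRAME_RE = re.compile(r"\[<([0-9a-f]{8})>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
PANIC_RE = re.compile(r"Kernel stack is corrupted in: ([0-9a-f]{8})")
PC_RE = re.compile(r"PC is at (\w+)\+0x")


def corrupted_function(log):
    """Name the function whose address the stack-protector panic reports."""
    match = PANIC_RE.search(log)
    if match is None:
        return None
    address = match.group(1)
    # Look for a backtrace frame that carries the same address.
    for frame_address, symbol in FRAME_RE.findall(log):
        if frame_address == address:
            return symbol
    return None


def faulting_function(log):
    """Name the function from the 'PC is at ...' line of an oops."""
    match = PC_RE.search(log)
    return match.group(1) if match else None
```

Feeding it the stack-protector panic and the paging-request oops from this thread yields `i2c_smbus_xfer` in both cases, which only shows where each failure was detected, not where the corruption originated.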

chrisb2 commented 10 years ago

I am setting the I2C baud rate to 3814, as my device does not communicate correctly at faster speeds. I think the syslog lines below are the logging for that change. Are the errors in them of any significance to this problem?

Jul 26 00:19:43 raspberrypi kernel: [ 9542.305482] bcm2708_i2c bcm2708_i2c.0: BSC0 Controller at 0x20205000 (irq 79) (baudrate 3814)
Jul 26 00:19:43 raspberrypi kernel: [ 9542.362417] bcm2708_i2c_init_pinmode(1,2)
Jul 26 00:19:43 raspberrypi kernel: [ 9542.393602] bcm2708_i2c_init_pinmode(1,3)
Jul 26 00:19:43 raspberrypi kernel: [ 9542.432456] pcm512x 1-004c: Failed to reset device: -5
Jul 26 00:19:43 raspberrypi kernel: [ 9542.458576] pcm512x: probe of 1-004c failed with error -5
Jul 26 00:19:43 raspberrypi kernel: [ 9542.486322] bcm2708_i2c bcm2708_i2c.1: BSC1 Controller at 0x20804000 (irq 79) (baudrate 3814)

P33M commented 10 years ago

What i2c device are you communicating with and can you post the Python code that is being used?

chrisb2 commented 10 years ago

Device is: https://www.tindie.com/products/miceuz/i2c-soil-moisture-sensor/

Code is:

#!/usr/bin/env python

import smbus, time, sys

class Chirp:
  def __init__(self, bus=1, address=0x2f):
    self.bus_num = bus
    self.bus = smbus.SMBus(bus)
    self.address = address

  def read(self):
    ok = False
    val = None
    count = 0
    while not ok:
      try:
        # Sometimes reading raises an IOError; I don't know why.
        val = self.bus.read_byte(self.address)
        ok = True
      except IOError:
        time.sleep(0.1)
        count = count + 1
        if count > 5:
          raise
        pass
    return val

  def write(self, reg):
    count = 0
    ok = False
    while not ok:
      try:
        # Sometimes writing raises an IOError; I don't know why.
        self.bus.write_byte(self.address, reg)
        ok = True
      except IOError:
        time.sleep(0.1)
        count = count + 1
        if count > 5:
          raise

  def get_reg(self, reg):
    self.write(reg)
    time.sleep(0.1)

    b1 = self.read()
    b2 = self.read()

    # If the chirp has no data it sends 0xff;
    # use this to re-sync in case we lose values.
    t = self.read()
    while t != 0xff:
      t = self.read()
    return (b1 << 8) + b2

  def cap_sense(self):
    return self.get_reg(0)

  def temp(self):
    return self.get_reg(5)

  def light(self):
    self.write(3)
    time.sleep(1.5)
    return self.get_reg(4)

  def __repr__(self):
    return "<Chirp sensor on bus %d, addr %d>" % (self.bus_num, self.address)

if __name__ == "__main__":
  addr = 0x20
  if len(sys.argv) == 2:
    if sys.argv[1].startswith("0x"):
      addr = int(sys.argv[1], 16)
    else:
      addr = int(sys.argv[1])
  chirp = Chirp(1, addr)

  while True:
    print "cap", chirp.cap_sense()
    time.sleep(1)
    print

P33M commented 10 years ago

What does the marking on the top of the chip say?

I need to go find a datasheet for it.

chrisb2 commented 10 years ago

Schematic and BOM here:

https://github.com/Miceuz/PlantWateringAlarm/blob/release/sensor/schematics.png https://github.com/Miceuz/PlantWateringAlarm/blob/release/sensor/bom.csv

It looks like the chip is an ATtiny44A.

chrisb2 commented 10 years ago

My Python script is now running OK (24 hours so far). I was altering the I2C baud rate with the following command after logging in with PuTTY:

sudo modprobe -r i2c_bcm2708 && sudo modprobe i2c_bcm2708 baudrate=3814

I have now changed to setting it at boot time by altering /etc/modprobe.d/i2c.conf to:

options i2c_bcm2708 baudrate=3814
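One way to confirm that the option took effect is to read the baud rate back from the driver's probe message in the kernel log. The sketch below assumes the message format seen in the syslog excerpts in this thread; the function name is hypothetical.

```python
import re

# Matches e.g.
# "bcm2708_i2c.1: BSC1 Controller at 0x20804000 (irq 79) (baudrate 3814)"
PROBE_RE = re.compile(
    r"bcm2708_i2c\.(\d+): BSC\d+ Controller at 0x[0-9a-f]+ "
    r"\(irq \d+\) \(baudrate (\d+)\)"
)


def probed_baudrates(log):
    """Map bus number -> baud rate from the driver's probe messages.

    If a bus was probed more than once (driver reloaded), the last
    message in the log wins.
    """
    return {int(bus): int(rate) for bus, rate in PROBE_RE.findall(log)}
```

The input can be the text of `dmesg` or of the syslog file. A bus reporting anything other than 3814 would mean the module was loaded without the option.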

P33M commented 10 years ago

If you repeatedly remove/reinsert the driver, prior to running the sensor code, does the crash take less time to occur?

One thing to do would be to run the script for one loop, then remove and reinsert the driver. Repeat this to see if the problem is worse/better.
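The procedure above can be sketched as a small harness. This is only an outline under stated assumptions: `read_once` is a hypothetical callable that performs one sensor reading (for example, one pass of the script's main loop), the modprobe commands are the ones quoted earlier in the thread, and `dry_run` defaults to True so nothing is unloaded unless explicitly requested.

```python
import subprocess

DRIVER = "i2c_bcm2708"


def reload_commands(baudrate):
    """Commands that remove and reinsert the driver at the given baud rate."""
    return [
        ["modprobe", "-r", DRIVER],
        ["modprobe", DRIVER, "baudrate=%d" % baudrate],
    ]


def run_trial(read_once, cycles, baudrate=3814, dry_run=True):
    """Alternate one sensor read with one driver reload, `cycles` times.

    Returns a list of ("read", value) and ("cmd", command) entries so the
    sequence of steps leading up to a crash can be inspected afterwards.
    """
    log = []
    for _ in range(cycles):
        log.append(("read", read_once()))
        for cmd in reload_commands(baudrate):
            if not dry_run:
                # Needs root; raises CalledProcessError if modprobe fails.
                subprocess.run(cmd, check=True)
            log.append(("cmd", " ".join(cmd)))
    return log


if __name__ == "__main__":
    # Dry run with a stand-in reader; no driver is touched.
    for entry in run_trial(lambda: 0, cycles=2):
        print(entry)
```

Comparing how many cycles complete before a crash against a run with no reloads would indicate whether reloading the driver makes the problem worse.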

Ruffio commented 8 years ago

@popcornmix please consider closing this issue, as there has been no response to the question asked on 27 Jul 2014.