open-power / skiboot

OPAL boot and runtime firmware for POWER
Apache License 2.0

Spurious I2C timeouts under heavy skiboot ipmi/sensor load #88

Open rlippert opened 7 years ago

rlippert commented 7 years ago

While running one loop that reads DIMM sensors via the opal-i2c driver and another that reads other sensors via in-band IPMI, it is easy to trigger both I2C timeouts and IPMI timeouts by attempting to read /dev/mtd0:

<start loop reading DIMM sensors via I2C>
<start loop reading IPMI sensors via ipmitool>
fden257:~# dd if=/dev/mtd0 of=/dev/null bs=64k
[ 8106.436800] ipmi-powernv: no current message?
[ 8106.491264] ipmi-powernv: no current message?
[ 8106.570577] ipmi-powernv: no current message?
[ 8172.300258400,3] I2C: Request timeout !
[ 8172.308479344,3] I2C: Chip 00000008 Eng. 3 Port 1--
 xscom_base=0x00000000000a3000  state=2 bytes_sent=0
[ 8172.309013104,3] I2C: Request info--
 addr=0x001a    offset_bytes=1  offset=5        len=2
[ 8172.310033424,3] I2C:  start_time=000003ce3f5dcfe0 end_time=000003ce3f8b9000 (duration=00000000002dc020)
[ 8172.310577728,3] I2C: Register dump--
    cmd:0xc134000101000000      mode:0x004d040001000000 stat:0x0101970001040000
  estat:0x080055d701000000      intm:0x0000ff8001000000 intc:0x0000014601000000
[ 8172.332008800,3] I2C: Request timeout !
[ 8172.337150288,3] I2C: Chip 00000008 Eng. 3 Port 1--
 xscom_base=0x00000000000a3000  state=2 bytes_sent=0
[ 8172.342784384,3] I2C: Request info--
 addr=0x001e    offset_bytes=1  offset=5        len=2
[ 8172.351517872,3] I2C:  start_time=000003ce410656a0 end_time=000003ce41700b00 (duration=000000000069b460)
[ 8172.353472064,3] I2C: Register dump--
    cmd:0xc13c000101000000      mode:0x004d040001000000 stat:0x0101970001040000
  estat:0x080055d701000000      intm:0x0000ff8001000000 intc:0x0000014601000000
[ 8172.372952480,3] I2C: Request timeout !
[ 8172.375033552,3] I2C: Chip 00000008 Eng. 3 Port 0--
 xscom_base=0x00000000000a3000  state=3 bytes_sent=0
[ 8172.375327792,3] I2C: Request info--
 addr=0x001a    offset_bytes=1  offset=5        len=2
[ 8172.377790368,3] I2C:  start_time=000003ce435f0650 end_time=000003ce43e0cc10 (duration=000000000081c5c0)
[ 8172.377864048,3] I2C: Register dump--
    cmd:0xd135000202000000      mode:0x004d000002000000 stat:0x02018c0202040000
  estat:0x0800dd1702000000      intm:0x0000ff8002000000 intc:0x0000022a02000000
[ 8172.402188816,3] I2C: Request timeout !
[ 8172.402229248,3] I2C: Chip 00000008 Eng. 3 Port 0--
 xscom_base=0x00000000000a3000  state=3 bytes_sent=0
[ 8172.402299712,3] I2C: Request info--
 addr=0x001e    offset_bytes=1  offset=5        len=2
[ 8172.402358384,3] I2C:  start_time=000003ce44d2e6b0 end_time=000003ce459ee370 (duration=0000000000cbfcc0)
[ 8172.402446352,3] I2C: Register dump--
    cmd:0xd13d000202000000      mode:0x004d000002000000 stat:0x02018c0202040000
  estat:0x0800dd1702000000      intm:0x0000ff8002000000 intc:0x0000022a02000000
[ 8108.343266] ipmi-powernv: no current message?
[ 8108.399516] ipmi-powernv: no current message?
[ 8109.879026] ipmi-powernv: no current message?
[ 8110.158609] ipmi-powernv: no current message?
[ 8110.241578] ipmi-powernv: no current message?
[ 8111.716242] ipmi-powernv: no current message?
[ 8112.025161] ipmi-powernv: no current message?
[ 8113.484807] ipmi-powernv: no current message?
[ 8117.107408] ipmi-powernv: no current message?
[ 8117.212644] ipmi-powernv: no current message?
[ 8118.987664] ipmi-powernv: no current message?
[ 8119.013870] ipmi-powernv: no current message?
[ 8120.355460] ipmi-powernv: no current message?
[ 8121.883551] ipmi-powernv: no current message?
[ 8122.018948] ipmi-powernv: no current message?
[ 8123.423452] ipmi-powernv: no current message?
[ 8125.246948] ipmi-powernv: no current message?
[ 8128.787041] ipmi-powernv: no current message?
[ 8128.866536] ipmi-powernv: no current message?
[ 8130.543608] ipmi-powernv: no current message?
[ 8130.650066] ipmi-powernv: no current message?
[ 8130.754808] ipmi-powernv: no current message?
[ 8130.807827] ipmi-powernv: no current message?
[ 8132.345018] ipmi-powernv: no current message?
[ 8132.423644] ipmi-powernv: no current message?
[ 8133.990729] ipmi-powernv: no current message?
[ 8134.060351] ipmi-powernv: no current message?
[ 8136.566715] ipmi-powernv: no current message?
[ 8139.184069] ipmi-powernv: no current message?
[ 8139.238552] ipmi-powernv: no current message?
[ 8140.979470] ipmi-powernv: no current message?
[ 8146.350650] ipmi-powernv: no current message?
[ 8146.430811] ipmi-powernv: no current message?
[ 8148.107735] ipmi-powernv: no current message?
[ 8150.574580] ipmi-powernv: no current message?
[ 8150.630233] ipmi-powernv: no current message?
[ 8150.684337] ipmi-powernv: no current message?
1024+0 records in
1024+0 records out
67108864 bytes (67 MB) copied, 44.6075 s, 1.5 MB/s

The I2C timeouts look spurious: they appear to be caused by skiboot not processing the requests in time rather than by a hardware fault.
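
To put the timeouts in perspective, the hexadecimal duration= fields in the log above can be converted to milliseconds. This is only a minimal sketch: it assumes the values are timebase ticks and that the timebase runs at 512 MHz (the usual POWER8 value; the skiboot timestamps in the log look consistent with that, but it is an assumption here). The names TB_HZ and duration_ms are made up for this example and are not part of skiboot.

```python
import re
from typing import Optional

# Assumption: the timebase ticks at 512 MHz (typical for POWER8).
TB_HZ = 512_000_000

# Matches the "(duration=<hex>)" field printed by the I2C timeout dump.
DURATION_RE = re.compile(r"duration=([0-9a-fA-F]+)")


def duration_ms(line: str) -> Optional[float]:
    """Return the duration field of a log line in milliseconds, or None."""
    match = DURATION_RE.search(line)
    if match is None:
        return None
    ticks = int(match.group(1), 16)
    return ticks * 1000 / TB_HZ


# The four duration lines from the log above, shortened to the relevant part.
SAMPLE_LINES = [
    "I2C:  start_time=000003ce3f5dcfe0 end_time=000003ce3f8b9000 (duration=00000000002dc020)",
    "I2C:  start_time=000003ce410656a0 end_time=000003ce41700b00 (duration=000000000069b460)",
    "I2C:  start_time=000003ce435f0650 end_time=000003ce43e0cc10 (duration=000000000081c5c0)",
    "I2C:  start_time=000003ce44d2e6b0 end_time=000003ce459ee370 (duration=0000000000cbfcc0)",
]

for sample in SAMPLE_LINES:
    print(f"{duration_ms(sample):.2f} ms")
```

Under that assumption, the four requests above were declared timed out after roughly 5.9, 13.5, 16.6 and 26.1 ms respectively.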

gery-toulouse commented 6 years ago

Please retest with a recent version.