openrisc / linux

Linux kernel source tree

Lockups when running network stack #10

Closed stffrdhrn closed 4 years ago

stffrdhrn commented 4 years ago

When just running Linux, every once in a while we get a complete lockup with no errors. Running in QEMU, I can see the following in the monitor:

See dumps below.

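Note that the LR (R09=c00075e0) points into _external_irq_handler, at the instruction just after the l.jalr r8 and its delay slot: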

c0007550 <_external_irq_handler>:
c0007550:       d4 01 10 08     l.sw 8(r1),r2
c0007554:       d4 01 18 0c     l.sw 12(r1),r3
c0007558:       d4 01 28 14     l.sw 20(r1),r5
...
c00075d8:       48 00 40 00     l.jalr r8
c00075dc:       15 00 00 00     l.nop 0x0
c00075e0:       00 00 03 5a     l.j c0008348 <_ret_from_exception>
c00075e4:       15 00 00 00     l.nop 0x0

c039bfd8 <do_IRQ>:
c039bfd8:       9c 21 ff f8     l.addi r1,r1,-8
c039bfdc:       d4 01 10 00     l.sw 0(r1),r2
c039bfe0:       d4 01 48 04     l.sw 4(r1),r9
c039bfe4:       1a 20 c0 8f     l.movhi r17,0xc08f
c039bfe8:       86 31 cb fc     l.lwz r17,-13316(r17)
c039bfec:       48 00 88 00     l.jalr r17
c039bff0:       9c 41 00 08     l.addi r2,r1,8
c039bff4:       85 21 00 04     l.lwz r9,4(r1)
c039bff8:       84 41 00 00     l.lwz r2,0(r1)
c039bffc:       44 00 48 00     l.jr r9
c039c000:       9c 21 00 08     l.addi r1,r1,8

c039c004 <__irqentry_text_end>:
c039c004:       00 00 00 00     l.j c039c004 <__irqentry_text_end>

PC=c039c004
R00=00000000 R01=34d08aec R02=c15b1804 R03=00000000
R04=ffffffff R05=00000000 R06=00000012 R07=0000000e
R08=0000001a R09=c00075e0 R10=c15b0000 R11=00000000
R12=0000001b R13=00000010 R14=c1076840 R15=00000005
R16=c1076840 R17=00000000 R18=c1f69758 R19=00000000
R20=00ca4800 R21=00000000 R22=00000428 R23=00000003
R24=01511e00 R25=ce63f7b2 R26=0000042c R27=22bdcb7b
R28=00000005 R29=3416c000 R30=ffffffff R31=c62b0000

A second later.

(qemu) info registers 
PC=c039c004
R00=00000000 R01=b96c226c R02=c15b1804 R03=00000000
R04=ffffffff R05=00000000 R06=00000012 R07=0000000e
R08=0000001a R09=c00075e0 R10=c15b0000 R11=00000000
R12=0000001b R13=00000010 R14=c1076840 R15=00000005
R16=c1076840 R17=00000000 R18=c1f69758 R19=00000000
R20=00ca4800 R21=00000000 R22=00000428 R23=00000003
R24=01511e00 R25=ce63f7b2 R26=0000042c R27=22bdcb7b
R28=00000005 R29=3416c000 R30=ffffffff R31=c62b0000
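
To see at a glance what differs between the two dumps, they can be diffed mechanically. The following is only a rough sketch: the helper names `parse_registers` and `changed_registers` are made up for illustration, and the regular expression assumes the `PC=...` / `Rnn=...` layout shown in the dumps above.

```python
import re

# Register dumps copied verbatim from the two QEMU monitor outputs above.
DUMP_1 = """\
PC=c039c004
R00=00000000 R01=34d08aec R02=c15b1804 R03=00000000
R04=ffffffff R05=00000000 R06=00000012 R07=0000000e
R08=0000001a R09=c00075e0 R10=c15b0000 R11=00000000
R12=0000001b R13=00000010 R14=c1076840 R15=00000005
R16=c1076840 R17=00000000 R18=c1f69758 R19=00000000
R20=00ca4800 R21=00000000 R22=00000428 R23=00000003
R24=01511e00 R25=ce63f7b2 R26=0000042c R27=22bdcb7b
R28=00000005 R29=3416c000 R30=ffffffff R31=c62b0000
"""

DUMP_2 = """\
PC=c039c004
R00=00000000 R01=b96c226c R02=c15b1804 R03=00000000
R04=ffffffff R05=00000000 R06=00000012 R07=0000000e
R08=0000001a R09=c00075e0 R10=c15b0000 R11=00000000
R12=0000001b R13=00000010 R14=c1076840 R15=00000005
R16=c1076840 R17=00000000 R18=c1f69758 R19=00000000
R20=00ca4800 R21=00000000 R22=00000428 R23=00000003
R24=01511e00 R25=ce63f7b2 R26=0000042c R27=22bdcb7b
R28=00000005 R29=3416c000 R30=ffffffff R31=c62b0000
"""

# Matches "PC=xxxxxxxx" and "Rnn=xxxxxxxx" pairs (8 hex digits each).
REG_RE = re.compile(r"\b(PC|R\d{2})=([0-9a-fA-F]{8})\b")


def parse_registers(dump):
    """Return a dict mapping register name to its integer value."""
    return {name: int(value, 16) for name, value in REG_RE.findall(dump)}


def changed_registers(before, after):
    """Return {name: (old, new)} for registers present in both dumps that differ."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


diff = changed_registers(parse_registers(DUMP_1), parse_registers(DUMP_2))
for name, (old, new) in sorted(diff.items()):
    print(f"{name}: {old:08x} -> {new:08x}")
```

For these two dumps the only difference is R01 (the stack pointer in the OpenRISC ABI); the PC and every other register are identical.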
stffrdhrn commented 4 years ago

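It looks like this is somehow stuck looping at the end of do_IRQ: the PC (c039c004, __irqentry_text_end) is the same in both dumps, and the instruction there is an l.j to itself.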
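
As a sanity check on where the CPU is spinning, the l.j words from the disassembly can be decoded by hand. This is a minimal sketch, assuming the OpenRISC 1000 encoding of l.j (opcode 0x00 in the top 6 bits, a signed 26-bit word offset relative to the jump's own address) and a core with branch delay slots, as the l.nop after each jump suggests. The function names are hypothetical.

```python
def decode_l_j(addr, word):
    """Return the target address of an l.j instruction located at addr."""
    opcode = word >> 26
    if opcode != 0x00:
        raise ValueError("not an l.j instruction")
    imm = word & 0x03FFFFFF          # 26-bit immediate field
    if imm & 0x02000000:             # sign-extend from bit 25
        imm -= 0x04000000
    return (addr + (imm << 2)) & 0xFFFFFFFF


def jalr_return_address(addr):
    """Link-register value written by l.jalr at addr (skips the delay slot)."""
    return (addr + 8) & 0xFFFFFFFF


# c00075e0: 00 00 03 5a  l.j c0008348 <_ret_from_exception>
print(hex(decode_l_j(0xC00075E0, 0x0000035A)))

# c039c004: 00 00 00 00  all-zero word, decodes as a jump to itself
print(hex(decode_l_j(0xC039C004, 0x00000000)))

# c00075d8: l.jalr r8, so the LR should hold the address after the delay slot
print(hex(jalr_return_address(0xC00075D8)))
```

Under those assumptions, the all-zero word at c039c004 decodes to a jump back to c039c004, which matches the unchanged PC, and the return address of the l.jalr r8 at c00075d8 matches R09=c00075e0.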

stffrdhrn commented 4 years ago

I turned on a bunch of debugging options in the kernel and after a while got:

/ # [16471.390000] BUG: failure at net/core/skbuff.c:2805/skb_copy_and_csum_bits()!
[16471.390000] Kernel panic - not syncing: BUG!
[16471.390000] ---[ end Kernel panic - not syncing: BUG! ]---

It's not clear if it's related.

stffrdhrn commented 4 years ago

Increasing memory to 512 MB seems to help. Closing.