turing-machines / BMC-Firmware

Turing-pi BMC firmware
http://turingpi.com
GNU General Public License v2.0

problem with jumbo frames / MTU #179

Open grubFX opened 10 months ago

grubFX commented 10 months ago

Describe the bug The BMC on my TuringPI2 becomes unresponsive as soon as jumbo frames are enabled on my network. Pings went through for ~10 to at most ~40 seconds before the BMC disappeared from the network. On some tries I managed to log in, but it still crashed after roughly the same amount of time.

To Reproduce Steps to reproduce the behavior:

  1. enable jumbo frames on your network
  2. ping your turingPI BMC IP
  3. (or if you're fast enough: try to log into the BMC)
  4. watch it become unresponsive after ~10-40 seconds (in my case)

Expected behavior It would be nice if the BMC also worked on a network with jumbo frames enabled.

Versions tested on bmc version=2.0.5

Additional context As an easy workaround I disabled jumbo frames on my network and rebooted the TuringPi2 board, which worked. But as soon as I turn jumbo frames back on, the BMC crashes again.

peter64m commented 3 months ago

I can confirm that I have the same problem. I can reproduce the panic in the BMC with a unicast or broadcast ping using a 9000-byte buffer and no fragmentation. Windows:

PS C:\Users\peter> ping -f -l 9000 turingpi.lan

It has happened every time.
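
As a rough cross-check, the ping payload size can be related to the `len:9042` value in the console output below. This is a minimal sketch that assumes an IPv4 ICMP echo request with no IP options and an untagged Ethernet header (no VLAN tag, frame check sequence not counted). The function name `frame_length` is only illustrative.

```python
# Header sizes assumed for an untagged IPv4 ICMP echo request.
ICMP_HEADER = 8       # ICMP echo header
IPV4_HEADER = 20      # IPv4 header without options
ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType


def frame_length(icmp_payload: int) -> int:
    """Return the Ethernet frame length (without FCS) for a ping payload."""
    return icmp_payload + ICMP_HEADER + IPV4_HEADER + ETHERNET_HEADER


# "ping -l 9000" sends a 9000-byte ICMP payload.
print(frame_length(9000))
```

Under these assumptions a 9000-byte payload gives a 9042-byte frame, which is the length reported in the `skb_over_panic` line.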

I am building v2.1.0-RC2 to compare the behaviour.

Network interfaces file:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp
  pre-up /etc/network/nfs_check
  wait-delay 15
  hostname $(hostname)

Console output after receiving a jumbo ping:

 _____ _   _ ____  ___ _   _  ____
|_   _| | | |  _ \|_ _| \ | |/ ___|
  | | | | | | |_) || ||  \| | |  _
  | | | |_| |  _ < | || |\  | |_| |
  |_|  \___/|_| \_\___|_| \_|\____|

Welcome to Turing Pi
turingpi login: 2024-08-29T10:25:29.395Z INFO  [bmcd] Turing Pi 2 BMC Daemon v2.0.5
[  188.690107] skbuff: skb_over_panic: text:0913898d len:9042 put:9042 head:f1867692 data:14d1e50c tail:0xc5d8fc94 end:0xc5d8e180 dev:eth0
[  188.703817] ------------[ cut here ]------------
[  188.708998] kernel BUG at net/core/skbuff.c:109!
[  188.714174] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[  188.720714] Modules linked in:
[  188.724143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.61 #1
[  188.730780] Hardware name: Generic DT based system
[  188.736162] PC is at skb_panic+0x40/0x4c
[  188.740560] LR is at skb_panic+0x40/0x4c
[  188.744957] pc : [<c04eb2d8>]    lr : [<c04eb2d8>]    psr: 60070113
[  188.751985] sp : c0901e28  ip : 00000001  fp : 000000ec
[  188.757843] r10: 00000000  r9 : 00000040  r8 : 00002352
[  188.763701] r7 : c5d8e180  r6 : c5d8fc94  r5 : c5d8d942  r4 : c5d8d900
[  188.771024] r3 : 00000000  r2 : 00000000  r1 : 00000004  r0 : 0000007b
[  188.778347] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  188.786350] Control: 10c5387d  Table: 45f0c06a  DAC: 00000051
[  188.792798] Process swapper/0 (pid: 0, stack limit = 0xb719506f)
[  188.799532] Stack: (0xc0901e28 to 0xc0902000)
[  188.804421] 1e20:                   00002352 c5d8d900 c5d8d942 c5d8fc94 c5d8e180 c6961000
[  188.813599] 1e40: 00000005 c04e4088 c6961638 c5ce2c80 c7447ec0 c0419a84 c6961738 00000040
[  188.822776] 1e60: c0901e94 c6961638 c7ea74c0 00000001 c083e4c0 07669000 c0901e94 0000012c
[  188.831954] 1e80: c0901e9c c04fab48 00000040 ffffd485 c074c3eb c0901e94 c0901e94 c0901e9c
[  188.841131] 1ea0: c0901e9c c73048aa 00000029 c090208c 0000000c 00000008 c0901ed0 c0900000
[  188.850312] 1ec0: c083d780 c08372fc 00000101 c0102ad8 10c5387d c0157fd0 00200002 0000000a
[  188.859488] 1ee0: ffffd484 00000004 40000003 c0902080 c074c3eb c083d780 00000000 c7020400
[  188.868665] 1f00: 00000001 c082d238 c0900000 10c5387d 00000000 c011db08 00000000 c01577c8
[  188.877843] 1f20: c0901f48 c8802000 c0904138 c0901f7c c082d238 c0361ff4 c0107fd8 60070013
[  188.887020] 1f40: ffffffff c01021cc 0006c898 00000000 0006c898 c0114c40 c0900000 c0903de8
[  188.896197] 1f60: 00000001 c0903e24 c082d238 410fc075 10c5387d 00000000 00000064 c0901f98
[  188.905375] 1f80: c0107fe8 c0107fd8 60070013 ffffffff 00000051 00000000 c0900000 c013e684
[  188.914553] 1fa0: 000000cf c0941000 c0903dc0 00000000 c082d238 c013e974 c0941048 c0800ca8
[  188.923730] 1fc0: ffffffff ffffffff 00000000 c0800568 00000000 c082d238 c7354c97 00000000
[  188.932907] 1fe0: c0800330 00000051 10c0387d 00000000 47d33000 00000000 00000000 00000000
[  188.942092] [<c04eb2d8>] (skb_panic) from [<c04e4088>] (skb_put+0x44/0x50)
[  188.949812] [<c04e4088>] (skb_put) from [<c0419a84>] (geth_poll+0x10c/0x190)
[  188.957725] [<c0419a84>] (geth_poll) from [<c04fab48>] (net_rx_action+0x118/0x30c)
[  188.966225] [<c04fab48>] (net_rx_action) from [<c0102ad8>] (__do_softirq+0x1f0/0x278)
[  188.975018] [<c0102ad8>] (__do_softirq) from [<c011db08>] (irq_exit+0x78/0xd0)
[  188.983129] [<c011db08>] (irq_exit) from [<c01577c8>] (__handle_domain_irq+0x74/0xa0)
[  188.991921] [<c01577c8>] (__handle_domain_irq) from [<c0361ff4>] (gic_handle_irq+0x44/0x74)
[  189.001294] [<c0361ff4>] (gic_handle_irq) from [<c01021cc>] (__irq_svc+0x6c/0xa8)
[  189.009687] Exception stack(0xc0901f48 to 0xc0901f90)
[  189.015354] 1f40:                   0006c898 00000000 0006c898 c0114c40 c0900000 c0903de8
[  189.024531] 1f60: 00000001 c0903e24 c082d238 410fc075 10c5387d 00000000 00000064 c0901f98
[  189.033705] 1f80: c0107fe8 c0107fd8 60070013 ffffffff
[  189.039377] [<c01021cc>] (__irq_svc) from [<c0107fd8>] (arch_cpu_idle+0x1c/0x38)
[  189.047681] [<c0107fd8>] (arch_cpu_idle) from [<c013e684>] (do_idle+0xd0/0x120)
[  189.055886] [<c013e684>] (do_idle) from [<c013e974>] (cpu_startup_entry+0x18/0x1c)
[  189.064387] [<c013e974>] (cpu_startup_entry) from [<c0800ca8>] (start_kernel+0x38c/0x438)
[  189.073566] Code: e58d3014 e5903050 e59f0008 ebf1ad82 (e7f001f2)
[  189.080405] ---[ end trace bdf74f70f70f33c1 ]---
[  189.085586] Kernel panic - not syncing: Fatal exception in interrupt
[  189.092728] CPU1: stopping
[  189.095767] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D           5.4.61 #1
[  189.103963] Hardware name: Generic DT based system
[  189.109348] [<c010e390>] (unwind_backtrace) from [<c010a850>] (show_stack+0x10/0x14)
[  189.118044] [<c010a850>] (show_stack) from [<c05e432c>] (dump_stack+0x78/0x94)
[  189.126154] [<c05e432c>] (dump_stack) from [<c010c6d0>] (handle_IPI+0xcc/0x168)
[  189.134360] [<c010c6d0>] (handle_IPI) from [<c036201c>] (gic_handle_irq+0x6c/0x74)
[  189.142857] [<c036201c>] (gic_handle_irq) from [<c01021cc>] (__irq_svc+0x6c/0xa8)
[  189.151250] Exception stack(0xc7097f80 to 0xc7097fc8)
[  189.156919] 7f80: 0003eb70 00000000 0003eb70 c0114c40 c7096000 c0903de8 00000002 c0903e24
[  189.166096] 7fa0: 4000406a 410fc075 00000000 00000000 00000000 c7097fd0 c0107fe8 c0107fd8
[  189.175268] 7fc0: 600d0013 ffffffff
[  189.179187] [<c01021cc>] (__irq_svc) from [<c0107fd8>] (arch_cpu_idle+0x1c/0x38)
[  189.187490] [<c0107fd8>] (arch_cpu_idle) from [<c013e684>] (do_idle+0xd0/0x120)
[  189.195694] [<c013e684>] (do_idle) from [<c013e974>] (cpu_startup_entry+0x18/0x1c)
[  189.204188] [<c013e974>] (cpu_startup_entry) from [<40102bec>] (0x40102bec)
[  189.212004] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
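
The `skb_over_panic` line at the top of the trace already carries enough numbers to estimate how far the received frame ran past the receive buffer. The sketch below parses that line and does the arithmetic. It is an interpretation, not a confirmed diagnosis. It assumes that `tail` and `end` are raw addresses (they match registers r6 and r7 in the dump) and that the buffer was empty before the put (`len` equals `put`). The `head` and `data` fields are ignored because they do not match r4 and r5 and are presumably hashed. The function name `analyse_skb_over_panic` is illustrative.

```python
import re

# The panic line copied from the console output above.
PANIC_LINE = (
    "skbuff: skb_over_panic: text:0913898d len:9042 put:9042 "
    "head:f1867692 data:14d1e50c tail:0xc5d8fc94 end:0xc5d8e180 dev:eth0"
)

PATTERN = re.compile(
    r"len:(\d+) put:(\d+) .*?tail:0x([0-9a-fA-F]+) end:0x([0-9a-fA-F]+)"
)


def analyse_skb_over_panic(line: str) -> dict:
    """Extract sizes from an skb_over_panic line and compute the overrun."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not an skb_over_panic line")
    length = int(match.group(1))
    put = int(match.group(2))
    tail = int(match.group(3), 16)
    end = int(match.group(4), 16)
    # Assumption: len == put, so the data area starts at tail - len.
    data_start = tail - length
    return {
        "len": length,
        "put": put,
        "data": data_start,
        "overrun": tail - end,      # bytes written past the buffer end
        "room": end - data_start,   # bytes that would have fitted
    }


result = analyse_skb_over_panic(PANIC_LINE)
print(hex(result["data"]), result["room"], result["overrun"])
```

With the values from this log, the computed data address is 0xc5d8d942, which matches r5 in the register dump. The buffer had room for 2110 bytes, so the 9042-byte frame would have overrun it by 6932 bytes. This suggests the receive path in `geth_poll` sizes its buffers for standard-MTU frames, though the driver source would need to be checked to confirm that.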