turing-machines / BMC-Firmware

Turing-pi BMC firmware
http://turingpi.com
GNU General Public License v2.0

problem with jumbo frames / MTU #179

Open grubFX opened 8 months ago

grubFX commented 8 months ago

Describe the bug

The BMC on my Turing Pi 2 becomes unresponsive as soon as jumbo frames are enabled on my network. Pings go through for roughly 10 to 40 seconds, after which I can no longer see the BMC on the network. On some tries I managed to log in, but it still crashed after roughly the same amount of time.

To Reproduce

Steps to reproduce the behavior:

  1. Enable jumbo frames on your network.
  2. Ping your Turing Pi BMC IP.
  3. (Or, if you're fast enough, try to log into the BMC.)
  4. Watch it become unresponsive after ~10-40 seconds (in my case).

Expected behavior

It would be nice if the BMC also worked on a network with jumbo frames enabled.

Versions tested on

bmc version=2.0.5

Additional context

As an easy workaround, I disabled jumbo frames on my network and rebooted the Turing Pi 2 board, which worked. But as soon as I turn jumbo frames back on, the BMC crashes again.

peter64m commented 1 month ago

I can confirm that I have the same problem. I can reproduce the panic in the BMC with a unicast or broadcast ping using a 9000-byte buffer and fragmentation disabled. On Windows:

PS C:\Users\peter> ping -f -l 9000 turingpi.lan

It has happened every time.
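For anyone who wants to script the reproduction, here is a minimal sketch that builds the equivalent non-fragmenting ping command for Windows or Linux. The helper name `jumbo_ping_command` is hypothetical, and the Linux branch assumes the iputils `ping` (`-M do` prohibits fragmentation, `-s` sets the ICMP payload size). The sending interface must itself have a jumbo MTU, otherwise the ping never leaves the host.

```python
import platform


def jumbo_ping_command(host, payload=9000, count=4, system=None):
    """Return an argv list for a ping with `payload` ICMP data bytes and
    fragmentation prohibited. `system` defaults to the local platform."""
    system = system or platform.system()
    if system == "Windows":
        # -f: set Don't Fragment, -l: send buffer size, -n: number of echo requests
        return ["ping", "-f", "-l", str(payload), "-n", str(count), host]
    # Assumption: iputils ping on Linux.
    # -M do: prohibit fragmentation, -s: payload size, -c: number of echo requests
    return ["ping", "-M", "do", "-s", str(payload), "-c", str(count), host]
```

The returned list can be passed to `subprocess.run(...)`. Be aware that, per the reports in this thread, running it against an affected BMC is expected to crash it.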

I am building v2.1.0-RC2 to compare the behaviour.

Network interfaces file:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp
  pre-up /etc/network/nfs_check
  wait-delay 15
  hostname $(hostname)

Console output after receiving a jumbo ping:

 _____ _   _ ____  ___ _   _  ____
|_   _| | | |  _ \|_ _| \ | |/ ___|
  | | | | | | |_) || ||  \| | |  _
  | | | |_| |  _ < | || |\  | |_| |
  |_|  \___/|_| \_\___|_| \_|\____|

Welcome to Turing Pi
turingpi login: 2024-08-29T10:25:29.395Z INFO  [bmcd] Turing Pi 2 BMC Daemon v2.0.5
[  188.690107] skbuff: skb_over_panic: text:0913898d len:9042 put:9042 head:f1867692 data:14d1e50c tail:0xc5d8fc94 end:0xc5d8e180 dev:eth0
[  188.703817] ------------[ cut here ]------------
[  188.708998] kernel BUG at net/core/skbuff.c:109!
[  188.714174] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[  188.720714] Modules linked in:
[  188.724143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.61 #1
[  188.730780] Hardware name: Generic DT based system
[  188.736162] PC is at skb_panic+0x40/0x4c
[  188.740560] LR is at skb_panic+0x40/0x4c
[  188.744957] pc : [<c04eb2d8>]    lr : [<c04eb2d8>]    psr: 60070113
[  188.751985] sp : c0901e28  ip : 00000001  fp : 000000ec
[  188.757843] r10: 00000000  r9 : 00000040  r8 : 00002352
[  188.763701] r7 : c5d8e180  r6 : c5d8fc94  r5 : c5d8d942  r4 : c5d8d900
[  188.771024] r3 : 00000000  r2 : 00000000  r1 : 00000004  r0 : 0000007b
[  188.778347] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  188.786350] Control: 10c5387d  Table: 45f0c06a  DAC: 00000051
[  188.792798] Process swapper/0 (pid: 0, stack limit = 0xb719506f)
[  188.799532] Stack: (0xc0901e28 to 0xc0902000)
[  188.804421] 1e20:                   00002352 c5d8d900 c5d8d942 c5d8fc94 c5d8e180 c6961000
[  188.813599] 1e40: 00000005 c04e4088 c6961638 c5ce2c80 c7447ec0 c0419a84 c6961738 00000040
[  188.822776] 1e60: c0901e94 c6961638 c7ea74c0 00000001 c083e4c0 07669000 c0901e94 0000012c
[  188.831954] 1e80: c0901e9c c04fab48 00000040 ffffd485 c074c3eb c0901e94 c0901e94 c0901e9c
[  188.841131] 1ea0: c0901e9c c73048aa 00000029 c090208c 0000000c 00000008 c0901ed0 c0900000
[  188.850312] 1ec0: c083d780 c08372fc 00000101 c0102ad8 10c5387d c0157fd0 00200002 0000000a
[  188.859488] 1ee0: ffffd484 00000004 40000003 c0902080 c074c3eb c083d780 00000000 c7020400
[  188.868665] 1f00: 00000001 c082d238 c0900000 10c5387d 00000000 c011db08 00000000 c01577c8
[  188.877843] 1f20: c0901f48 c8802000 c0904138 c0901f7c c082d238 c0361ff4 c0107fd8 60070013
[  188.887020] 1f40: ffffffff c01021cc 0006c898 00000000 0006c898 c0114c40 c0900000 c0903de8
[  188.896197] 1f60: 00000001 c0903e24 c082d238 410fc075 10c5387d 00000000 00000064 c0901f98
[  188.905375] 1f80: c0107fe8 c0107fd8 60070013 ffffffff 00000051 00000000 c0900000 c013e684
[  188.914553] 1fa0: 000000cf c0941000 c0903dc0 00000000 c082d238 c013e974 c0941048 c0800ca8
[  188.923730] 1fc0: ffffffff ffffffff 00000000 c0800568 00000000 c082d238 c7354c97 00000000
[  188.932907] 1fe0: c0800330 00000051 10c0387d 00000000 47d33000 00000000 00000000 00000000
[  188.942092] [<c04eb2d8>] (skb_panic) from [<c04e4088>] (skb_put+0x44/0x50)
[  188.949812] [<c04e4088>] (skb_put) from [<c0419a84>] (geth_poll+0x10c/0x190)
[  188.957725] [<c0419a84>] (geth_poll) from [<c04fab48>] (net_rx_action+0x118/0x30c)
[  188.966225] [<c04fab48>] (net_rx_action) from [<c0102ad8>] (__do_softirq+0x1f0/0x278)
[  188.975018] [<c0102ad8>] (__do_softirq) from [<c011db08>] (irq_exit+0x78/0xd0)
[  188.983129] [<c011db08>] (irq_exit) from [<c01577c8>] (__handle_domain_irq+0x74/0xa0)
[  188.991921] [<c01577c8>] (__handle_domain_irq) from [<c0361ff4>] (gic_handle_irq+0x44/0x74)
[  189.001294] [<c0361ff4>] (gic_handle_irq) from [<c01021cc>] (__irq_svc+0x6c/0xa8)
[  189.009687] Exception stack(0xc0901f48 to 0xc0901f90)
[  189.015354] 1f40:                   0006c898 00000000 0006c898 c0114c40 c0900000 c0903de8
[  189.024531] 1f60: 00000001 c0903e24 c082d238 410fc075 10c5387d 00000000 00000064 c0901f98
[  189.033705] 1f80: c0107fe8 c0107fd8 60070013 ffffffff
[  189.039377] [<c01021cc>] (__irq_svc) from [<c0107fd8>] (arch_cpu_idle+0x1c/0x38)
[  189.047681] [<c0107fd8>] (arch_cpu_idle) from [<c013e684>] (do_idle+0xd0/0x120)
[  189.055886] [<c013e684>] (do_idle) from [<c013e974>] (cpu_startup_entry+0x18/0x1c)
[  189.064387] [<c013e974>] (cpu_startup_entry) from [<c0800ca8>] (start_kernel+0x38c/0x438)
[  189.073566] Code: e58d3014 e5903050 e59f0008 ebf1ad82 (e7f001f2)
[  189.080405] ---[ end trace bdf74f70f70f33c1 ]---
[  189.085586] Kernel panic - not syncing: Fatal exception in interrupt
[  189.092728] CPU1: stopping
[  189.095767] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D           5.4.61 #1
[  189.103963] Hardware name: Generic DT based system
[  189.109348] [<c010e390>] (unwind_backtrace) from [<c010a850>] (show_stack+0x10/0x14)
[  189.118044] [<c010a850>] (show_stack) from [<c05e432c>] (dump_stack+0x78/0x94)
[  189.126154] [<c05e432c>] (dump_stack) from [<c010c6d0>] (handle_IPI+0xcc/0x168)
[  189.134360] [<c010c6d0>] (handle_IPI) from [<c036201c>] (gic_handle_irq+0x6c/0x74)
[  189.142857] [<c036201c>] (gic_handle_irq) from [<c01021cc>] (__irq_svc+0x6c/0xa8)
[  189.151250] Exception stack(0xc7097f80 to 0xc7097fc8)
[  189.156919] 7f80: 0003eb70 00000000 0003eb70 c0114c40 c7096000 c0903de8 00000002 c0903e24
[  189.166096] 7fa0: 4000406a 410fc075 00000000 00000000 00000000 c7097fd0 c0107fe8 c0107fd8
[  189.175268] 7fc0: 600d0013 ffffffff
[  189.179187] [<c01021cc>] (__irq_svc) from [<c0107fd8>] (arch_cpu_idle+0x1c/0x38)
[  189.187490] [<c0107fd8>] (arch_cpu_idle) from [<c013e684>] (do_idle+0xd0/0x120)
[  189.195694] [<c013e684>] (do_idle) from [<c013e974>] (cpu_startup_entry+0x18/0x1c)
[  189.204188] [<c013e974>] (cpu_startup_entry) from [<40102bec>] (0x40102bec)
[  189.212004] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
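The numbers in the `skb_over_panic` line can be cross-checked against the ping that triggered it. The sketch below is only a reading of the log, not a diagnosis of the driver: it parses that one line and computes how far `skb_put` ran past the end of the receive buffer. The function names are hypothetical. The tailroom figure assumes the buffer was empty before the put, which `len` equal to `put` suggests. The header sizes assume an untagged Ethernet frame carrying IPv4 without options.

```python
import re

# The skb_over_panic line copied from the console output above.
OOPS_LINE = ("skbuff: skb_over_panic: text:0913898d len:9042 put:9042 "
             "head:f1867692 data:14d1e50c tail:0xc5d8fc94 end:0xc5d8e180 dev:eth0")

ETH_HEADER = 14   # untagged Ethernet header, no FCS
IPV4_HEADER = 20  # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo header


def expected_frame_len(icmp_payload):
    """Frame length on the wire (without FCS) for an ICMP echo of this payload size."""
    return ETH_HEADER + IPV4_HEADER + ICMP_HEADER + icmp_payload


def analyze(line):
    """Extract key:value fields from the panic line and derive the overrun."""
    fields = dict(re.findall(r"(\w+):(\S+)", line))
    length = int(fields["len"])
    tail = int(fields["tail"], 16)
    end = int(fields["end"], 16)
    overrun = tail - end  # bytes written past the end of the buffer
    # Assumption: the skb was empty before skb_put (len == put), so the
    # space available for data was len minus the overrun.
    tailroom = length - overrun
    return {"len": length, "overrun": overrun, "tailroom": tailroom}


if __name__ == "__main__":
    print(expected_frame_len(9000))
    print(analyze(OOPS_LINE))
```

Under these assumptions, a 9000-byte ping payload corresponds to a 9042-byte frame, which matches `len:9042` in the log. The tail ends up 6932 bytes past `end`, so the buffer had room for only about 2110 bytes of data. That is consistent with `geth_poll` handing a full jumbo frame to `skb_put` for a buffer sized for standard frames.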