guysoft / OctoPi

Scripts to build OctoPi, a Raspberry PI distro for controlling 3D printers over the web
GNU General Public License v3.0

Raspberry Pi Zero 2 W "freezing" when using Jumbo Frames #795

Open SkyBeam opened 1 year ago

SkyBeam commented 1 year ago

What were you doing?

  1. Use a LAN client and switches with jumbo frames enabled (e.g. MTU 9000)
  2. Try to fetch the OctoPrint web interface
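
For anyone trying to reproduce this without a browser, a jumbo-sized ping should trigger the same condition. The sketch below only works out the ICMP payload size that fills a given MTU over IPv4 (MTU minus a 20-byte IP header without options and an 8-byte ICMP header) and builds a matching command line. The host name `octopi.local` and the helper names are assumptions for illustration, and the flags are those of the Linux iputils `ping`.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host: str, mtu: int, count: int = 3) -> list[str]:
    """Build a Linux iputils ping command that sends full-MTU packets.

    -M do  forbids fragmentation, so the frame really is jumbo-sized
    -s     sets the ICMP payload size
    -c     limits the number of echo requests
    """
    size = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-s", str(size), "-c", str(count), host]


# "octopi.local" is a placeholder host name, not taken from the report.
print(" ".join(ping_command("octopi.local", 9000)))
```

The command is only printed, not executed, so it can be reviewed before it is run against a Pi that may stop responding.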

What did you expect to happen?

Web interface loading.

What happened instead?

OctoPi stops responding on the network interface. It does not even answer ICMP pings any more. After a while it seems to recover somewhat and starts answering pings again, but it keeps "dying" whenever large frames are sent to it again.

Did the same happen when running OctoPrint in safe mode?

The mode does not matter. The issue also happens when trying to log in via SSH while jumbo frames are in use on the LAN. It also happens regardless of whether a USB LAN adapter or the built-in WLAN interface is used.

It almost seems like jumbo frames are crashing the network stack and causing it to be restarted.

Version of OctoPi

I used the latest official release, 0.18.0. I also tried the latest 64-bit nightly (there seems to be no official 64-bit build available yet) and could reproduce the issue there as well (http://unofficialpi.org/Distros/OctoPi/nightly-arm64/2022-09-26_2022-04-04-octopi-bullseye-arm64-lite-1.0.0.zip).

Printer model & used firmware incl. version

Not relevant, happens even when there is no printer connected.

System used: Raspberry Pi Zero 2 W, 4GB, 32GB/64GB microSD cards, with and without Ethernet hat.

On the arm64 build I have been able to get some dmesg traces:

[Mon Sep 26 18:43:22 2022] ------------[ cut here ]------------
[Mon Sep 26 18:43:22 2022] NETDEV WATCHDOG: eth0 (r8152): transmit queue 0 timed out
[Mon Sep 26 18:43:22 2022] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:478 dev_watchdog+0x398/0x3a0
[Mon Sep 26 18:43:22 2022] Modules linked in: nft_counter xt_DSCP xt_tcpudp nft_compat nf_tables nfnetlink cmac algif_hash aes_arm64 algif_skcipher af_alg bnep hci_uart btbcm bluetooth ecdh_generic ecc 8021q garp stp llc snd_soc_hdmi_codec vc4 cec brcmfmac brcmutil drm_kms_helper snd_soc_core snd_compress snd_pcm_dmaengine syscopyarea sysfillrect sysimgblt fb_sys_fops cfg80211 raspberrypi_hwmon rfkill bcm2835_codec(C) bcm2835_isp(C) bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) v4l2_mem2mem videobuf2_dma_contig videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 i2c_bcm2835 snd_bcm2835(C) videobuf2_common snd_pcm snd_timer videodev snd mc vc_sm_cma(C) uio_pdrv_genirq uio fuse drm drm_panel_orientation_quirks backlight ip_tables x_tables ipv6
[Mon Sep 26 18:43:22 2022] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G C 5.15.32-v8+ #1538
[Mon Sep 26 18:43:22 2022] Hardware name: Raspberry Pi Zero 2 W Rev 1.0 (DT)
[Mon Sep 26 18:43:22 2022] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[Mon Sep 26 18:43:22 2022] pc : dev_watchdog+0x398/0x3a0
[Mon Sep 26 18:43:22 2022] lr : dev_watchdog+0x398/0x3a0
[Mon Sep 26 18:43:22 2022] sp : ffffffc008003d40
[Mon Sep 26 18:43:22 2022] x29: ffffffc008003d40 x28: ffffff8003588080 x27: 0000000000000004
[Mon Sep 26 18:43:22 2022] x26: 0000000000000140 x25: 00000000ffffffff x24: 0000000000000000
[Mon Sep 26 18:43:22 2022] x23: ffffffc009316000 x22: ffffff80035583dc x21: ffffff8003558000
[Mon Sep 26 18:43:22 2022] x20: ffffff8003558480 x19: 0000000000000000 x18: 0000000000000000
[Mon Sep 26 18:43:22 2022] x17: ffffffc00ee4d000 x16: ffffffc008004000 x15: ffffffffffffffff
[Mon Sep 26 18:43:22 2022] x14: ffffffc008e8a9c8 x13: 74756f2064656d69 x12: ffffffc0093a6670
[Mon Sep 26 18:43:22 2022] x11: 0000000000000003 x10: ffffffc00938e630 x9 : ffffffc0080ec768
[Mon Sep 26 18:43:22 2022] x8 : 0000000000017fe8 x7 : 0000000000000003 x6 : 0000000000000000
[Mon Sep 26 18:43:22 2022] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000103
[Mon Sep 26 18:43:22 2022] x2 : 0000000000000102 x1 : 788dcdc6a0018a00 x0 : 0000000000000000
[Mon Sep 26 18:43:22 2022] Call trace:
[Mon Sep 26 18:43:22 2022] dev_watchdog+0x398/0x3a0
[Mon Sep 26 18:43:22 2022] call_timer_fn+0x38/0x1d8
[Mon Sep 26 18:43:22 2022] run_timer_softirq+0x274/0x508
[Mon Sep 26 18:43:22 2022] __do_softirq+0x1a8/0x4ec
[Mon Sep 26 18:43:22 2022] irq_exit+0x110/0x150
[Mon Sep 26 18:43:22 2022] handle_domain_irq+0x9c/0xe0
[Mon Sep 26 18:43:22 2022] bcm2836_arm_irqchip_handle_irq+0x68/0x80
[Mon Sep 26 18:43:22 2022] call_on_irq_stack+0x28/0x54
[Mon Sep 26 18:43:22 2022] do_interrupt_handler+0x60/0x70
[Mon Sep 26 18:43:22 2022] el1_interrupt+0x30/0x78
[Mon Sep 26 18:43:22 2022] el1h_64_irq_handler+0x18/0x28
[Mon Sep 26 18:43:22 2022] el1h_64_irq+0x78/0x7c
[Mon Sep 26 18:43:22 2022] arch_cpu_idle+0x18/0x28
[Mon Sep 26 18:43:22 2022] default_idle_call+0x54/0x19c
[Mon Sep 26 18:43:22 2022] do_idle+0x254/0x268
[Mon Sep 26 18:43:22 2022] cpu_startup_entry+0x30/0x80
[Mon Sep 26 18:43:22 2022] rest_init+0xe4/0xf8
[Mon Sep 26 18:43:22 2022] arch_call_rest_init+0x18/0x24
[Mon Sep 26 18:43:22 2022] start_kernel+0x6c0/0x6f8
[Mon Sep 26 18:43:22 2022] __primary_switched+0xa0/0xa8
[Mon Sep 26 18:43:22 2022] ---[ end trace af5ef9af58778b3a ]---
[Mon Sep 26 18:43:22 2022] r8152 1-1.4:1.0 eth0: Tx timeout
[Mon Sep 26 18:43:22 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:22 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:22 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:22 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:24 2022] usb 1-1.4: reset high-speed USB device number 3 using dwc_otg
[Mon Sep 26 18:43:37 2022] r8152 1-1.4:1.0 eth0: Tx timeout
[Mon Sep 26 18:43:37 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:37 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:37 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:40 2022] usb 1-1.4: reset high-speed USB device number 3 using dwc_otg
[Mon Sep 26 18:43:59 2022] r8152 1-1.4:1.0 eth0: Tx timeout
[Mon Sep 26 18:43:59 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:59 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:59 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:43:59 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:44:02 2022] usb 1-1.4: reset high-speed USB device number 3 using dwc_otg
[Mon Sep 26 18:44:20 2022] r8152 1-1.4:1.0 eth0: Tx timeout
[Mon Sep 26 18:44:20 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:44:20 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:44:20 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:44:20 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:44:23 2022] usb 1-1.4: reset high-speed USB device number 3 using dwc_otg
[Mon Sep 26 18:44:36 2022] r8152 1-1.4:1.0 eth0: Tx timeout
[Mon Sep 26 18:44:36 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:44:36 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:44:36 2022] r8152 1-1.4:1.0 eth0: Tx status -2
[Mon Sep 26 18:44:39 2022] usb 1-1.4: reset high-speed USB device number 3 using dwc_otg
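
The trace shows a repeating pattern: one NETDEV WATCHDOG warning, then five "Tx timeout" events, each followed a few seconds later by a USB device reset. To compare this against logs from other setups, the events can be counted with a small script. This is only a rough sketch: the regular expressions are modelled on the message texts in this trace, and `summarize` and the sample string are names made up for illustration.

```python
import re
from collections import Counter

# Patterns modelled on the messages seen in the trace; other drivers or
# kernel versions may word these messages differently.
PATTERNS = {
    "watchdog": re.compile(r"NETDEV WATCHDOG: \S+ \(\S+\): transmit queue \d+ timed out"),
    "tx_timeout": re.compile(r"eth\d+: Tx timeout"),
    "tx_status": re.compile(r"eth\d+: Tx status -?\d+"),
    "usb_reset": re.compile(r"reset high-speed USB device number \d+"),
}


def summarize(log_text: str) -> Counter:
    """Count how often each known message occurs in a dmesg dump.

    Works on the whole text at once, so it does not matter whether the
    log has one message per line or was pasted as a single long line.
    """
    counts = Counter()
    for name, pattern in PATTERNS.items():
        counts[name] = len(pattern.findall(log_text))
    return counts


# Short excerpt from the trace, used here as sample input.
sample = (
    "[Mon Sep 26 18:43:22 2022] NETDEV WATCHDOG: eth0 (r8152): transmit queue 0 timed out\n"
    "[Mon Sep 26 18:43:22 2022] r8152 1-1.4:1.0 eth0: Tx timeout\n"
    "[Mon Sep 26 18:43:22 2022] r8152 1-1.4:1.0 eth0: Tx status -2\n"
    "[Mon Sep 26 18:43:24 2022] usb 1-1.4: reset high-speed USB device number 3 using dwc_otg\n"
)

for name, count in summarize(sample).items():
    print(f"{name}: {count}")
```

To run it on a real log, replace `sample` with the saved output of `dmesg -T`.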

SkyBeam commented 1 year ago

Update: I dug a little further and found this bug report. I also had "option interface-mtu 9000" set in my DHCP configuration. Oddly, the Raspberry Pi did not accept it (all interfaces were still set to MTU 1500), but somehow this causes the interface to stall when jumbo frames are received.
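
To double-check which MTU each interface actually ended up with, the values can be read from sysfs, where Linux exposes them as `/sys/class/net/<interface>/mtu`. A minimal sketch follows; the function name `read_mtus` is made up for illustration, and the directory is a parameter only so the code can be tried against a test directory.

```python
from pathlib import Path


def read_mtus(sysfs_net: str = "/sys/class/net") -> dict[str, int]:
    """Return {interface name: MTU} as reported by the kernel via sysfs.

    Returns an empty dict if the directory does not exist (e.g. on a
    non-Linux system).
    """
    root = Path(sysfs_net)
    if not root.is_dir():
        return {}
    mtus = {}
    for iface in sorted(root.iterdir()):
        mtu_file = iface / "mtu"
        # Skip entries that are not network interfaces.
        if mtu_file.is_file():
            mtus[iface.name] = int(mtu_file.read_text().strip())
    return mtus


if __name__ == "__main__":
    for name, mtu in read_mtus().items():
        print(f"{name}: MTU {mtu}")
```

If every interface reports 1500 here while the DHCP server offers 9000, that matches the situation described above.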

Even stranger, the bug was closed and a kernel patch seems to have been merged, yet I still get this issue on the latest builds with a much newer kernel. Perhaps the bug was reintroduced at some point.