raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/
Other
11.15k stars 5k forks source link

Jumbo frame support on Pi4 ethernet (Genet) #5561

Open 6by9 opened 1 year ago

6by9 commented 1 year ago

Describe the bug

From #5419 (closed PR) where someone had proposed a total hack to increase the max packet length.

Queried on net-dev https://www.spinics.net/lists/netdev/msg919558.html https://www.spinics.net/lists/netdev/msg925029.html So Florian is going to have a little look into it.

Opening new issue to track anything that comes back from it.

Steps to reproduce the behaviour

Current max packet size only allows standard ethernet 1500 byte MTU (+ VLAN headers).

Device (s)

Raspberry Pi 4 Mod. B, Raspberry Pi 400, Raspberry Pi CM4, Raspberry Pi CM4 Lite

System

All kernels to current (6.5)

Logs

No response

Additional context

No response

wtschueller commented 1 year ago

I reviewed the old records I compiled during migration from 5.4 to 5.10. Regarding the packet corruption, my memorization was not quite correct. Corruption occurred for udp packets larger than 4K, not 2K as probably said.

My first test is against kernel 5.10.17 and on the same hardware. I applied your first patch (larger maxmtu, default threshold). I compiled bcmgenet as a module and insmod the patched module at runtime.

The following code sends packets slightly above 4K to show the corruption problem.

# only temporarily increase mtu, avoids losing headless connection
sudo ip link set dev eth0 mtu 8000
sudo ethtool -K eth0 tx on
#
# wireshark receives this udp packet with 2 nonzero bytes at 0x1028
dd if=/dev/zero bs=4100 count=1 |netcat -u 169.254.1.1 65000 -q 1
#
# icmp is not affected
ping 169.254.1.1 -c 1 -M do -v -s 4100 -p 0
#
# now repeat without crc offload
sudo ethtool -K eth0 tx off
#
# wireshark receives this udp packet with all zeros as expected
dd if=/dev/zero bs=4100 count=1 |netcat -u 169.254.1.1 65000 -q 1
#
# icmp is not affected
ping 169.254.1.1 -c 1 -M do -v -s 4100 -p 0
#
# back to surely stable state
sudo ip link set dev eth0 mtu 1500

My stress test goes like this:

for ((MTU=8192; MTU>2000; ((MTU=MTU-4)) )) do sudo ifconfig eth0 mtu ${MTU} ; ifconfig eth0 ; dd if=/dev/zero bs=65536 count=10 |netcat -u 169.254.1.1 65000 -q 1; done

eth0 freezes as soon as the last packet of the fragments reach approx. 2K and I see the well known time-out with dmesg. I think I can trigger it even with equally sized packets, I just takes longer. So this one would be a blocker even for jumbo frame support up to 3840 bytes.

Next steps:

wtschueller commented 1 year ago

The freeze occurs also on my CM4. Steps to reproduce:

 sudo ip link set dev eth0 mtu 2036
 while (true) do dd if=/dev/zero bs=65536 count=1000 |netcat -u 169.254.1.1 65000 -q 0; done

Log message:

[Aug 4 15:03] ------------[ cut here ]------------
[  +0.000026] NETDEV WATCHDOG: eth0 (bcmgenet): transmit queue 0 timed out
[  +0.000067] WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0x38c/0x394
[  +0.000004] Modules linked in: algif_hash aes_neon_bs aes_neon_blk crypto_simd cryptd algif_skcipher af_alg bnep hci_uart btbcm bluetooth ecdh_generic ecc 8021q garp stp llc brcmfmac brcmutil sha256_generic cfg80211 rfkill raspberrypi_hwmon bcm2835_unicam v4l2_dv_timings vc4 v3d rpivid_hevc(C) cec gpu_sched bcm2835_codec(C) bcm2835_v4l2(C) bcm2835_isp(C) v4l2_mem2mem bcm2835_mmal_vchiq(C) videobuf2_vmalloc videobuf2_dma_contig drm_kms_helper videobuf2_memops videobuf2_v4l2 dummy_sensor videobuf2_common v4l2_fwnode i2c_mux_pinctrl snd_soc_core i2c_brcmstb i2c_mux snd_compress snd_pcm_dmaengine snd_bcm2835(C) snd_pcm snd_timer vc_sm_cma(C) i2c_bcm2835 snd videodev syscopyarea mc sysfillrect genet sysimgblt fb_sys_fops nvmem_rmem uio_pdrv_genirq uio i2c_dev fuse drm drm_panel_orientation_quirks backlight ip_tables x_tables ipv6
[  +0.000180] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G         C        5.10.17-v8+jf+ #26
[  +0.000004] Hardware name: Raspberry Pi Compute Module 4 Rev 1.1 (DT)
[  +0.000004] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[  +0.000005] pc : dev_watchdog+0x38c/0x394
[  +0.000004] lr : dev_watchdog+0x38c/0x394
[  +0.000002] sp : ffffffc011583ce0
[  +0.000004] x29: ffffffc011583ce0 x28: ffffff8103ef2580
[  +0.000007] x27: 0000000000000004 x26: ffffff8106400480
[  +0.000006] x25: 0000000000000140 x24: 00000000ffffffff
[  +0.000006] x23: ffffff81064003dc x22: 0000000000000003
[  +0.000013] x21: ffffffc011246000 x20: ffffff8106400000
[  +0.000011] x19: 0000000000000000 x18: 00000000fffffffe
[  +0.000010] x17: 00000110372503c0 x16: 000000000000001f
[  +0.000006] x15: 0000000000000020 x14: ffffffffffffffff
[  +0.000006] x13: ffffffc01142ac10 x12: ffffffc01142a863
[  +0.000006] x11: ffffff8100459240 x10: ffffffc0112bd378
[  +0.000006] x9 : ffffffc0100ed8c0 x8 : ffffffc011265378
[  +0.000006] x7 : ffffffc0112bd378 x6 : 0000000000000003
[  +0.000008] x5 : 0000000000000000 x4 : 0000000000000000
[  +0.000009] x3 : ffffffc01124a628 x2 : 0000000000000000
[  +0.000007] x1 : 0000000000000000 x0 : ffffff8100250000
[  +0.000006] Call trace:
[  +0.000005]  dev_watchdog+0x38c/0x394
[  +0.000006]  call_timer_fn+0x38/0x200
[  +0.000006]  __run_timers.part.0+0x230/0x300
[  +0.000005]  run_timer_softirq+0x54/0x90
[  +0.000007]  __do_softirq+0x1a8/0x544
[  +0.000006]  irq_exit+0xf0/0x104
[  +0.000007]  __handle_domain_irq+0xc0/0x13c
[  +0.000004]  gic_handle_irq+0x5c/0xf0
[  +0.000003]  el1_irq+0xc4/0x180
[  +0.000007]  arch_cpu_idle+0x18/0x30
[  +0.000005]  default_idle_call+0x40/0x1d4
[  +0.000011]  do_idle+0x260/0x2a4
[  +0.000004]  cpu_startup_entry+0x34/0x7c
[  +0.000005]  secondary_start_kernel+0x138/0x184
[  +0.000004] ---[ end trace f6b66bf422679008 ]---

Info

System Information
------------------

Raspberry Pi Compute Module 4 Rev 1.1
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"

Raspberry Pi reference 2023-02-21
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 25e2319effa91eb95edd9d9209eb9f8a584d67be, stage2

Linux raspberrypi 5.10.17-v8+jf+ #26 SMP PREEMPT Thu Aug 3 13:02:25 CEST 2023 aarch64 GNU/Linux
Revision: d03141
Serial: 10000000f5502b46
Model: Raspberry Pi Compute Module 4 Rev 1.1
Throttled flag  : throttled=0x0
Camera          : supported=0 detected=0, libcamera interfaces=0

Videocore information
---------------------

Jan  5 2023 10:46:54
Copyright (c) 2012 Broadcom
version 8ba17717fbcedd4c3b6d4bce7e50c7af4155cba9 (clean) (release) (start)

alloc failures:     0
compactions:        0
legacy block fails: 0

Could increasing TBUF_TXCHK_PKTRDY_THLD (name taken from https://github.com/raspberrypi/linux/pull/5419#issuecomment-1632736483) be of any help? What is its address?

wtschueller commented 1 year ago

I can confirm, that incoming jumbo packets are not fragmented into multiple buffers. This observation does not agree with my old records. I believe that I made mistakes interpreting the log messages in the past.

Using a similar patch like https://github.com/raspberrypi/linux/pull/5419#issuecomment-1633165459 and code like this

for ((SIZE=3700; SIZE<3900; ((SIZE=SIZE+1)) )) do ping 169.254.1.1 -c 1 -v -s ${SIZE}; done

I see the following log:

...
[  +0,005467] bcmgenet_desc_rx:p_ind=3299 c_ind=3298 read_ptr=226 len_stat=0x0f3c7f80 len= 3900 reg_len_stat=0x0f3c7f80 reg_len= 3900
[  +0,005369] bcmgenet_desc_rx:p_ind=3300 c_ind=3299 read_ptr=227 len_stat=0x0f3d7f80 len= 3901 reg_len_stat=0x0f3d7f80 reg_len= 3901
[  +0,004767] bcmgenet_desc_rx:p_ind=3301 c_ind=3300 read_ptr=228 len_stat=0x0f3e7f80 len= 3902 reg_len_stat=0x0f3e7f80 reg_len= 3902
[  +0,004952] bcmgenet_desc_rx:p_ind=3302 c_ind=3301 read_ptr=229 len_stat=0x0f3f7f80 len= 3903 reg_len_stat=0x0f3f7f80 reg_len= 3903
[  +0,006352] bcmgenet_desc_rx:p_ind=3303 c_ind=3302 read_ptr=230 len_stat=0x0f407f80 len= 3904 reg_len_stat=0x0f407f80 reg_len= 3904
[  +0,005886] bcmgenet_desc_rx:p_ind=3304 c_ind=3303 read_ptr=231 len_stat=0x0f403f80 len= 3904 reg_len_stat=0x0f817f80 reg_len= 3969 # I expect 3905 instead
[  +1,686197] bcmgenet_desc_rx:p_ind=3305 c_ind=3304 read_ptr=232 len_stat=0x0f403f80 len= 3904 reg_len_stat=0x0f827f80 reg_len= 3970 # I expect 3906 instead
...

Packet length from register is 64 bytes larger then I would expect in case of length exceeding threshold.

wtschueller commented 1 year ago

Well, I guess TBUF_PKT_RDY_THLD is at 0x10. This test triggers the time out/freeze at mtu = 2036:

for ((MTU=1500; MTU<=3840; ((MTU=MTU+4)))) do sudo ifconfig eth0 mtu ${MTU} ; ip -d link list eth0 ; dd if=/dev/zero bs=65536 count=1024 |netcat -u 169.254.1.1 65000 -q 0; done

The following patch pushes that limit by 0x70*16 = 1792 to 3828. So I can safely use mtu = 3824 now.

Date: Mon, 7 Aug 2023 15:15:18 +0200
Subject: net: bcm: genet: Set the TBUF_PKT_RDY_THLD register

Set the TBUF_PKT_RDY_THLD register to the max value that it can
take to allow for larger frames. Allows for mtu up to 3824.

Signed-off-by: wtschueller <wtschueller@users.noreply.github.com>

diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 9749d6678..4a6038b7c 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -2502,6 +2502,7 @@ static void init_umac(struct bcmgenet_priv *priv)
     */
    reg = 0xf0;
    bcmgenet_rbuf_writel(priv, reg, RBUF_PKT_RDY_THLD);
+   bcmgenet_writel(reg, priv->base + priv->hw_params->tbuf_offset + TBUF_PKT_RDY_THLD);

    /* enable rx checksumming */
    reg = bcmgenet_rbuf_readl(priv, RBUF_CHK_CTRL);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.h b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
index 52068d8bf..d470364ca 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.h
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
@@ -283,6 +283,8 @@ struct bcmgenet_mib_counters {
 #define  TBUF_EEE_EN           (1 << 0)
 #define  TBUF_PM_EN            (1 << 1)

+#define TBUF_PKT_RDY_THLD      0x10
+
 #define TBUF_CTRL_V1           0x80
 #define TBUF_BP_MC_V1          0xA0
wtschueller commented 1 year ago

Same behaviour for kernel 5.15.61, RPi4 and CM4.

Simi4 commented 1 year ago

Any news?

cdalvaro commented 11 months ago

This feature would be great

6by9 commented 11 months ago

I've just pinged the thread on net-dev to see if Florian ever found any docs. I'm afraid it's a fairly low priority all round.