flatcar / Flatcar

Flatcar project repository for issue tracking, project documentation, etc.
https://www.flatcar.org/
Apache License 2.0

Flatcar Container Linux fails and reboots: "kernel BUG at net/core/skbuff.c" #378

Closed seh closed 1 year ago

seh commented 3 years ago

Description

On AWS EC2 instances using Flatcar Container Linux versions 2765.1.0 and 2801.1.0 from the Beta channel, with a kOps-provisioned Kubernetes installation on top, we encounter a kernel bug that causes the machines to stop and reboot immediately.

The log entries in journalctl appear as follows:

Apr 06 17:59:41 ip-10-2-1-63.eu-west-1.compute.internal kernel: ------------[ cut here ]------------
Apr 06 17:59:41 ip-10-2-1-63.eu-west-1.compute.internal kernel: kernel BUG at net/core/skbuff.c:4008!
-- Boot 8314fb086d5b4ed0a9e80895ab0c4f0b --
Apr 06 17:59:59 localhost kernel: Linux version 5.10.25-flatcar (build@pony-truck.infra.kinvolk.io) (x86_64-cros-linux-gnu-gcc (Gentoo Hardened 9.3.0-r1 p3) 9.3.0, GNU ld (Gentoo 2.35 p1) 2.35.0) #1 SMP Wed Mar 24 14:51:21 ->

Sometimes the line number in file net/core/skbuff.c is 3,996 instead of 4,008. Usually we see 3,996 cited first; after the machine reboots, we see 4,008 from then on, suggesting that the reboot swapped some updated files into place.

Note that we have locksmithd disabled, but update-engine is enabled, so we're downloading updates but not putting them into use eagerly.

Impact

Our fleet of Kubernetes cluster machines reboot periodically, causing the containers running on them to exit without warning and be replaced (in most cases) by the kubelet after a short delay.

Environment and steps to reproduce

  1. Set-up:
    • AWS EC2 in the "eu-west-1" region, though we've seen a few of these failures in the "us-east-2" region as well.
    • Instance types we've seen fail:
      • m5.xlarge
      • m5.2xlarge
      • m5.4xlarge
      • m5a.2xlarge
      • c5.xlarge
    • Cluster provisioned by kOps version 1.19.1
    • Kubernetes versions 1.19.8 and 1.19.9
    • Cluster CNI: Calico version 3.17.3 and 3.18.1
  2. Task:
    • Kubernetes is running either control plane or worker node responsibilities.
    • We have not seen this failure occur on bastion machines (instance type t3.micro) that don't run any Kubernetes components.
  3. Action(s):
    a. Launch an EC2 instance using Flatcar Container Linux, perhaps via a supervising ASG.
    b. Allow various Kubernetes components to start (e.g. kubelet, CNI daemons).
    c. Periodically check the machine's last boot time.
    d. Inspect system logs with a command like journalctl --grep=skbuff.
  4. Error:
    The machine will hum along normally, downloading updates occasionally, and running containers for Kubernetes workload. With no warning, the machine will reboot. Subsequent inspection of the log via journalctl shows a message like this:
    kernel: kernel BUG at net/core/skbuff.c:3996!

    One variation:

    kernel: kernel BUG at net/core/skbuff.c:4008!

    After the machine boots, the /sys/fs/pstore directory mentioned here exists, but is empty. The "pstore" mount entry is as follows:

    pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)

    Perhaps our hardware does not support pstore, per the following uname -a output:

    Linux ip-10-2-1-63.eu-west-1.compute.internal 5.10.25-flatcar #1 SMP Wed Mar 24 14:51:21 -00 2021 x86_64 Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz GenuineIntel GNU/Linux
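
The log inspection in step 3d can be condensed into a tally of which BUG sites (3996 versus 4008) a machine has hit. The following is a minimal sketch, not something taken from our tooling: the function name summarize_bug_sites is hypothetical, and it assumes the journal lines look like the ones quoted above.

```shell
#!/bin/sh
# Summarize which "kernel BUG at <file>:<line>" sites appear in journal text
# read from stdin. Prints "<count> <file>:<line>" for each distinct site.
# (summarize_bug_sites is a hypothetical helper name.)
summarize_bug_sites() {
    grep -o 'kernel BUG at [^ ]*:[0-9]*' \
        | sed 's/^kernel BUG at //' \
        | sort \
        | uniq -c \
        | awk '{ print $1, $2 }'
}

# Example invocation on an affected machine (searches all boots in the journal):
#   journalctl --no-pager --grep=skbuff | summarize_bug_sites
```

Because the function only reads stdin, it can also be fed journal output collected earlier from several machines.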

Expected behavior

The machine should continue running normally without encountering errors that cause it to reboot without warning.

Additional information

We run similar Kubernetes clusters in several other AWS regions:

We have not seen this failure occur in those regions. We see it predominantly in "eu-west-1" and occasionally in "us-east-2." That could be due to more intense workload in the clusters in the former region.

t-lo commented 3 years ago

Thank you for reporting, @seh; we'll have a look. Do you think you could look into a reliable repro case for this issue?

seh commented 3 years ago

That is going to be very difficult, as so far it amounts to, "Run this Kubernetes cluster with this workload."

We have confirmed that downgrading to the Flatcar Container Linux beta version 2705.1.2 alleviates the problem. Again, versions 2765.1.0 and 2801.1.0 both suffer this same kernel bug.

Looking at the workload that runs on all the machines on which we've seen this occur, we identified only three in common:

We disabled Vector and proved that that was not the culprit. It wasn't feasible to disable "calico-node" and still have a functional Kubernetes cluster. (Swapping a CNI implementation in a production-grade cluster is a delicate operation.) We did not get as far as disabling Prometheus node exporter, though we're running it on every machine in several other Kubernetes clusters—that just happen to be less busy—so it's not likely it's at fault.

seh commented 3 years ago

I neglected to mention earlier that in our clusters where Flatcar Container Linux's locksmithd service is enabled, we don't see this bug arise. In our clusters where update-engine is enabled but locksmithd is disabled, the bug occurs on 10-15 out of 200 machines every day.

t-lo commented 3 years ago

Interesting, thank you for sharing. While we're still looking for a solid repro the information you've provided will help with narrowing down the issue.

seh commented 3 years ago

I mentioned that we don't see this kernel bug arising on our machines where both update-engine and locksmithd are enabled. However, I did notice something odd in the system logs on those machines.

I've been polling our machines regularly via SSH, running a command like journalctl --grep=skbuff and collecting the output, in order to see how often and on which machines the kernel bug has been occurring. On some of the machines with locksmithd enabled, I see output from that command like this:

journalctl --grep output ``` -- Journal begins at Sat 2021-02-13 23:16:07 UTC, ends at Mon 2021-04-12 21:28:47 UTC. -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot 
bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- Boot bfbdc7106d5c45699668e7d2cd4ea8c5 -- -- Boot 1725d5160db347908f79b244ed63da5e -- -- No entries -- ```

Notice how it keeps flapping between two different IDs. What does that indicate?
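
To confirm that the output really contains only two distinct boot IDs, and how often each appears, something like the following might help. This is a sketch under the assumption that the separators keep the "-- Boot <32 hex digits> --" shape shown above; count_boot_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Count how many times each boot ID appears in journalctl output read from
# stdin. Prints "<count> <boot id>", sorted by boot ID.
# (count_boot_ids is a hypothetical helper name.)
count_boot_ids() {
    grep -o 'Boot [0-9a-f]\{32\}' \
        | awk '{ seen[$2]++ } END { for (id in seen) print seen[id], id }' \
        | sort -k2
}

# Example invocations:
#   journalctl --no-pager --grep=skbuff | count_boot_ids
#   journalctl --list-boots    # the boots the journal itself knows about
```

Comparing the result against journalctl --list-boots would show whether the two IDs correspond to boots the journal knows about.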

margamanterola commented 3 years ago

Are these machines in-place-upgraded from CoreOS or were they freshly installed with Flatcar?

seh commented 3 years ago

These are fresh "installations" on EC2 instances by way of the published AMIs.

igcherkaev commented 3 years ago

Hello. We've noticed the same behavior in our environment on VMware-provisioned VMs.

A few times a day, the box that happens to be the busiest in terms of network load crashes with the following:

May 19 18:30:45 rnqkbm401 kernel: kernel BUG at net/core/skbuff.c:4008!
-- Boot 757516b861de4db8a139aa895db71803 --
May 19 18:31:02 localhost kernel: Linux version 5.10.37-flatcar (build@pony-truck.infra.kinvolk.io) (x86_64-cros-linux-gnu-gcc (Gentoo Hardened 9.3.0-r1 p3) 9.3.0, GNU ld (Gentoo 2.35 p1) 2.35.0) #1 SMP Mon May 17 22:08:55 -00 2021

Happens on 5.10.32-flatcar as well. pstore is empty too.

igcherkaev commented 3 years ago

Less loaded VMs (in terms of network I/O) don't crash, even with the same kernel. We also run Kubernetes on them with Calico as our CNI (with eBPF mode enabled).

igcherkaev commented 3 years ago

We have another Kubernetes cluster with the same setup, but it's still running an older kernel (e.g., Flatcar Container Linux by Kinvolk 2605.10.0 (Oklo) 5.4.83-flatcar), and there are no crashes at all.

igcherkaev commented 3 years ago

Just a quick update: after disabling GSO and GRO on the box, it hasn't crashed in 4 days. We're still monitoring the box, but that's already a great sign; it used to crash every day or every other day.

igcherkaev commented 3 years ago

Almost 7 days now without crashing since GSO/GRO got disabled.

seh commented 3 years ago

How did you disable those, Igor?

igcherkaev commented 3 years ago

ethtool -K <iface name> gso off
ethtool -K <iface name> gro off

where iface name is your NIC card, e.g. eth0.

We have a systemd unit now to disable it on boot.
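
To verify that the settings took effect, the state can be read back with ethtool -k (lowercase), which lists the interface's offload features. The filter below is a small sketch; offload_state is a hypothetical helper name, and eth0 is just the example interface.

```shell
#!/bin/sh
# Print only the GSO/GRO lines from `ethtool -k <iface>` output read from stdin.
# (offload_state is a hypothetical helper name.)
offload_state() {
    grep -E '^[[:space:]]*generic-(segmentation|receive)-offload:' \
        | sed 's/^[[:space:]]*//'
}

# Example invocation:
#   ethtool -k eth0 | offload_state
```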

jepio commented 2 years ago

Is this still occurring with the most recent releases? Could you also test Alpha, which has kernel 5.15 and might not trigger this any longer?

seh commented 2 years ago

We've seen it most recently two weeks ago with kernel version 5.10.84, which we received by way of an upgrade when rebooting one of our machines that started life with Flatcar version 2705.1.2.

It may be another couple of weeks before I can offer any testing outcome. What changed recently that you think may alleviate this problem?

jepio commented 2 years ago

Nothing, @seh; I was just hoping it might have resolved itself.

Would you be able to capture the full splat on the serial console, including the stacktrace?

seh commented 2 years ago

Next time I see it, I will grab all I can from journalctl. Is there another source you’re recommending that I collect as well?

pothos commented 2 years ago

In case your system has a pstore backend, you may find dmesg traces in /var/lib/systemd/pstore/ on the next boot. The files get moved there for persistent storage instead of staying in /sys/fs/pstore. I'll update the docs (Edit: done here https://github.com/flatcar-linux/flatcar-docs/pull/206).

seh commented 2 years ago

Note that I mentioned in my initial description that our pstore directory winds up empty after these reboots, perhaps for lack of hardware support.

pothos commented 2 years ago

Maybe, but it could be that systemd-pstore.service ran and moved them to /var/lib/systemd/pstore; that's what I wanted to hint at. Edit: you can check whether you have pstore support by looking at whether /sys/module/pstore/parameters/backend contains something other than (null).
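
As a rough sketch, that check could be scripted like this. The function name pstore_backend and its optional path argument are hypothetical; the two paths are the ones mentioned above.

```shell
#!/bin/sh
# Report whether a pstore backend is registered.
# $1: path to the backend parameter (defaults to the sysfs path mentioned
#     above; the override exists only so the check can be tried on a copy).
# (pstore_backend is a hypothetical helper name.)
pstore_backend() {
    param="${1:-/sys/module/pstore/parameters/backend}"
    if [ ! -r "$param" ]; then
        echo "pstore parameter not readable: $param"
        return 2
    fi
    backend=$(cat "$param")
    if [ -z "$backend" ] || [ "$backend" = "(null)" ]; then
        echo "no pstore backend"
        return 1
    fi
    echo "pstore backend: $backend"
}

# Example invocation: list saved dumps only if a backend is present.
#   pstore_backend && ls -l /var/lib/systemd/pstore/
```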

jmcgrath207 commented 2 years ago

@igcherkaev

ethtool -K <iface name> gso off
ethtool -K <iface name> gro off

where iface name is your NIC card, e.g. eth0.

We have a systemd unit now to disable it on boot.

Is this still working for you?

seh commented 2 years ago

This is still happening to us with Flatcar Container Linux version 3227.1.1.

Mitsuwa commented 2 years ago

This still seems to be happening in 3033.3.5.

Worse, I do not seem to be able to apply the workaround:

$ sudo ethtool -K eth0 gso off
Cannot get device udp-fragmentation-offload settings: Operation not supported
Cannot get device udp-fragmentation-offload settings: Operation not supported

seh commented 2 years ago

We suffered through this bug through the night and this morning, and have found that the workaround suggested by @igcherkaev in https://github.com/flatcar/Flatcar/issues/378#issuecomment-847189739 works acceptably, so long as we're using Flatcar Container Linux with a kernel at version 5.15 or so. Neither measure was enough on its own: a kernel that new still failed without disabling generic receive and segmentation offload, and disabling that offload still failed without a new enough kernel. In particular, kernel version 5.10.137 as offered by the LTS 3033.3.5 release wasn't new enough.

Here are the two systemd units I wrote to ensure that we toggle the offload off.

disable-generic-receive-offload.service

```
[Unit]
Description=Disable generic receive offload on primary Ethernet interface
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=ethtool --offload eth0 generic-receive-offload off
ExecStop=ethtool --offload eth0 generic-receive-offload on
```

disable-generic-segmentation-offload.service

```
[Unit]
Description=Disable generic segmentation offload on primary Ethernet interface
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=ethtool --offload eth0 generic-segmentation-offload off
ExecStop=ethtool --offload eth0 generic-segmentation-offload on
```

seh commented 2 years ago

Using the stable Flatcar Container Linux version 3227.2.2 atop kernel version 5.15.63, we see this kernel bug occur in file net/core/skbuff.c on line 4219, where the check on list_skb->head_frag fails because that bitfield is false.

If we disable GRO and GSO (we're not yet sure if it's crucial to disable both of these), we skirt this kernel bug, but the network performance suffers so drastically that we can't afford to run our workload like that.

pothos commented 2 years ago

Can we interact with upstream maintainers despite having no clear trace, and has anyone started that discussion? The source code link from the BUG at net/core/skbuff.c:123! messages and the workarounds may give some hints already.

seh commented 2 years ago

We noticed that when running with GRO and GSO enabled again, but with the MTU on the eth0 interface ratcheted down from the default 9,001 to 1,500, we see a different problem arise. (This time we used Flatcar Container Linux beta version 3346.1.0 and kernel version 5.15.70 atop the "m5.4xlarge" EC2 instance type.) Instead of the kernel reporting through the BUG_ON macro and rebooting, it reports a hardware checksum failure and keeps going, albeit with degraded network performance afterward.
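
For anyone wanting to try the same MTU change, a minimal sketch follows. It assumes iproute2's ip command and the usual sysfs layout; the helper name current_mtu and its optional sysfs-root argument are hypothetical, and the change made by ip link does not persist across reboots.

```shell
#!/bin/sh
# Read an interface's MTU from sysfs.
# $1: interface name
# $2: sysfs network root (defaults to /sys/class/net; overridable for testing)
# (current_mtu is a hypothetical helper name.)
current_mtu() {
    cat "${2:-/sys/class/net}/$1/mtu"
}

# Lower the MTU on the running system only, then read it back:
#   sudo ip link set dev eth0 mtu 1500
#   current_mtu eth0
```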

Please see this log fragment for an example.

dmesg output ``` [ 6654.575206] calia524c310aed: Caught tx_queue_len zero misconfig [ 6802.827628] : hw csum failure [ 6802.828397] skb len=322 headroom=148 headlen=322 tailroom=3306 mac=(114,14) net=(128,20) trans=148 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0x61202bc9 ip_summed=2 complete_sw=0 valid=0 level=0) hash(0x9c389c38 sw=0 l4=1) proto=0x0800 pkttype=0 iif=3 [ 6802.834186] skb headroom: 00000000: bb cc 52 15 6a f0 46 8d ef db ac 01 22 40 4a bb [ 6802.835623] skb headroom: 00000010: ed 41 14 b3 3d ef 2a 9c 41 4e 75 bb fa 30 b6 59 [ 6802.837068] skb headroom: 00000020: 7f 25 0b 98 22 71 76 ce d1 02 8d 94 ab 4b 9f 02 [ 6802.838501] skb headroom: 00000030: 4f 1f 51 6a b6 82 39 23 f7 09 a1 3d d0 07 00 00 [ 6802.839921] skb headroom: 00000040: 06 c0 b0 ea 75 8e 06 c3 48 fa de 48 08 00 45 00 [ 6802.841344] skb headroom: 00000050: 01 88 6e 89 00 00 40 11 21 62 0a 03 6d 62 0a 03 [ 6802.842748] skb headroom: 00000060: 68 12 aa 69 21 18 01 74 b4 2d 08 00 00 00 00 00 [ 6802.844160] skb headroom: 00000070: 01 00 c2 22 d9 0c d9 b7 ee ee ee ee ee ee 08 00 [ 6802.845828] skb headroom: 00000080: 45 00 01 56 35 84 40 00 3e 06 1d da 64 7a c1 cc [ 6802.847521] skb headroom: 00000090: 64 69 5d 94 [ 6802.848539] skb linear: 00000000: a2 0c 1f 90 77 b0 94 1e 52 91 97 33 80 18 00 46 [ 6802.850223] skb linear: 00000010: e3 b1 00 00 01 01 08 0a 44 fb 3d b6 df e9 e5 7d [ 6802.851920] skb linear: 00000020: 47 45 54 20 2f 6d 65 74 72 69 63 73 20 48 54 54 [ 6802.853618] skb linear: 00000030: 50 2f 31 2e 31 0d 0a 48 6f 73 74 3a 20 31 30 30 [ 6802.855307] skb linear: 00000040: 2e 31 30 35 2e 39 33 2e 31 34 38 3a 38 30 38 30 [ 6802.857001] skb linear: 00000050: 0d 0a 55 73 65 72 2d 41 67 65 6e 74 3a 20 50 72 [ 6802.858694] skb linear: 00000060: 6f 6d 65 74 68 65 75 73 2f 32 2e 33 39 2e 30 0d [ 6802.860381] skb linear: 00000070: 0a 41 63 63 65 70 74 3a 20 61 70 70 6c 69 63 61 [ 6802.862067] skb linear: 00000080: 74 69 6f 6e 2f 6f 70 65 6e 6d 65 74 72 69 63 73 [ 
6802.863766] skb linear: 00000090: 2d 74 65 78 74 3b 76 65 72 73 69 6f 6e 3d 31 2e [ 6802.865456] skb linear: 000000a0: 30 2e 30 2c 61 70 70 6c 69 63 61 74 69 6f 6e 2f [ 6802.867154] skb linear: 000000b0: 6f 70 65 6e 6d 65 74 72 69 63 73 2d 74 65 78 74 [ 6802.868845] skb linear: 000000c0: 3b 76 65 72 73 69 6f 6e 3d 30 2e 30 2e 31 3b 71 [ 6802.870536] skb linear: 000000d0: 3d 30 2e 37 35 2c 74 65 78 74 2f 70 6c 61 69 6e [ 6802.872220] skb linear: 000000e0: 3b 76 65 72 73 69 6f 6e 3d 30 2e 30 2e 34 3b 71 [ 6802.873913] skb linear: 000000f0: 3d 30 2e 35 2c 2a 2f 2a 3b 71 3d 30 2e 31 0d 0a [ 6802.875601] skb linear: 00000100: 41 63 63 65 70 74 2d 45 6e 63 6f 64 69 6e 67 3a [ 6802.877291] skb linear: 00000110: 20 67 7a 69 70 0d 0a 58 2d 50 72 6f 6d 65 74 68 [ 6802.878977] skb linear: 00000120: 65 75 73 2d 53 63 72 61 70 65 2d 54 69 6d 65 6f [ 6802.880660] skb linear: 00000130: 75 74 2d 53 65 63 6f 6e 64 73 3a 20 31 30 0d 0a [ 6802.882338] skb linear: 00000140: 0d 0a [ 6802.883243] skb tailroom: 00000000: 77 c3 25 55 29 3f 50 f3 c8 99 73 dd 04 a8 41 a1 [ 6802.884917] skb tailroom: 00000010: 15 7e 77 fd 79 58 fe 58 29 20 97 35 ce 13 a1 22 [ 6802.886598] skb tailroom: 00000020: f1 c8 62 f7 3d 7b 8a b0 60 e0 f5 89 dc dc 6f 8e [ 6802.888289] skb tailroom: 00000030: cc 19 fd e3 66 1f 5f 6a fe db b8 46 1d 20 bf d3 [ 6802.889974] skb tailroom: 00000040: 5f ce c4 8c 48 b1 84 f5 38 39 8e 5e 95 61 55 f6 [ 6802.891699] skb tailroom: 00000050: f8 c5 75 a9 66 45 1e b8 d7 82 f7 f2 16 66 28 ea [ 6802.893380] skb tailroom: 00000060: 9f b8 5a dc 75 b6 27 f3 43 9c 5a 59 e3 f2 23 b7 [ 6802.895076] skb tailroom: 00000070: b2 21 cb 6e be d2 a6 8d 6d bf 9a 5e 8b e3 8d 35 [ 6802.896758] skb tailroom: 00000080: 48 36 12 72 76 84 10 e3 8e 5a a7 1c b9 53 53 7d [ 6802.898436] skb tailroom: 00000090: 81 db eb d8 8d c7 5a 94 d7 54 18 a4 2a 3c 91 ae [ 6802.900130] skb tailroom: 000000a0: 8d ac c5 5d c9 9c 97 f1 a0 2f 03 18 ac 3f dc 64 [ 6802.901816] skb tailroom: 000000b0: 01 7d 01 97 c4 3b 74 23 1b 
10 9e cc d4 16 ba f0 [ 6802.903501] skb tailroom: 000000c0: 54 99 43 77 fe e9 38 25 d8 6b b8 dc 00 02 05 9c [ 6802.905184] skb tailroom: 000000d0: e6 49 45 6e ee b4 91 ee af 35 84 60 c5 44 82 bf [ 6802.906881] skb tailroom: 000000e0: 43 8b aa 44 b8 16 52 2f 57 2d 6b de b2 7f ad 98 [ 6802.908565] skb tailroom: 000000f0: 9f b8 aa 57 66 5b ac b5 f1 e0 07 3d 62 cc a5 8a [ 6802.910256] skb tailroom: 00000100: c5 9a 00 03 e0 2a 4e c8 6a df 91 ea 71 2b bf 04 [ 6802.911950] skb tailroom: 00000110: 32 03 24 47 14 b9 d9 e4 f2 7a 25 2e e8 9a 07 cb [ 6802.913638] skb tailroom: 00000120: e6 6d 10 fc bf d3 e6 93 0a 7d c5 cb bd 68 dd a0 [ 6802.915342] skb tailroom: 00000130: 24 32 b0 26 32 51 44 b1 5a fd 26 51 d3 51 83 29 [ 6802.917019] skb tailroom: 00000140: 37 96 8c 17 62 f5 9b 5c d6 bb 20 0e c9 e5 0e 3a [ 6802.918715] skb tailroom: 00000150: 65 92 99 a8 dd 94 4e a2 9f 37 f5 09 ee b6 46 66 [ 6802.920400] skb tailroom: 00000160: 05 b4 ad 4b 5d b4 d0 e6 b0 29 e2 86 d5 be 8a 28 [ 6802.922085] skb tailroom: 00000170: aa e4 57 bb 8b ef 89 eb 80 84 8b e2 45 f9 32 59 [ 6802.923796] skb tailroom: 00000180: 0f bf 31 fb 32 1f 52 34 e5 17 c9 50 93 67 52 a4 [ 6802.925469] skb tailroom: 00000190: 9b 8a 07 df 45 46 35 9a 9f a1 43 a7 37 4c e9 f2 [ 6802.927150] skb tailroom: 000001a0: 2a cb 43 8f aa 61 fa d0 03 7e 7b 27 10 c3 d4 ef [ 6802.928681] skb tailroom: 000001b0: d7 3d 3e 58 a5 c0 85 c1 50 65 e7 74 8c 76 45 e7 [ 6802.930044] skb tailroom: 000001c0: 25 1e c4 76 a2 06 e2 b3 9f 4a 20 4f c4 f7 98 db [ 6802.931419] skb tailroom: 000001d0: 35 9c 51 bb 21 a6 06 cc a7 2f 5f 00 20 93 c0 3c [ 6802.932803] skb tailroom: 000001e0: a9 72 2f af a2 7e 87 53 97 e3 ba 24 ff b5 dd ed [ 6802.934158] skb tailroom: 000001f0: 43 dd 18 0e 89 e4 e5 24 3d 84 19 a8 67 0b bd d7 [ 6802.935540] skb tailroom: 00000200: 88 ab e1 37 67 f6 de 20 25 c4 a7 23 24 fd b3 af [ 6802.936904] skb tailroom: 00000210: 59 7c ce b2 25 47 6e 05 e2 db ff 7d 6e 27 e2 10 [ 6802.938275] skb tailroom: 00000220: 80 37 1d 95 98 22 3e 87 
ba b0 c0 aa 9a ce 4c b8 [ 6802.939640] skb tailroom: 00000230: e7 fb 55 d3 69 7b 1c a5 bc d0 c4 a8 14 ab fd cd [ 6802.940998] skb tailroom: 00000240: bc d7 9d cd a8 ee 62 0e 44 81 f1 39 c6 4c e5 34 [ 6802.942371] skb tailroom: 00000250: 81 42 a3 22 04 06 aa 06 97 64 37 78 34 bc 29 c1 [ 6802.943733] skb tailroom: 00000260: 36 2a c3 5c ea 26 8a 6f 5c ff f9 f0 f4 37 dc 9b [ 6802.945106] skb tailroom: 00000270: 54 15 08 b7 86 8d 3e dc 1d 38 73 b9 4f 16 12 52 [ 6802.946463] skb tailroom: 00000280: 40 d7 91 a8 e4 3f ab 4a 20 09 6a ff cf 54 16 6b [ 6802.947828] skb tailroom: 00000290: fe d5 8f 8e 7e 8c 13 47 09 a5 f2 5a 59 c7 3f ee [ 6802.949183] skb tailroom: 000002a0: 0d f3 69 eb 52 c3 05 e1 6b d4 20 37 27 4b 65 82 [ 6802.950544] skb tailroom: 000002b0: a9 8d bd 54 a3 08 7f 2c 39 d0 c8 58 7f 8b 52 7d [ 6802.951899] skb tailroom: 000002c0: d6 8e ef ec 4f 98 2a c9 40 61 e5 ce 6a b8 80 d4 [ 6802.953277] skb tailroom: 000002d0: a5 71 bb 6d 44 b1 09 a8 1d c5 83 01 92 43 9a fe [ 6802.954631] skb tailroom: 000002e0: 79 e2 23 b8 02 ae ff 6d 57 04 7e 72 b9 7b 40 93 [ 6802.956005] skb tailroom: 000002f0: 27 52 db 6f fe 74 5d 92 53 bb f2 31 2f 4c 44 e9 [ 6802.957431] skb tailroom: 00000300: 69 46 e6 a7 a1 c1 af 47 77 9f 3a 19 0e f5 03 82 [ 6802.959112] skb tailroom: 00000310: b0 28 85 88 f6 56 fa 36 59 88 3d 66 89 d6 cc b8 [ 6802.960796] skb tailroom: 00000320: 4b de d6 12 66 2d 7b 4c f8 f4 7b 29 ba ef 91 db [ 6802.962482] skb tailroom: 00000330: 48 52 e3 99 fe 70 c9 24 d2 75 dc 2b 2a 40 d4 96 [ 6802.964161] skb tailroom: 00000340: cd e8 ce 62 8d 38 04 1f ce b9 4e b1 bc 85 82 57 [ 6802.965838] skb tailroom: 00000350: 8b aa e1 72 ed a8 cc 10 ce 24 df 15 21 36 73 57 [ 6802.967522] skb tailroom: 00000360: 98 ec 22 49 f2 e9 02 4b 0a e4 b2 bc a8 bf 9c 63 [ 6802.969216] skb tailroom: 00000370: b9 f6 81 b8 18 c8 8a 6d 02 b0 14 ef d2 28 c6 0a [ 6802.970892] skb tailroom: 00000380: 8a 14 69 54 24 e7 32 f8 78 4c 8e 44 e8 21 3a 78 [ 6802.972577] skb tailroom: 00000390: 85 97 d2 cd ae 65 19 
d0 80 24 35 c8 e4 58 f7 83 [ 6802.974277] skb tailroom: 000003a0: e9 25 a1 1b c7 8d dd 36 f8 6f 7c 87 6a 8d 4a d8 [ 6802.975977] skb tailroom: 000003b0: 44 91 ca 5f ef 99 97 bf ee 56 7d 23 22 9a 9d 39 [ 6802.977656] skb tailroom: 000003c0: 62 ed cf 03 06 45 e2 de 42 59 1f d5 4e bc 46 3c [ 6802.979343] skb tailroom: 000003d0: c3 19 e2 1c fd 8f f6 a0 9f b2 b3 e2 fc 95 93 e7 [ 6802.981033] skb tailroom: 000003e0: f7 6f c3 9b ff 6b d6 93 74 0c 3c f7 64 5b 70 1b [ 6802.982727] skb tailroom: 000003f0: 86 43 46 72 5c 2f 19 69 39 0e d6 9c ba f2 21 c4 [ 6802.984427] skb tailroom: 00000400: 34 a5 8e 5c cf 82 03 5b 52 dc 50 9c b0 86 b2 92 [ 6802.986112] skb tailroom: 00000410: d7 fd f1 27 85 a6 e8 b9 26 70 e2 e5 19 8f b6 5f [ 6802.987820] skb tailroom: 00000420: 20 1f bb 26 0f c9 a6 15 46 06 3d 26 a9 60 6f b6 [ 6802.989524] skb tailroom: 00000430: 64 ff 25 34 22 fc a5 66 84 f6 6d 03 c6 8a 92 10 [ 6802.991230] skb tailroom: 00000440: a2 2e 2d a6 62 6e 19 57 35 f7 25 3b 0e 85 5d e0 [ 6802.992931] skb tailroom: 00000450: f8 77 04 32 84 eb 42 da 6c d4 bb 3b 89 65 74 2a [ 6802.994617] skb tailroom: 00000460: 5d 6d 49 f5 64 7a 29 fd 30 16 a3 ed 94 1e 4a f6 [ 6802.996313] skb tailroom: 00000470: fe ce 21 6c 5f 1b f7 58 5a bf 11 2f 56 85 f3 db [ 6802.998002] skb tailroom: 00000480: 82 77 c5 19 e2 8e 28 09 aa c7 8b 8f fb 92 aa 28 [ 6802.999708] skb tailroom: 00000490: 05 1f 2e b5 eb 42 e6 1e 4c 56 ca 4f 42 32 36 2c [ 6803.001436] skb tailroom: 000004a0: 2b b3 1b d6 df 0f a8 cc 55 29 29 ae d4 b3 1b 62 [ 6803.003127] skb tailroom: 000004b0: a6 aa b6 ff 77 f4 4b 6b cd c1 3a 88 49 0e fd 39 [ 6803.004817] skb tailroom: 000004c0: 4a d1 30 ab 22 be 7a 65 4c f1 b7 bc 49 86 ed d9 [ 6803.006511] skb tailroom: 000004d0: 52 ed a5 51 7f d0 00 51 78 e9 4a 1f a3 c1 4e 5c [ 6803.008214] skb tailroom: 000004e0: ff 6d 25 cf d0 15 44 c2 f7 4b bb 4f c8 d3 fd 89 [ 6803.009896] skb tailroom: 000004f0: c7 1d 76 c5 6f dc 6a 40 d1 a5 ad d7 f2 95 5c 1b [ 6803.011594] skb tailroom: 00000500: 0f cf 42 21 38 9d 
a3 e2 26 a2 54 d1 12 f8 92 1f [ 6803.013288] skb tailroom: 00000510: b7 04 15 26 a9 ec 7d d4 65 72 f6 18 63 7b 4d a6 [ 6803.014981] skb tailroom: 00000520: 06 7b 7f 2e 43 b2 da 4c 55 55 3f c6 c7 b4 37 6b [ 6803.016668] skb tailroom: 00000530: d5 09 ff b7 bc 7f a2 9e 1a 49 35 98 cb 19 41 e8 [ 6803.018364] skb tailroom: 00000540: 74 21 2d 94 38 fc 3b 78 15 ec 05 91 c5 aa 8f e0 [ 6803.020056] skb tailroom: 00000550: 0e 41 26 1d bb 6c 59 88 9d 15 30 78 17 32 73 1c [ 6803.021745] skb tailroom: 00000560: e3 be d2 3f a7 a4 de 06 2f 88 86 0e 70 f1 ae 67 [ 6803.023449] skb tailroom: 00000570: a0 d0 cd b6 be be af 2a a7 df db 8e ae 71 b1 fb [ 6803.025139] skb tailroom: 00000580: 5c 87 aa cd d8 8f 26 f2 63 52 ff 2b 48 6f 70 bd [ 6803.026831] skb tailroom: 00000590: ec 99 f8 49 45 2f 94 1f 68 54 08 3d f7 c4 e2 0f [ 6803.028522] skb tailroom: 000005a0: 1e 02 2c 21 3b d0 93 f1 3a e1 5d df df ef 84 86 [ 6803.030205] skb tailroom: 000005b0: 00 fc a9 72 4f 56 f9 f8 bf 5e 14 d8 8c 1a af 3d [ 6803.031894] skb tailroom: 000005c0: 1b 1c 4a 4d dc 48 ef 65 2e 74 c0 63 35 25 61 87 [ 6803.033573] skb tailroom: 000005d0: 48 00 78 bc 31 b4 fa dc e6 c8 c1 aa 37 e2 8a 38 [ 6803.035215] skb tailroom: 000005e0: 19 7a 42 23 24 f1 d7 f5 72 1d 6f 08 d8 28 7c 43 [ 6803.036593] skb tailroom: 000005f0: b3 e6 aa 4b d2 af 66 8f 45 46 cd a2 fa 20 4b 08 [ 6803.037971] skb tailroom: 00000600: 7a dd 90 7f 94 11 b6 b9 60 96 58 4d bf 17 05 31 [ 6803.039347] skb tailroom: 00000610: be f2 57 1d da f0 21 a9 27 70 88 50 cc 2e cd 64 [ 6803.040713] skb tailroom: 00000620: ed a5 75 40 21 80 f2 64 e5 d4 ae ed 90 e7 1e bb [ 6803.042070] skb tailroom: 00000630: ca 8b 8c 37 32 07 5b 2e b9 eb 79 23 aa a8 eb 2c [ 6803.043441] skb tailroom: 00000640: f6 ef e0 8d e2 dd 6b 13 af f6 51 69 f2 fe 92 19 [ 6803.044806] skb tailroom: 00000650: 42 81 8c 21 08 75 e0 de f5 93 9a 74 30 68 b4 86 [ 6803.046167] skb tailroom: 00000660: db df b5 3d ce f0 6a 67 52 a3 34 6f e9 b4 bc cd [ 6803.047546] skb tailroom: 00000670: 81 8c ac a5 f5 
9b 79 44 b5 3e 7f 3e 86 47 94 17 [ 6803.048897] skb tailroom: 00000680: ca 18 de 36 52 49 1e d2 78 a1 4d 86 e4 bf 5b 1f [ 6803.050256] skb tailroom: 00000690: 5b 91 e6 2b d9 fd 4a 53 55 7a 51 d6 1b 54 d2 ea [ 6803.051640] skb tailroom: 000006a0: a5 17 e4 63 5d 54 03 d0 be 1c fb 2e 9e c8 6c 91 [ 6803.052974] skb tailroom: 000006b0: d0 81 97 b4 85 9c 0d 54 88 e6 50 8d 33 4d e1 23 [ 6803.054353] skb tailroom: 000006c0: d6 7d 24 57 20 07 5b be bc 7e f1 20 0c ce b5 7f [ 6803.055737] skb tailroom: 000006d0: 4c 02 ce 4d 1c 74 72 dd ca bc 6b 79 03 d4 56 31 [ 6803.057122] skb tailroom: 000006e0: 1b 1f 8f 07 50 e1 d1 e3 e5 8b 3d a7 39 ad 58 c5 [ 6803.058785] skb tailroom: 000006f0: 6f 5d 98 0c ba 7d d3 ee 88 ca 06 50 ea 42 4f ef [ 6803.060506] skb tailroom: 00000700: a7 66 4f 58 9f 13 43 a8 64 55 bb be 49 ac 0b b3 [ 6803.062188] skb tailroom: 00000710: ab d8 13 ef 09 b3 5e e8 58 ed 63 85 cc a9 a8 e3 [ 6803.063887] skb tailroom: 00000720: d6 af bd 1d 21 20 41 75 63 e4 e2 58 43 a8 a6 23 [ 6803.065594] skb tailroom: 00000730: b0 10 6b fb c5 61 de 19 91 98 c8 3c 5e 4a 9b eb [ 6803.067286] skb tailroom: 00000740: 07 a4 3d 35 45 b1 b6 d2 93 b0 f7 ab 61 fe f1 34 [ 6803.068996] skb tailroom: 00000750: ec b6 f4 72 38 eb 4a 98 8a 8e 8f 17 b8 ca 03 b9 [ 6803.070698] skb tailroom: 00000760: e3 a4 4e 19 93 e8 35 16 88 c6 69 c3 9a 32 ff ce [ 6803.072415] skb tailroom: 00000770: 60 41 14 d4 86 94 28 ce 7a a5 51 c1 6b 8a a6 b0 [ 6803.074125] skb tailroom: 00000780: 41 ea d2 32 ce cc 06 14 12 31 7d 5c 44 94 7e 9a [ 6803.075821] skb tailroom: 00000790: 6d ce 50 25 51 c8 77 f3 9d a4 79 8c 9d 28 db 4c [ 6803.077506] skb tailroom: 000007a0: 76 9a d5 f4 e5 27 c1 b5 3c 9a df 60 44 93 22 3e [ 6803.079187] skb tailroom: 000007b0: 67 3a bd f1 91 bb 19 55 28 09 25 ab f9 26 43 62 [ 6803.080885] skb tailroom: 000007c0: ef 3c 09 6b 6c 89 25 14 b3 ac c8 af 72 11 96 6c [ 6803.082568] skb tailroom: 000007d0: 38 45 69 ac fe 57 63 8d c9 ee ad dd cb 7c be 88 [ 6803.084258] skb tailroom: 000007e0: 68 8f c3 23 
af 81 08 b7 16 0a 16 a7 40 65 d1 86 [ 6803.085939] skb tailroom: 000007f0: 9b ca c3 44 59 a7 76 90 62 f6 3b 51 a7 54 f8 0f [ 6803.087631] skb tailroom: 00000800: 74 71 30 d6 95 b6 b2 be fe dd 0a 4a 23 07 b8 e2 [ 6803.089314] skb tailroom: 00000810: 46 ef 1a 65 60 14 6d 59 d0 61 52 6a b1 d2 7c 40 [ 6803.091004] skb tailroom: 00000820: b3 89 7d ec 9b 20 83 5c ba f1 d7 96 91 4f 90 67 [ 6803.092689] skb tailroom: 00000830: 7c a3 7e a0 2e c7 a1 a3 97 50 57 76 21 07 94 e2 [ 6803.094371] skb tailroom: 00000840: 20 b2 bf 59 14 7d f5 18 8b 80 be 24 74 6e 47 81 [ 6803.096060] skb tailroom: 00000850: 6f 6a 15 78 dc f9 bb f9 3a d0 a3 6c d1 be 25 b3 [ 6803.097739] skb tailroom: 00000860: ee 7f 98 67 8e e6 43 da 50 83 bd 4a c7 f5 42 c9 [ 6803.099428] skb tailroom: 00000870: 8f d0 30 be 8b f4 c9 89 67 46 de 1d 5b 90 dc 84 [ 6803.101118] skb tailroom: 00000880: c3 33 86 21 b7 06 c6 0d 87 b9 9a b6 d0 bb 92 93 [ 6803.102804] skb tailroom: 00000890: 69 92 b0 b6 de 9c e2 e0 f1 09 27 f0 f3 5d d5 6a [ 6803.104506] skb tailroom: 000008a0: bb 8b 45 57 9e 1d 0e e6 75 00 11 31 c2 35 6c d9 [ 6803.106195] skb tailroom: 000008b0: e8 96 66 2d d6 1b 5a 07 f5 d7 c0 6d fd 97 76 31 [ 6803.107885] skb tailroom: 000008c0: 63 e5 f9 a3 f5 89 86 bc 40 4b a5 da 8f 4a 50 a1 [ 6803.109563] skb tailroom: 000008d0: 2b 76 e3 78 08 90 73 58 45 fa 5e a2 c2 a2 4b fb [ 6803.111282] skb tailroom: 000008e0: f2 cb 6e d3 76 ea b8 c8 6c 36 5d 7d c6 c5 f2 7f [ 6803.112976] skb tailroom: 000008f0: ab 58 4f 2a 8d f7 43 16 94 15 46 bf dc 8f 23 1f [ 6803.114662] skb tailroom: 00000900: 2a 58 83 7a ba e4 25 5f 4d dd 88 4f b6 b5 88 f3 [ 6803.116344] skb tailroom: 00000910: 8d 51 2c e5 61 d8 aa e8 a9 22 32 95 68 dc 17 fb [ 6803.117990] skb tailroom: 00000920: bf 24 04 c4 63 3b 30 1d cf c6 6e b6 05 a8 36 e1 [ 6803.119664] skb tailroom: 00000930: 23 e1 56 2c 55 72 cc 0c ac 46 4e 9d 67 18 9c 9d [ 6803.121349] skb tailroom: 00000940: 93 e0 2a fc 17 fd 0d 79 63 fa 3f e6 ee 27 d4 4c [ 6803.123045] skb tailroom: 00000950: 44 31 63 
e3 92 f2 a3 52 43 a6 a9 10 c0 cb d7 40 [ 6803.124730] skb tailroom: 00000960: e8 34 64 2d ff a9 f8 02 b3 9a fd 73 32 a5 d0 9c [ 6803.126420] skb tailroom: 00000970: e0 da 58 7b d6 9c 1e f7 95 d9 ba 2d 52 60 40 f9 [ 6803.128109] skb tailroom: 00000980: 1d 33 f4 e2 0d 88 17 0f 2f c3 d7 a3 31 33 5e e3 [ 6803.129796] skb tailroom: 00000990: 3f 33 15 e0 06 22 45 7c 4c 4e 2f 0e 49 e1 a4 79 [ 6803.131480] skb tailroom: 000009a0: b2 a2 13 a8 41 e5 e8 a1 f3 bc 76 b3 f7 15 6a f0 [ 6803.133167] skb tailroom: 000009b0: 46 9b b8 b8 cc 01 22 40 c5 ea 86 2a 33 bd 09 51 [ 6803.134847] skb tailroom: 000009c0: c4 2a a1 8f 90 51 86 01 a1 f0 af 2a 47 22 59 59 [ 6803.136519] skb tailroom: 000009d0: 72 d1 cb 85 9f a5 1b 37 15 de 7b f2 90 67 f8 ce [ 6803.138199] skb tailroom: 000009e0: dd d2 0a 98 11 18 4c 16 53 70 36 11 c2 f4 42 e7 [ 6803.139875] skb tailroom: 000009f0: d7 ef bd e4 02 82 66 0d 09 d4 4c 0c 56 2e af 82 [ 6803.141550] skb tailroom: 00000a00: 47 39 ac 8f 99 9c 93 b5 a0 1b e7 d5 8a 66 b1 15 [ 6803.143231] skb tailroom: 00000a10: 6a f0 46 ed fa f8 e0 01 22 40 d8 2e 2a d5 c8 cd [ 6803.144908] skb tailroom: 00000a20: a6 98 40 19 bb 38 fb c8 ec a6 e1 7f 24 24 b0 f3 [ 6803.146596] skb tailroom: 00000a30: fd 17 53 2b 20 52 2f aa e7 88 e1 96 7c 64 ea 6e [ 6803.148274] skb tailroom: 00000a40: 3c 67 96 b2 0b 64 77 2f c2 14 aa ef 0b 77 8a 7c [ 6803.149952] skb tailroom: 00000a50: ec 4d a5 c2 86 fd 06 2e 33 00 09 6a 4c 15 fe c1 [ 6803.151634] skb tailroom: 00000a60: 04 16 e3 59 cc 1d db 42 4c 69 16 6b 26 71 f2 51 [ 6803.153309] skb tailroom: 00000a70: 48 15 6a 08 a0 c8 a9 a1 61 f0 3e 3a d7 0b 21 29 [ 6803.154992] skb tailroom: 00000a80: db ec 7b 0b 26 a5 4b fc de f9 a3 97 92 ef 2c be [ 6803.156667] skb tailroom: 00000a90: 2e 57 17 4b b4 5a 26 5d bd a8 b0 09 5f f3 ba 85 [ 6803.158350] skb tailroom: 00000aa0: bf 66 38 b5 5b 18 a3 59 6b 78 88 de 45 46 36 75 [ 6803.160029] skb tailroom: 00000ab0: 68 b6 cf 95 b5 1d 3d c5 87 42 8d 24 4c 80 0f f4 [ 6803.161703] skb tailroom: 00000ac0: 78 97 
d1 9d cd f7 e2 0c f3 08 c9 93 e3 04 e6 ea [ 6803.163385] skb tailroom: 00000ad0: 93 15 6a f0 46 a7 ba f1 be 01 22 40 53 42 80 38 [ 6803.165085] skb tailroom: 00000ae0: 72 ba bf 90 9f 3b 79 6c b1 7c a9 72 ff ba 36 b6 [ 6803.166772] skb tailroom: 00000af0: 3d 09 a4 74 02 23 a1 ff 2f b0 86 01 b6 3a b0 78 [ 6803.168450] skb tailroom: 00000b00: 27 2f b9 6f 94 2c fc 4f 53 d8 5e a7 f7 49 32 25 [ 6803.170128] skb tailroom: 00000b10: f2 26 8a 0a ab 81 14 72 fb c1 3d 02 09 d4 4c 0c [ 6803.171805] skb tailroom: 00000b20: 55 c1 8d 9c 66 89 b8 cd 6f 77 5d d7 ec 46 33 1f [ 6803.173480] skb tailroom: 00000b30: f6 62 e2 15 6a f0 46 84 ef c8 eb 01 22 40 59 8a [ 6803.175159] skb tailroom: 00000b40: 55 e6 2b c5 dc ce 51 21 62 bc 7b 7f 17 20 89 b6 [ 6803.176830] skb tailroom: 00000b50: fd 28 4f 3f 36 b9 eb 17 ce 3b 8d 75 05 bc 62 40 [ 6803.178510] skb tailroom: 00000b60: 93 15 ac 4e ec 53 d2 13 8f 19 81 72 e0 24 4f 51 [ 6803.180185] skb tailroom: 00000b70: e0 3f b9 a5 2f 8c c1 9b dc 0d 94 dc 7e 09 09 6a [ 6803.181861] skb tailroom: 00000b80: 4c f6 78 3d 8f b3 0e 28 30 81 c1 63 98 29 3f 48 [ 6803.183543] skb tailroom: 00000b90: 2d ca 0e 91 2d 15 6a 08 8f de f0 41 e6 f0 3e 7c [ 6803.185217] skb tailroom: 00000ba0: 7f 42 3d b2 ff fa b2 99 0f 41 38 5e bd 7f 78 5f [ 6803.186897] skb tailroom: 00000bb0: fe bc 4c ac 04 56 5e 62 8a 83 a8 a0 ff 2a 29 49 [ 6803.188579] skb tailroom: 00000bc0: dc e6 61 8c 80 a7 63 de ea aa 77 95 88 17 5f 5a [ 6803.190255] skb tailroom: 00000bd0: 45 c2 3a c3 66 e9 b2 59 a3 3c ba d1 4f 7d ad cc [ 6803.191934] skb tailroom: 00000be0: 48 7c 5a a8 7e 52 03 c6 6e a3 5c 64 26 2f 57 6e [ 6803.193620] skb tailroom: 00000bf0: dd 29 ba d9 26 02 18 f0 46 e2 e1 ef bc 01 22 40 [ 6803.195307] skb tailroom: 00000c00: 22 c0 fb ba 61 e5 7d 52 e4 1a ee 05 47 c0 de 56 [ 6803.196980] skb tailroom: 00000c10: 02 f6 4c f3 c1 d2 50 6c 94 64 f5 73 64 ed 44 b8 [ 6803.198658] skb tailroom: 00000c20: ae ed 48 e2 50 ee 5b d9 51 00 1e 5a 09 17 58 86 [ 6803.200331] skb tailroom: 00000c30: 9f 
03 24 63 73 6f ef e9 fa e9 13 29 29 b6 f3 01 [ 6803.202009] skb tailroom: 00000c40: 09 d4 4c ce fe 7d 65 4b 52 3d ea 2a 9e d7 18 a5 [ 6803.203688] skb tailroom: 00000c50: 91 12 6c 74 17 16 89 15 d4 f0 46 ff de f9 9f 01 [ 6803.205361] skb tailroom: 00000c60: 22 40 8a 0f b9 68 83 2d c4 7c 9a 23 59 b2 d6 ed [ 6803.207039] skb tailroom: 00000c70: 5b 8f 38 ee e0 fb 66 98 86 b4 5a 2d a3 05 8c 37 [ 6803.208712] skb tailroom: 00000c80: c5 57 80 d6 d4 87 17 61 19 1d ed 7f 44 d1 4b 77 [ 6803.210387] skb tailroom: 00000c90: af 42 07 dd b3 ce 55 b2 f7 a0 90 82 49 ba 05 95 [ 6803.212094] skb tailroom: 00000ca0: ac 04 09 6a 48 9f 8e c2 ef 58 1c e2 56 30 c8 19 [ 6803.213792] skb tailroom: 00000cb0: f1 9b 54 84 03 9e 74 8d f9 0a f0 46 bc c3 c6 dc [ 6803.215498] skb tailroom: 00000cc0: 02 22 40 e7 de d1 f2 73 af 60 5d d7 c5 52 28 94 [ 6803.217204] skb tailroom: 00000cd0: be 47 a1 36 9f a2 94 73 dc 68 c2 90 e5 21 4c ef [ 6803.218909] skb tailroom: 00000ce0: 9e a1 0b 66 a4 6d 62 97 95 2c [ 6803.220290] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.63-flatcar #1 [ 6803.221783] Hardware name: Amazon EC2 m5.4xlarge/, BIOS 1.0 10/16/2017 [ 6803.223242] Call Trace: [ 6803.223810] [ 6803.224286] dump_stack_lvl+0x46/0x5e [ 6803.225128] __skb_checksum_complete+0xdd/0xf0 [ 6803.226135] ? csum_block_add_ext+0x20/0x20 [ 6803.227085] ? reqsk_fastopen_remove+0x190/0x190 [ 6803.228121] tcp_rcv_established+0x496/0x6c0 [ 6803.229079] tcp_v4_do_rcv+0x148/0x240 [ 6803.229919] tcp_v4_rcv+0xdb8/0xf00 [ 6803.230707] ? 
ip_rcv_finish_core.constprop.0+0x141/0x420 [ 6803.231899] ip_protocol_deliver_rcu+0x33/0x200 [ 6803.232906] ip_local_deliver_finish+0x44/0x60 [ 6803.233897] __netif_receive_skb_one_core+0x8b/0xa0 [ 6803.234996] process_backlog+0x96/0x160 [ 6803.235865] __napi_poll+0x2a/0x150 [ 6803.236653] net_rx_action+0x250/0x2a0 [ 6803.237501] __do_softirq+0xcf/0x286 [ 6803.238310] irq_exit_rcu+0x99/0xc0 [ 6803.239107] common_interrupt+0x80/0xa0 [ 6803.239970] [ 6803.240457] [ 6803.240944] asm_common_interrupt+0x21/0x40 [ 6803.241885] RIP: 0010:native_safe_halt+0xb/0x10 [ 6803.242918] Code: 00 f0 80 48 02 20 48 8b 00 a8 08 75 c0 e9 7a ff ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc 66 90 0f 00 2d 89 24 58 00 fb f4 cc cc cc cc 66 90 0f 00 2d 79 24 58 00 f4 c3 cc cc cc cc cc 0f [ 6803.247014] RSP: 0018:ffffffff87c03e38 EFLAGS: 00000246 [ 6803.248177] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 00000000ffffffff [ 6803.249749] RDX: ffff8b5d8da00000 RSI: ffff8b4ec1e9d000 RDI: ffff8b4ec1362000 [ 6803.251312] RBP: ffff8b4ec1e9d064 R08: ffffffff87dc0840 R09: 0000062fe041fc98 [ 6803.252878] R10: 00000000000000d6 R11: 0000000000000ed4 R12: 0000000000000001 [ 6803.254436] R13: ffffffff87dc08c0 R14: 0000000000000001 R15: 0000000000000000 [ 6803.256003] acpi_safe_halt+0x1f/0x30 [ 6803.256825] acpi_idle_enter+0xde/0x120 [ 6803.257682] cpuidle_enter_state+0x89/0x350 [ 6803.258621] cpuidle_enter+0x29/0x40 [ 6803.259418] do_idle+0x1e9/0x280 [ 6803.260144] cpu_startup_entry+0x19/0x20 [ 6803.261019] start_kernel+0x691/0x6ba [ 6803.261839] secondary_startup_64_no_verify+0xc2/0xcb [ 6803.262970] [ 7074.485025] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 7074.486521] IPv6: ADDRCONF(NETDEV_CHANGE): cali320fca7df79: link becomes ready [ 7074.604752] cali320fca7df79: Caught tx_queue_len zero misconfig [ 7134.429044] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 7134.430313] IPv6: ADDRCONF(NETDEV_CHANGE): calib15a0690693: link becomes ready [ 7134.546039] calib15a0690693: 
Caught tx_queue_len zero misconfig [ 7194.464536] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 7194.465876] IPv6: ADDRCONF(NETDEV_CHANGE): cali9a841390603: link becomes ready [ 7194.581372] cali9a841390603: Caught tx_queue_len zero misconfig [ 7254.443045] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 7254.444238] IPv6: ADDRCONF(NETDEV_CHANGE): califfbbb29779b: link becomes ready [ 7254.571444] califfbbb29779b: Caught tx_queue_len zero misconfig [ 7264.736871] pci 0000:00:1e.0: [1d0f:8061] type 00 class 0x010802 [ 7264.738095] pci 0000:00:1e.0: reg 0x10: [mem 0x00000000-0x00003fff] [ 7264.740240] pci 0000:00:1e.0: BAR 0: assigned [mem 0xc0004000-0xc0007fff] [ 7264.741625] nvme nvme2: pci function 0000:00:1e.0 [ 7264.742590] nvme 0000:00:1e.0: enabling device (0000 -> 0002) [ 7264.749773] nvme nvme2: 2/0/0 default/read/poll queues [ 7266.492232] EXT4-fs (nvme2n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. [ 7266.750104] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 7266.751300] IPv6: ADDRCONF(NETDEV_CHANGE): cali463da3abb6d: link becomes ready [ 7266.868805] cali463da3abb6d: Caught tx_queue_len zero misconfig [ 7744.088838] pci 0000:00:1e.0: [1d0f:8061] type 00 class 0x010802 [ 7744.090192] pci 0000:00:1e.0: reg 0x10: [mem 0x00000000-0x00003fff] [ 7744.092261] pci 0000:00:1e.0: BAR 0: assigned [mem 0xc0004000-0xc0007fff] [ 7744.093560] nvme nvme2: pci function 0000:00:1e.0 [ 7744.094472] nvme 0000:00:1e.0: enabling device (0000 -> 0002) [ 7744.101861] nvme nvme2: 2/0/0 default/read/poll queues [ 7744.303017] pci 0000:00:1d.0: [1d0f:8061] type 00 class 0x010802 [ 7744.304229] pci 0000:00:1d.0: reg 0x10: [mem 0x00000000-0x00003fff] [ 7744.306344] pci 0000:00:1d.0: BAR 0: assigned [mem 0xc0008000-0xc000bfff] [ 7744.307817] nvme nvme3: pci function 0000:00:1d.0 [ 7744.308713] nvme 0000:00:1d.0: enabling device (0000 -> 0002) [ 7744.315784] nvme nvme3: 2/0/0 default/read/poll queues [ 7752.950471] 
EXT4-fs (nvme3n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. [ 7752.975055] EXT4-fs (nvme2n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. [ 7753.408247] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 7753.409443] IPv6: ADDRCONF(NETDEV_CHANGE): cali567ced57124: link becomes ready [ 7753.529160] cali567ced57124: Caught tx_queue_len zero misconfig [ 8073.447239] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 8073.448418] IPv6: ADDRCONF(NETDEV_CHANGE): cali9ca0ec69921: link becomes ready [ 8073.574736] cali9ca0ec69921: Caught tx_queue_len zero misconfig [10173.593686] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [10173.594940] IPv6: ADDRCONF(NETDEV_CHANGE): cali9632dfdcf6d: link becomes ready [10173.721447] cali9632dfdcf6d: Caught tx_queue_len zero misconfig [15161.728619] pci 0000:00:1c.0: [1d0f:8061] type 00 class 0x010802 [15161.729819] pci 0000:00:1c.0: reg 0x10: [mem 0x00000000-0x00003fff] [15161.731911] pci 0000:00:1c.0: BAR 0: assigned [mem 0xc000c000-0xc000ffff] [15161.733260] nvme nvme4: pci function 0000:00:1c.0 [15161.734198] nvme 0000:00:1c.0: enabling device (0000 -> 0002) [15161.740658] nvme nvme4: 2/0/0 default/read/poll queues [15163.264333] EXT4-fs (nvme4n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. 
[15163.485730] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [15163.486986] IPv6: ADDRCONF(NETDEV_CHANGE): cali463da3abb6d: link becomes ready [15163.602942] cali463da3abb6d: Caught tx_queue_len zero misconfig [15361.757816] pci 0000:00:1b.0: [1d0f:8061] type 00 class 0x010802 [15361.759343] pci 0000:00:1b.0: reg 0x10: [mem 0x00000000-0x00003fff] [15361.761998] pci 0000:00:1b.0: BAR 0: assigned [mem 0xc0010000-0xc0013fff] [15361.763661] nvme nvme5: pci function 0000:00:1b.0 [15361.764815] nvme 0000:00:1b.0: enabling device (0000 -> 0002) [15361.772616] nvme nvme5: 2/0/0 default/read/poll queues [15363.388972] EXT4-fs (nvme5n1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none. ```

Here are CPU details via lscpu:

"lscpu" output ``` Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 46 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 16 On-line CPU(s) list: 0-15 Vendor ID: GenuineIntel Model name: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz CPU family: 6 Model: 85 Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 1 Stepping: 7 BogoMIPS: 4999.99 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc c puid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch in vpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves id a arat pku ospke Virtualization features: Hypervisor vendor: KVM Virtualization type: full Caches (sum of all): L1d: 256 KiB (8 instances) L1i: 256 KiB (8 instances) L2: 8 MiB (8 instances) L3: 35.8 MiB (1 instance) NUMA: NUMA node(s): 1 NUMA node0 CPU(s): 0-15 Vulnerabilities: Itlb multihit: KVM: Mitigation: VMX unsupported L1tf: Mitigation; PTE Inversion Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown Meltdown: Mitigation; PTI Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown Retbleed: Vulnerable Spec store bypass: Vulnerable Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected Srbds: Not affected Tsx async abort: Not affected ```

On this machine, the pstore facility remains unavailable to us.

seh commented 2 years ago

With GRO and GSO enabled and the MTU for the eth0 interface back up at 9,000, this time running Flatcar Container Linux version 3227.2.2 with kernel version 5.15.63 on the "z1d.12xlarge" EC2 instance type, the kernel bug still arises, but now we're getting more diagnostic output, per the following log fragment.

dmesg output ``` [Mon Oct 10 17:05:08 2022] calib2e12cb0dcc: Caught tx_queue_len zero misconfig [Mon Oct 10 18:22:24 2022] ------------[ cut here ]------------ [Mon Oct 10 18:22:24 2022] kernel BUG at net/core/skbuff.c:4219! [Mon Oct 10 18:22:24 2022] invalid opcode: 0000 [#1] SMP PTI [Mon Oct 10 18:22:24 2022] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.15.63-flatcar #1 [Mon Oct 10 18:22:24 2022] Hardware name: Amazon EC2 z1d.12xlarge/, BIOS 1.0 10/16/2017 [Mon Oct 10 18:22:24 2022] RIP: 0010:skb_segment+0xc70/0xe80 [Mon Oct 10 18:22:24 2022] Code: 44 24 50 48 89 44 24 30 48 8b 44 24 10 48 89 44 24 50 e9 16 f7 ff ff 0f 0b 89 44 24 2c c7 44 24 4c 00 00 00 00 e9 44 fe ff ff <0f> 0b 0f 0b 0f 0b 41 8b 7d 74 85 ff 0f 85 91 01 00 00 49 8b 95 c0 [Mon Oct 10 18:22:24 2022] RSP: 0018:ffffa2d38c780838 EFLAGS: 00010246 [Mon Oct 10 18:22:24 2022] RAX: ffff8954dd8312c0 RBX: ffff89293fbde300 RCX: ffff8957bd3d2fa0 [Mon Oct 10 18:22:24 2022] RDX: 0000000000000000 RSI: ffff89293fbde2c0 RDI: ffffffffffffffff [Mon Oct 10 18:22:24 2022] RBP: ffffa2d38c780908 R08: 0000000000009db6 R09: 0000000000000000 [Mon Oct 10 18:22:24 2022] R10: 000000000000a356 R11: 000000000000a31a R12: 000000000000000b [Mon Oct 10 18:22:24 2022] R13: ffff892940566100 R14: 000000000000a31a R15: ffff891ad0e5c600 [Mon Oct 10 18:22:24 2022] FS: 0000000000000000(0000) GS:ffff8948b9b80000(0000) knlGS:0000000000000000 [Mon Oct 10 18:22:24 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Mon Oct 10 18:22:24 2022] CR2: 000000c011faf000 CR3: 0000000d66a0a001 CR4: 00000000007706e0 [Mon Oct 10 18:22:24 2022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [Mon Oct 10 18:22:24 2022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [Mon Oct 10 18:22:24 2022] PKRU: 55555554 [Mon Oct 10 18:22:24 2022] Call Trace: [Mon Oct 10 18:22:24 2022] [Mon Oct 10 18:22:24 2022] ? csum_block_add_ext+0x20/0x20 [Mon Oct 10 18:22:24 2022] ? 
reqsk_fastopen_remove+0x190/0x190 [Mon Oct 10 18:22:24 2022] tcp_gso_segment+0xec/0x4e0 [Mon Oct 10 18:22:24 2022] inet_gso_segment+0x15e/0x3e0 [Mon Oct 10 18:22:24 2022] skb_mac_gso_segment+0x9c/0x110 [Mon Oct 10 18:22:24 2022] __skb_gso_segment+0xb2/0x160 [Mon Oct 10 18:22:24 2022] ? netif_skb_features+0x9c/0x2d0 [Mon Oct 10 18:22:24 2022] validate_xmit_skb.constprop.0+0x139/0x2b0 [Mon Oct 10 18:22:24 2022] validate_xmit_skb_list+0x41/0x70 [Mon Oct 10 18:22:24 2022] sch_direct_xmit+0x11c/0x250 [Mon Oct 10 18:22:24 2022] __dev_queue_xmit+0x8bd/0xb10 [Mon Oct 10 18:22:24 2022] ip_finish_output2+0x277/0x550 [Mon Oct 10 18:22:24 2022] ? ip_route_input_rcu+0x164/0x2d0 [Mon Oct 10 18:22:24 2022] ? skb_gso_validate_network_len+0x11/0x80 [Mon Oct 10 18:22:24 2022] ? __ip_finish_output+0xe9/0x1a0 [Mon Oct 10 18:22:24 2022] ip_sublist_rcv_finish+0x6b/0x70 [Mon Oct 10 18:22:24 2022] ip_sublist_rcv+0x16e/0x1f0 [Mon Oct 10 18:22:24 2022] ? ip_sublist_rcv+0x1f0/0x1f0 [Mon Oct 10 18:22:24 2022] ip_list_rcv+0xf8/0x120 [Mon Oct 10 18:22:24 2022] __netif_receive_skb_list_core+0x24a/0x270 [Mon Oct 10 18:22:24 2022] netif_receive_skb_list_internal+0x19f/0x2c0 [Mon Oct 10 18:22:24 2022] ? inet_gro_complete+0xaf/0x100 [Mon Oct 10 18:22:24 2022] napi_gro_complete.constprop.0.isra.0+0x112/0x170 [Mon Oct 10 18:22:24 2022] dev_gro_receive+0x2d5/0x6a0 [Mon Oct 10 18:22:24 2022] napi_gro_receive+0x62/0x1d0 [Mon Oct 10 18:22:24 2022] 0xffffffffc069d699 [Mon Oct 10 18:22:24 2022] ? 
scheduler_tick+0xb8/0x230 [Mon Oct 10 18:22:24 2022] __napi_poll+0x2a/0x150 [Mon Oct 10 18:22:24 2022] net_rx_action+0x250/0x2a0 [Mon Oct 10 18:22:24 2022] __do_softirq+0xcf/0x286 [Mon Oct 10 18:22:24 2022] irq_exit_rcu+0x99/0xc0 [Mon Oct 10 18:22:24 2022] common_interrupt+0x80/0xa0 [Mon Oct 10 18:22:24 2022] [Mon Oct 10 18:22:24 2022] [Mon Oct 10 18:22:24 2022] asm_common_interrupt+0x21/0x40 [Mon Oct 10 18:22:24 2022] RIP: 0010:cpuidle_enter_state+0xc7/0x350 [Mon Oct 10 18:22:24 2022] Code: 8b 3d f5 e1 9b 4d e8 08 bb a7 ff 49 89 c5 0f 1f 44 00 00 31 ff e8 09 c9 a7 ff 45 84 ff 0f 85 fe 00 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 0a 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d [Mon Oct 10 18:22:24 2022] RSP: 0018:ffffa2d38c527ea8 EFLAGS: 00000246 [Mon Oct 10 18:22:24 2022] RAX: ffff8948b9bac100 RBX: 0000000000000003 RCX: 00000000ffffffff [Mon Oct 10 18:22:24 2022] RDX: 0000000000000006 RSI: 0000000000000006 RDI: 0000000000000000 [Mon Oct 10 18:22:24 2022] RBP: ffff8948b9bb6000 R08: 0000043f38b90644 R09: 0000043f6c0b1df3 [Mon Oct 10 18:22:24 2022] R10: 0000000000000014 R11: 0000000000000008 R12: ffffffffb3bbd7e0 [Mon Oct 10 18:22:24 2022] R13: 0000043f38b90644 R14: 0000000000000003 R15: 0000000000000000 [Mon Oct 10 18:22:24 2022] ? 
cpuidle_enter_state+0xb7/0x350 [Mon Oct 10 18:22:24 2022] cpuidle_enter+0x29/0x40 [Mon Oct 10 18:22:24 2022] do_idle+0x1e9/0x280 [Mon Oct 10 18:22:24 2022] cpu_startup_entry+0x19/0x20 [Mon Oct 10 18:22:24 2022] secondary_startup_64_no_verify+0xc2/0xcb [Mon Oct 10 18:22:24 2022] [Mon Oct 10 18:22:24 2022] Modules linked in: xt_CT ip_set_hash_net ip_set vxlan cls_bpf sch_ingress veth xt_comment xt_mark xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink nls_ascii nls_cp437 vfat fat mousedev intel_rapl_msr intel_rapl_common psmouse evdev i2c_piix4 i2c_core button sch_fq_codel fuse configfs ext4 crc16 mbcache jbd2 dm_verity dm_bufio aesni_intel nvme nvme_core libaes crypto_simd ena cryptd t10_pi crc_t10dif crct10dif_generic crct10dif_common btrfs blake2b_generic zstd_compress lzo_compress raid6_pq libcrc32c crc32c_generic crc32c_intel dm_mirror dm_region_hash dm_log dm_mod qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi br_netfilter bridge scsi_transport_iscsi stp llc overlay scsi_mod scsi_common [Mon Oct 10 18:22:24 2022] ---[ end trace 86a2732b8f4d0b13 ]--- ```
jepio commented 2 years ago

Thanks @seh, these kinds of logs are enough to start a discussion on lkml. I'll start a thread. Just to be sure I have all the facts straight: this is using ENA?

seh commented 2 years ago

If by ENA you mean Elastic Network Adapter, then I think the answer is yes. We didn't do anything deliberate to choose that, but running modinfo ena shows that the module is installed.
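Seeing the module installed does not by itself show that eth0 is bound to it. As a minimal sketch of a more direct check, assuming the interface is named eth0 and that `ethtool` is available on the host, the `key: value` lines printed by `ethtool -i` can be parsed for the `driver` field. The helper names and the sample text below are hypothetical and for illustration only; the sample was not captured from the machines in this thread.

```python
import subprocess


def parse_ethtool_info(ethtool_i_output):
    """Parse the 'key: value' lines printed by `ethtool -i <iface>` into a dict."""
    info = {}
    for line in ethtool_i_output.splitlines():
        # Split on the first colon only, so values such as PCI addresses stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def uses_ena(ethtool_i_output):
    """Return True if the reported driver is the Elastic Network Adapter driver."""
    return parse_ethtool_info(ethtool_i_output).get("driver") == "ena"


def driver_info_for(iface="eth0"):
    """Run `ethtool -i` for the given interface (assumes ethtool is installed)."""
    result = subprocess.run(
        ["ethtool", "-i", iface], capture_output=True, text=True, check=True
    )
    return parse_ethtool_info(result.stdout)


# Illustrative sample only; field values are made up.
SAMPLE = """driver: ena
version: 5.15.63-flatcar
bus-info: 0000:00:05.0
"""
```

On a live host, `driver_info_for("eth0").get("driver")` would report the bound driver name.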

seh commented 2 years ago

My colleague @nbourikas disabled panic upon softlockup and was able to capture a more detailed failure trace atop kernel version 5.15.70 and Calico version 3.21.5.

dmesg output ``` [Tue Oct 11 22:44:47 2022] ------------[ cut here ]------------ [Tue Oct 11 22:44:47 2022] kernel BUG at net/core/skbuff.c:4218! [Tue Oct 11 22:44:47 2022] invalid opcode: 0000 [#1] SMP PTI [Tue Oct 11 22:44:47 2022] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.15.70-flatcar #1 [Tue Oct 11 22:44:47 2022] Hardware name: Amazon EC2 z1d.12xlarge/, BIOS 1.0 10/16/2017 [Tue Oct 11 22:44:47 2022] RIP: 0010:skb_segment+0xc71/0xe50 [Tue Oct 11 22:44:47 2022] Code: ab 01 00 00 49 8b 97 c0 00 00 00 49 8b 8f c8 00 00 00 45 89 6f 70 48 29 d1 89 c8 44 01 e9 41 89 8f b8 00 00 00 e9 60 fe ff ff <0f> 0b 48 8b 5c 24 60 8b 7c 24 28 4c 89 7b 08 85 ff 0f 84 ac 00 00 [Tue Oct 11 22:44:47 2022] RSP: 0018:ffffb1f34c7ac818 EFLAGS: 00010246 [Tue Oct 11 22:44:47 2022] RAX: ffff9d5f890ff6c0 RBX: ffff9d61bafa6b00 RCX: 0000000000000000 [Tue Oct 11 22:44:47 2022] RDX: ffff9d5901bc6900 RSI: ffff9d61bafa6ac0 RDI: ffffffffffffffff [Tue Oct 11 22:44:47 2022] RBP: ffffb1f34c7ac8e8 R08: 0000000000008196 R09: 0000000000000000 [Tue Oct 11 22:44:47 2022] R10: 0000000000008286 R11: 0000000000008290 R12: 0000000000000005 [Tue Oct 11 22:44:47 2022] R13: 0000000000008286 R14: ffff9d5901bc6c00 R15: ffff9d581730ac00 [Tue Oct 11 22:44:47 2022] FS: 0000000000000000(0000) GS:ffff9d85f9bc0000(0000) knlGS:0000000000000000 [Tue Oct 11 22:44:47 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Tue Oct 11 22:44:47 2022] CR2: 000000c0149cd000 CR3: 0000004ef7c0a001 CR4: 00000000007706e0 [Tue Oct 11 22:44:47 2022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [Tue Oct 11 22:44:47 2022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [Tue Oct 11 22:44:47 2022] PKRU: 55555554 [Tue Oct 11 22:44:47 2022] Call Trace: [Tue Oct 11 22:44:47 2022] [Tue Oct 11 22:44:47 2022] ? csum_block_add_ext+0x20/0x20 [Tue Oct 11 22:44:47 2022] ? reqsk_fastopen_remove+0x190/0x190 [Tue Oct 11 22:44:47 2022] tcp_gso_segment+0xec/0x500 [Tue Oct 11 22:44:47 2022] ? 
bpf_prog_2e6f5613f50238c5_calico_to_host_ep+0xa40/0x2cc8 [Tue Oct 11 22:44:47 2022] inet_gso_segment+0x15e/0x3e0 [Tue Oct 11 22:44:47 2022] skb_mac_gso_segment+0x9a/0x110 [Tue Oct 11 22:44:47 2022] __skb_gso_segment+0xb2/0x160 [Tue Oct 11 22:44:47 2022] ? netif_skb_features+0x9c/0x2d0 [Tue Oct 11 22:44:47 2022] validate_xmit_skb.constprop.0+0x137/0x2b0 [Tue Oct 11 22:44:47 2022] validate_xmit_skb_list+0x41/0x70 [Tue Oct 11 22:44:47 2022] sch_direct_xmit+0x11c/0x250 [Tue Oct 11 22:44:47 2022] __dev_queue_xmit+0x8f0/0xb70 [Tue Oct 11 22:44:47 2022] ? nf_ct_deliver_cached_events+0x6c/0x90 [nf_conntrack] [Tue Oct 11 22:44:47 2022] ip_finish_output2+0x274/0x540 [Tue Oct 11 22:44:47 2022] ? xt_compat_flush_offsets+0x14/0x70 [Tue Oct 11 22:44:47 2022] ? skb_gso_validate_network_len+0x11/0x80 [Tue Oct 11 22:44:47 2022] ? __ip_finish_output+0xe9/0x1a0 [Tue Oct 11 22:44:47 2022] ip_sublist_rcv_finish+0x6b/0x70 [Tue Oct 11 22:44:47 2022] ip_sublist_rcv+0x16e/0x1f0 [Tue Oct 11 22:44:47 2022] ? ip_sublist_rcv+0x1f0/0x1f0 [Tue Oct 11 22:44:47 2022] ip_list_rcv+0xf8/0x120 [Tue Oct 11 22:44:47 2022] __netif_receive_skb_list_core+0x224/0x250 [Tue Oct 11 22:44:47 2022] netif_receive_skb_list_internal+0x194/0x2b0 [Tue Oct 11 22:44:47 2022] ? inet_gro_complete+0xae/0xf0 [Tue Oct 11 22:44:47 2022] napi_gro_complete.constprop.0.isra.0+0x112/0x170 [Tue Oct 11 22:44:47 2022] dev_gro_receive+0x2d2/0x690 [Tue Oct 11 22:44:47 2022] napi_gro_receive+0x62/0x1d0 [Tue Oct 11 22:44:47 2022] 0xffffffffc0497687 [Tue Oct 11 22:44:47 2022] ? ip_local_deliver_finish+0x49/0x60 [Tue Oct 11 22:44:47 2022] ? 
__netif_receive_skb_one_core+0x8b/0xa0 [Tue Oct 11 22:44:47 2022] __napi_poll+0x2a/0x150 [Tue Oct 11 22:44:47 2022] net_rx_action+0x250/0x2a0 [Tue Oct 11 22:44:47 2022] __do_softirq+0xd0/0x286 [Tue Oct 11 22:44:47 2022] irq_exit_rcu+0x99/0xc0 [Tue Oct 11 22:44:47 2022] common_interrupt+0x80/0xa0 [Tue Oct 11 22:44:47 2022] [Tue Oct 11 22:44:47 2022] [Tue Oct 11 22:44:47 2022] asm_common_interrupt+0x22/0x40 [Tue Oct 11 22:44:47 2022] RIP: 0010:cpuidle_enter_state+0xc7/0x350 [Tue Oct 11 22:44:47 2022] Code: 8b 3d 05 5a 9c 5e e8 38 20 a8 ff 49 89 c5 0f 1f 44 00 00 31 ff e8 39 2e a8 ff 45 84 ff 0f 85 fe 00 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 0a 01 00 00 49 63 d6 4c 2b 2c 24 48 8d 04 52 48 8d [Tue Oct 11 22:44:47 2022] RSP: 0018:ffffb1f34c52fea8 EFLAGS: 00000246 [Tue Oct 11 22:44:47 2022] RAX: ffff9d85f9bec100 RBX: 0000000000000003 RCX: 00000000ffffffff [Tue Oct 11 22:44:47 2022] RDX: 0000000000000006 RSI: ffffffffa62c41c0 RDI: 0000000000000000 [Tue Oct 11 22:44:47 2022] RBP: ffff9d85f9bf6000 R08: 0000021df8b89a47 R09: 0000021e84f874f3 [Tue Oct 11 22:44:47 2022] R10: 0000000000000017 R11: 000000000000000b R12: ffffffffa2dbd720 [Tue Oct 11 22:44:47 2022] R13: 0000021df8b89a47 R14: 0000000000000003 R15: 0000000000000000 [Tue Oct 11 22:44:47 2022] ? 
cpuidle_enter_state+0xb7/0x350 [Tue Oct 11 22:44:47 2022] cpuidle_enter+0x29/0x40 [Tue Oct 11 22:44:47 2022] do_idle+0x1e0/0x270 [Tue Oct 11 22:44:47 2022] cpu_startup_entry+0x19/0x20 [Tue Oct 11 22:44:47 2022] secondary_startup_64_no_verify+0xc2/0xcb [Tue Oct 11 22:44:47 2022] [Tue Oct 11 22:44:47 2022] Modules linked in: xt_CT ip_set_hash_net ip_set vxlan cls_bpf sch_ingress veth xt_comment xt_mark xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink nls_ascii nls_cp437 vfat fat mousedev intel_rapl_msr intel_rapl_common psmouse evdev i2c_piix4 i2c_core button sch_fq_codel fuse configfs ext4 crc16 mbcache jbd2 dm_verity dm_bufio nvme aesni_intel nvme_core libaes ena crypto_simd cryptd t10_pi crc_t10dif crct10dif_generic crct10dif_common btrfs blake2b_generic xor zstd_compress lzo_compress raid6_pq libcrc32c crc32c_generic crc32c_intel dm_mirror dm_region_hash dm_log dm_mod qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi br_netfilter bridge scsi_transport_iscsi stp llc overlay scsi_mod scsi_common [Tue Oct 11 22:44:47 2022] ---[ end trace 4c1f3c2045158b27 ]--- [Tue Oct 11 22:44:47 2022] RIP: 0010:skb_segment+0xc71/0xe50 [Tue Oct 11 22:44:47 2022] Code: ab 01 00 00 49 8b 97 c0 00 00 00 49 8b 8f c8 00 00 00 45 89 6f 70 48 29 d1 89 c8 44 01 e9 41 89 8f b8 00 00 00 e9 60 fe ff ff <0f> 0b 48 8b 5c 24 60 8b 7c 24 28 4c 89 7b 08 85 ff 0f 84 ac 00 00 ```

@tomastigera and @fasaxc, note this frame in that call stack, in between tcp_gso_segment and inet_gso_segment:

bpf_prog_2e6f5613f50238c5_calico_to_host_ep+0xa40/0x2cc8
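For anyone sifting through many such traces, here is a small sketch that pulls eBPF program frames out of a call trace. It assumes the frames follow the `bpf_prog_<16 hex digits>_<name>+0x<offset>/0x<size>` shape seen in the frame above; the function name `bpf_frames` is hypothetical.

```python
import re

# Matches frames shaped like the one quoted above:
#   bpf_prog_<16 hex digits>_<program name>+0x<offset>/0x<size>
BPF_FRAME = re.compile(r"bpf_prog_([0-9a-f]{16})_(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def bpf_frames(trace):
    """Return (tag, program_name) pairs for every eBPF frame found in the trace text."""
    return [(m.group(1), m.group(2)) for m in BPF_FRAME.finditer(trace)]


frame = "? bpf_prog_2e6f5613f50238c5_calico_to_host_ep+0xa40/0x2cc8"
print(bpf_frames(frame))  # [('2e6f5613f50238c5', 'calico_to_host_ep')]
```

Feeding it the full dmesg text rather than a single line would list every eBPF program that appears in the stack.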

jepio commented 2 years ago

@seh, would you be able to test with calico 3.23? This PR https://github.com/projectcalico/calico/pull/5753/files makes calico stop changing gso_size on vxlan decapsulation, which lkml suggests might be the cause (https://lore.kernel.org/netdev/194f6b02-8ee7-b5d7-58f3-6a83b5ff275d@gmail.com/).

seh commented 2 years ago

Thank you for the suggestion. Yes, we've been testing Calico version 3.23.3 over the last couple of days together with Flatcar's beta version 3346.1.0. So far, we haven't been hitting this kernel bug. I'll have more confidence after another day or two of testing.

seh commented 2 years ago

Apparently our testing did not tell the full story. It was a late one last night.

We've now seen this same kernel failure occur using Calico 3.23.2 with both Flatcar Container Linux 3346.1.0 (kernel version 5.15.70) and Ubuntu 22.04.1 ("Jammy Jellyfish") (kernel version 5.15.0). The line number in file skbuff.c moves by one, from 4218 to 4217, in the Ubuntu image. Disabling GRO and GSO again alleviates the rebooting problem for the moment, though still at great cost to network performance.
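For readers who want to apply the same workaround, here is a rough sketch of how the offload toggle could be expressed, assuming the interface is eth0 and that `ethtool -K` is used to change the features (the thread does not say which tool we used). The helper name `offload_command` is hypothetical, and the block only builds and prints the command rather than running it.

```python
import shlex


def offload_command(iface, enable, features=("gro", "gso")):
    """Build an `ethtool -K` command that turns the given offload features on or off."""
    state = "on" if enable else "off"
    cmd = ["ethtool", "-K", iface]
    for feature in features:
        cmd += [feature, state]
    return cmd


# Print the command that would disable both features on eth0.
print(shlex.join(offload_command("eth0", enable=False)))
# ethtool -K eth0 gro off gso off
```

Passing the returned list to `subprocess.run(..., check=True)` as root would apply the change; the setting does not persist across reboots unless it is reapplied at boot.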

That confirms for us that the problem is not specific to Flatcar Container Linux, but it does seem to be related to Calico's eBPF data plane.

vojtechDB commented 2 years ago

Same issue here: kernel BUG at net/core/skbuff.c:4082 on Red Hat Enterprise Linux release 8.6 (Ootpa) with kernel 4.18.0-372.26.1.el8_6.x86_64.

Calico's eBPF data plane is enabled. I agree with you, @seh.

seh commented 2 years ago

Just to make sure we're following along on this side, did you all see Jiri's candidate patch that he mentioned in https://github.com/projectcalico/calico/issues/6865#issuecomment-1286936333?

vojtechDB commented 2 years ago

With Jiri's candidate patch, I'm no longer able to reproduce the issue on kernel 4.18.0-372.26.1.el8_6.BZ_2136229_test_V1.x86_64.

vojtechDB commented 2 years ago

The fix has been proposed upstream: https://patchwork.kernel.org/project/netdevbpf/patch/559cea869928e169240d74c386735f3f95beca32.1666858629.git.jbenc@redhat.com/

pothos commented 2 years ago

@jepio has built patched images: https://bincache.flatcar-linux.net/images/amd64/3346.1.99+issue-378-fix/ and @seh is testing them. They may also be of interest to others following this issue.

seh commented 2 years ago

So far, after five hours running with both GRO and GSO enabled, the machine (EC2 instance of type "z1d.12xlarge") has not crashed yet. Another machine running Flatcar Container Linux beta version 3346.1.0 and the same configuration otherwise (same EC2 instance type, same AZ, same workload) fails at least twice every hour.

jepio commented 2 years ago

The patch is queued up in netdev/next; as soon as it lands in Linus's tree it can be submitted to stable. https://lore.kernel.org/netdev/166753501670.4086.1819802414418539212.git-patchwork-notify@kernel.org/#t

seh commented 1 year ago

I see that the patch is present along Linux's "master" branch and is tagged with "v6.1-rc5" as of three days ago.

jepio commented 1 year ago

This patch is in 5.15.79, which is in beta as of yesterday (3417.1.0).

@seh, want to verify and then we'll close this issue at last?
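As a quick aid for that verification, here is a minimal sketch that compares a `uname -r` release string against 5.15.79, the 5.15-series version named above. The function names are hypothetical. The check is only meaningful for kernels whose version string tracks upstream 5.15 stable releases; distribution kernels that backport patches (such as the RHEL test build mentioned earlier) cannot be judged this way, so the function returns `None` for anything outside the 5.15 series.

```python
import re

# First 5.15-series stable release reported in this thread to carry the patch.
FIXED_IN_5_15 = (5, 15, 79)


def parse_release(release):
    """Extract (major, minor, patch) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_fix_5_15(release):
    """Return True/False for 5.15-series kernels, or None when the series differs."""
    version = parse_release(release)
    if version[:2] != (5, 15):
        return None  # This sketch makes no claim about other kernel series.
    return version >= FIXED_IN_5_15


print(has_fix_5_15("5.15.79-flatcar"))  # True
print(has_fix_5_15("5.15.70-flatcar"))  # False
```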

seh commented 1 year ago

We've been using this fix for about six weeks now without noticing any of these failures occurring. I consider this problem to be fixed. Thank you for all of your help with this one. It was quite a journey.