linuxkit / kubernetes

minimal and immutable Kubernetes images built with LinuxKit
Apache License 2.0

Kernel panic with "BUG: unable to handle kernel NULL pointer dereference at (null)" #80

Closed leoh0 closed 6 years ago

leoh0 commented 6 years ago

Description

A kernel panic occurs when two nodes connect to each other over VXLAN.

Steps to reproduce the issue:

Follow the steps in the README section booting-and-initialising-os-images:

$ make all

# boot master
$ ./boot.sh

# login to master
$ ./ssh_into_kubelet.sh 192.168.65.11 
# launch master
linuxkit-025000000009:/# kubeadm-init.sh

# join to master
$ ./boot.sh 1 192.168.65.11:6443 --token 43gcdz.40q62te3f1xprg7r --discovery-token-ca-cert-hash sha256:d79e5239a534ae0296410e0fdfa532664b92c65eda5dad7c2924d3ad05cb7313

# and kernel panic occurs

Describe the results you received:

Once the VXLAN connection between the nodes is established, the kernel panics.

[  197.502285] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  197.503112] IP:           (null)
[  197.503630] PGD 8000000035950067 P4D 8000000035950067 PUD 3598e067 PMD 0
[  197.504438] Oops: 0010 [#1] SMP PTI
[  197.504815] Modules linked in: dummy vport_vxlan openvswitch xfrm_user xfrm_algo
[  197.505716] CPU: 0 PID: 2246 Comm: weaver Not tainted 4.14.32-linuxkit #1
[  197.506486] Hardware name:   BHYVE, BIOS 1.00 03/14/2014
[  197.507043] task: ffff9a3bb7715040 task.stack: ffffaeb081c64000
[  197.507862] RIP: 0010:          (null)
[  197.508239] RSP: 0018:ffffaeb081c67788 EFLAGS: 00010286
[  197.508784] RAX: ffffffff9d83a080 RBX: ffff9a3bb8214000 RCX: 00000000000005aa
[  197.509473] RDX: ffff9a3bb54b4200 RSI: 0000000000000000 RDI: ffff9a3bb5480400
[  197.510234] RBP: ffffaeb081c67880 R08: 0000000000000006 R09: 0000000000000002
[  197.510980] R10: 0000000000000000 R11: ffff9a3bb5469300 R12: ffff9a3bb54b4200
[  197.511763] R13: ffff9a3bb54804a8 R14: ffff9a3bb5469300 R15: ffff9a3bb8214040
[  197.512323] FS:  0000000003795880(0000) GS:ffff9a3bbe600000(0000) knlGS:0000000000000000
[  197.513038] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  197.513666] CR2: 0000000000000000 CR3: 00000000380b2004 CR4: 00000000000606b0
[  197.514417] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  197.515142] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  197.515865] Call Trace:
[  197.516147]  ? vxlan_xmit_one+0x4af/0x837
[  197.516568]  ? vxlan_xmit+0xb2f/0xb5a
[  197.516969]  ? vxlan_xmit+0xb2f/0xb5a
[  197.517427]  ? skb_network_protocol+0x55/0xb3
[  197.517884]  ? dev_hard_start_xmit+0xd0/0x194
[  197.518441]  ? dev_hard_start_xmit+0xd0/0x194
[  197.518938]  ? __dev_queue_xmit+0x47c/0x5c4
[  197.519437]  ? do_execute_actions+0x99/0x1069 [openvswitch]
[  197.519952]  ? do_execute_actions+0x99/0x1069 [openvswitch]
[  197.520454]  ? slab_post_alloc_hook.isra.52+0xa/0x1a
[  197.520999]  ? __kmalloc+0xc1/0xd3
[  197.521371]  ? ovs_execute_actions+0x77/0xfd [openvswitch]
[  197.521897]  ? ovs_execute_actions+0x77/0xfd [openvswitch]
[  197.522469]  ? ovs_packet_cmd_execute+0x1bb/0x230 [openvswitch]
[  197.523063]  ? genl_family_rcv_msg+0x2db/0x349
[  197.523433]  ? genl_rcv_msg+0x4e/0x69
[  197.523787]  ? genlmsg_multicast_allns+0xf1/0xf1
[  197.524209]  ? netlink_rcv_skb+0x97/0xe8
[  197.524700]  ? genl_rcv+0x24/0x31
[  197.525137]  ? netlink_unicast+0x11a/0x1b5
[  197.525676]  ? netlink_sendmsg+0x2e2/0x308
[  197.526192]  ? sock_sendmsg+0x2d/0x3c
[  197.526754]  ? SYSC_sendto+0xfc/0x138
[  197.527233]  ? do_syscall_64+0x69/0x79
[  197.527681]  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  197.528326] Code:  Bad RIP value.
[  197.528660] RIP:           (null) RSP: ffffaeb081c67788
[  197.529281] CR2: 0000000000000000
[  197.529634] ---[ end trace 391521052893e451 ]---
[  197.533425] Kernel panic - not syncing: Fatal exception in interrupt
[  197.534602] Kernel Offset: 0x1b000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  197.538715] Rebooting in 10 seconds..
[  207.586797] ACPI MEMORY or I/O RESET_REG.
FATA[0211] Cannot run hyperkit: exit status 2

Describe the results you expected:

The node joins the master without a kernel panic.

Additional information you deem important (e.g. issue happens only occasionally):

I tested this and found that commit linuxkit/linux@6a3c946, which is included in the 4.14.40 kernel, fixes the problem.
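
To check whether a given node is still on an affected kernel, something like the following could be run inside the node. It is only a sketch: `oops_kernel_release` and `kernel_at_least` are hypothetical helper names, not part of linuxkit, and the comparison assumes a 4.14.y kernel whose release string starts with numeric `major.minor.patch` (as in `4.14.32-linuxkit` from the trace above). Other kernel series may or may not carry the fix, so the 4.14.40 threshold says nothing about them.

```shell
#!/bin/sh
# Hypothetical helpers, not part of linuxkit.

# Print the kernel release found on the "CPU: ... <release> #N" line of an
# oops log read from stdin (first match only).
oops_kernel_release() {
    sed -n 's/.*CPU: .* \([^ ]*\) #[0-9].*/\1/p' | head -n 1
}

# Succeed if kernel release $1 is >= minimum version $2.
# Anything after the first "-" (e.g. "-linuxkit") is ignored, and missing
# components are treated as 0. Assumes the remaining fields are numeric.
kernel_at_least() {
    have="${1%%-*}.0.0"
    want="${2%%-*}.0.0"
    for field in 1 2 3; do
        h=$(printf '%s\n' "$have" | cut -d. -f"$field")
        w=$(printf '%s\n' "$want" | cut -d. -f"$field")
        if [ "$h" -gt "$w" ]; then return 0; fi
        if [ "$h" -lt "$w" ]; then return 1; fi
    done
    return 0
}

# Compare the running kernel against 4.14.40, the version reported above
# as containing the fix.
if kernel_at_least "$(uname -r)" 4.14.40; then
    echo "running kernel is at or above 4.14.40"
else
    echo "running kernel is older than 4.14.40"
fi
```

With the console output saved to a file (the name `console.log` is an assumption), `kernel_at_least "$(oops_kernel_release < console.log)" 4.14.40` applies the same check to the kernel that produced the trace.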