hyperhq / hyperstart

The tiny Init service for HyperContainer
https://www.hypercontainer.io
Apache License 2.0
134 stars, 63 forks

Fix guest kernel crash when hyperstart handles SETUPROUTE #342

Closed: Weichen81 closed this 7 years ago

Weichen81 commented 7 years ago

When running runv with kvmtool and without containerd, there is no network support, but hyperstart still receives a SETUPROUTE command. In this case the route list is empty. We did not check for this case, so a NULL pointer dereference crashed the guest kernel.

hyper_ctlmsg_handle SETUPROUTE
init[1]: unhandled level 2 translation fault (11) at 0x00000000, esr 0x92000006
pgd = ffffffc005767000
[00000000] pgd=0000000085769003, pud=0000000085769003, *pmd=0000000000000000
CPU: 0 PID: 1 Comm: init Not tainted 4.9.36 #3
Hardware name: linux,dummy-virt (DT)
task: ffffffc00744ad00 task.stack: ffffffc00744c000
PC is at 0x406ba8
LR is at 0x4079f0
pc : [<0000000000406ba8>] lr : [<00000000004079f0>] pstate: 60000000
sp : 0000007ffec98cb0
x29: 0000007ffec98cb0 x28: 0000000000000000
x27: 000000000042c000 x26: 000000002f2131c0
x25: 0000000000000015 x24: 0000000000416000
x23: 000000000042c000 x22: 0000007ffec99170
x21: 000000000042c000 x20: 000000000042c000
x19: 0000000000000000 x18: 0000000000000001
x17: 0000007f95058988 x16: 000000000042c2a8
x15: 0000000000000001 x14: 0000000000000003
x13: 0000000000417b58 x12: 00000000ffffffff
x11: 000000000000000a x10: 0000000000000000
x9 : 0000000000000001 x8 : 00000000ffffffff
x7 : 0000000000000002 x6 : 000000002f2131d0
x5 : 0000000000000001 x4 : 0000000000000001
x3 : 0000000000000000 x2 : 000000000042c448
x1 : 000000000042c640 x0 : 0000000000000000
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

This patch simply adds an empty-route-list check to fix the bug.
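For illustration only, a guard of this kind might look roughly like the sketch below. This is not the actual patch: `struct route` and `setup_routes` are hypothetical names invented here, and the real hyperstart structures and function signatures may differ. The idea is just that a missing or empty list is treated as "nothing to do" before any entry is dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parsed route entry; not hyperstart's real struct. */
struct route {
    const char *dest;
    const char *gw;
    const char *device;
};

/*
 * Hypothetical route setup routine.
 * Returns the number of routes handled, 0 when there is nothing to do,
 * or -1 if an entry is invalid.
 */
int setup_routes(const struct route *routes, size_t count)
{
    size_t i;
    int applied = 0;

    /* Guard: a NULL or empty list is a no-op, never a dereference. */
    if (routes == NULL || count == 0)
        return 0;

    for (i = 0; i < count; i++) {
        /* Assumption for this sketch: every route must name a device. */
        if (routes[i].device == NULL)
            return -1;
        /* A real implementation would install the route here. */
        applied++;
    }

    /* Every entry was either counted or caused an early return. */
    assert((size_t)applied == count);
    return applied;
}
```

With this guard, calling `setup_routes(NULL, 5)` returns 0 instead of faulting, which also covers the case where the count is garbage but no list was ever allocated.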

Jimmy-Xu commented 7 years ago

Can one of the admins verify this patch?

gao-feng commented 7 years ago

This looks weird. We call hyper_parse_setup_routes before hyper_setup_route. If parsing reports that the number of route rules is zero, hyperstart will not call hyper_setup_route; if it reports a non-zero number of route rules, the memory for rt should already be allocated, so no NULL pointer dereference should occur...

Weichen81 commented 7 years ago

@gao-feng Because the latest runv does not work properly with kvmtool, I had to rebase runv to commit: 164d08db0f7358c9a011e282217a0d30e7791ea3

I can reproduce this panic every time. The command line I used is: runv --driver kvmtool --kernel kernel_aarch64 --initrd hyper-initrd.img --debug run {name}

The following are the logs:
runv.ubuntu.root.log.INFO.20171026-104552.25756.log
runv.ubuntu.root.log.INFO.20171026-104553.25828.log
runv.ubuntu.root.log.INFO.20171026-104706.26193.log
runv.ubuntu.root.log.INFO.20171026-104707.26265.log
runv.ubuntu.root.log.INFO.20171026-104817.26317.log

gao-feng commented 7 years ago

@Weichen81 Sorry I cannot reproduce this panic...

The logs show that the hyper kernel panics while setting up the route, so there must be something wrong in runv/hyperstart. Can you add more logging in hyperstart to trace the root cause of the NULL memory reference? I could not pinpoint the root cause after reviewing the hyperstart code...

Thanks!

Weichen81 commented 7 years ago

@gao-feng Hi Feng, after some debugging, I found that this bug is caused by uninitialized memory. When hyperstart cannot find any routes in the JSON, r_num is never assigned a value, yet the function still returns 0. On my test server, r_num ends up holding a random non-zero number, so hyper_setup_route is invoked and the guest kernel crashes.
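The failure mode described here can be sketched in a few lines of C. This is only an illustration under stated assumptions: `parse_routes` and `should_setup_routes` are hypothetical names, the input is a plain string array rather than hyperstart's JSON, and only `r_num` is taken from the discussion. The point is that an output parameter must be given a defined value on every return path, including the "nothing found" one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical parser: counts the non-empty entries in `items`.
 * NOT hyperstart's JSON parser; it only mirrors the shape of the bug.
 * Returns 0 on success and writes the count to *r_num.
 */
int parse_routes(const char *const *items, size_t n_items, unsigned int *r_num)
{
    size_t i;
    unsigned int found = 0;

    assert(r_num != NULL);

    /* The fix: define the output before any early return. */
    *r_num = 0;

    /* Without the assignment above, *r_num would keep stale data here. */
    if (items == NULL || n_items == 0)
        return 0;

    for (i = 0; i < n_items; i++) {
        if (items[i] != NULL && items[i][0] != '\0')
            found++;
    }

    *r_num = found;
    return 0;
}

/* Hypothetical caller: returns 1 if route setup should run, 0 otherwise. */
int should_setup_routes(const char *const *items, size_t n_items)
{
    /* Simulate stale stack contents to mimic the reported behaviour. */
    unsigned int r_num = 0xdeadbeefu;

    if (parse_routes(items, n_items, &r_num) != 0)
        return 0;

    return r_num > 0;
}
```

If the `*r_num = 0;` line is removed, `should_setup_routes(NULL, 0)` would see the stale non-zero value and report that setup should run, which matches the crash path described above.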

Regards, Wei Chen

gao-feng commented 7 years ago

oops, this is an obvious bug, thanks for this fix!