Closed Weichen81 closed 7 years ago
Can one of the admins verify this patch?
This looks weird. We call hyper_parse_setup_routes before hyper_setup_route. If parse routes returns zero route rules, hyperstart will not try to call hyper_setup_route, and if it returns a non-zero number of route rules, the memory for rt should already be allocated, so no null pointer dereference should occur...
@gao-feng Because the latest runv does not work properly with kvmtool, I had to rebase runv to commit: 164d08db0f7358c9a011e282217a0d30e7791ea3
I can reproduce this panic every time. The command line I used is: runv --driver kvmtool --kernel kernel_aarch64 --initrd hyper-initrd.img --debug run {name}
The following are logs: runv.ubuntu.root.log.INFO.20171026-104552.25756.log runv.ubuntu.root.log.INFO.20171026-104553.25828.log runv.ubuntu.root.log.INFO.20171026-104706.26193.log runv.ubuntu.root.log.INFO.20171026-104707.26265.log runv.ubuntu.root.log.INFO.20171026-104817.26317.log
@Weichen81 Sorry I cannot reproduce this panic...
The logs show that the hyper kernel panics when setting up the route, so there must be something wrong with runv/hyperstart. Can you add more logging in hyperstart to trace the root cause of the null memory reference? I could not pinpoint the root cause after reviewing the hyperstart code...
Thanks!
@gao-feng Hi Feng, after some debugging, I found that this bug is caused by uninitialized memory. When hyperstart cannot find any route in the JSON, r_num is never assigned a value, yet the function still returns 0. On my test server, r_num ends up holding a random non-zero value, so hyper_setup_route is invoked and the guest kernel crashes.
Regards, Wei Chen
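The failure mode described above can be illustrated with a minimal sketch. This is not the hyperstart code: `demo_route`, `demo_parse_routes`, and the `found` parameter (standing in for the number of route entries present in the JSON) are hypothetical names. The point is only that a parser with out-parameters should define them on every return path, including the "nothing found" path.

```c
#include <stdlib.h>

/* Hypothetical route record; not hyperstart's actual struct. */
struct demo_route {
    const char *dest;
    const char *gateway;
};

/*
 * Hypothetical parser. `found` stands in for the number of route
 * entries present in the JSON. Returns 0 on success and -1 on
 * allocation failure.
 */
static int demo_parse_routes(struct demo_route **rts, unsigned int *num,
                             unsigned int found)
{
    /* Define both outputs before any early return. */
    *rts = NULL;
    *num = 0;

    if (found == 0)
        return 0;   /* no routes: success, outputs stay NULL and 0 */

    *rts = calloc(found, sizeof(**rts));
    if (*rts == NULL)
        return -1;

    *num = found;
    return 0;
}
```

With the two assignments at the top, a caller that passes in uninitialized stack variables sees NULL and 0 after an empty parse instead of whatever happened to be on the stack.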
oops, this is an obvious bug, thanks for this fix!
When running runv without containerd with kvmtool, there is no network support, but hyperstart still receives a SETUPROUTE command. In this case the route list is empty. We did not check for this case, so a NULL pointer dereference crashed the guest kernel.
hyper_ctlmsg_handle SETUPROUTE
init[1]: unhandled level 2 translation fault (11) at 0x00000000, esr 0x92000006
pgd = ffffffc005767000
[00000000] pgd=0000000085769003, pud=0000000085769003 , *pmd=0000000000000000
CPU: 0 PID: 1 Comm: init Not tainted 4.9.36 #3
Hardware name: linux,dummy-virt (DT)
task: ffffffc00744ad00 task.stack: ffffffc00744c000
PC is at 0x406ba8
LR is at 0x4079f0
pc : [<0000000000406ba8>] lr : [<00000000004079f0>] pstate: 60000000
sp : 0000007ffec98cb0
x29: 0000007ffec98cb0 x28: 0000000000000000
x27: 000000000042c000 x26: 000000002f2131c0
x25: 0000000000000015 x24: 0000000000416000
x23: 000000000042c000 x22: 0000007ffec99170
x21: 000000000042c000 x20: 000000000042c000
x19: 0000000000000000 x18: 0000000000000001
x17: 0000007f95058988 x16: 000000000042c2a8
x15: 0000000000000001 x14: 0000000000000003
x13: 0000000000417b58 x12: 00000000ffffffff
x11: 000000000000000a x10: 0000000000000000
x9 : 0000000000000001 x8 : 00000000ffffffff
x7 : 0000000000000002 x6 : 000000002f2131d0
x5 : 0000000000000001 x4 : 0000000000000001
x3 : 0000000000000000 x2 : 000000000042c448
x1 : 000000000042c640 x0 : 0000000000000000
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Just add the empty route list check to fix this bug.
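For illustration, an empty route list check of the kind described here could look roughly like the sketch below. It is not the actual patch: `sketch_route` and `sketch_setup_routes` are hypothetical names, and the loop body only counts usable entries where the real code would program each route.

```c
#include <stddef.h>

/* Hypothetical route record; not hyperstart's actual struct. */
struct sketch_route {
    const char *dest;
    const char *gateway;
    const char *device;
};

/*
 * Hypothetical setup routine. Returns the number of routes it would
 * apply. The guard at the top makes an empty or missing list a no-op
 * instead of a NULL dereference.
 */
static int sketch_setup_routes(const struct sketch_route *rts, unsigned int num)
{
    unsigned int i;
    int applied = 0;

    /* Empty route list: nothing to set up. */
    if (rts == NULL || num == 0)
        return 0;

    for (i = 0; i < num; i++) {
        if (rts[i].dest == NULL)
            continue;   /* skip incomplete entries */
        applied++;      /* the real code would program the route here */
    }
    return applied;
}
```

Checking both the pointer and the count means the function stays safe even when a caller passes a stale non-zero count alongside a NULL list.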