clearcontainers / runtime

OCI (Open Containers Initiative) compatible runtime using Virtual Machines
Apache License 2.0

Documentation: Add low level debug method to our documentation #558

Open mcastelino opened 6 years ago

mcastelino commented 6 years ago

How to debug QEMU or Kernel issues for Clear Containers

Manual command line for a Clear Containers Q35 launch. This lets you debug without needing the entire Kubernetes or Docker framework.

If you have Clear Containers installed, you can get the kernel and rootfs image from /usr/share/clearcontainers.
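Since the commands below mount the root filesystem rw and map the image with share, it may be safer to work on copies rather than on the installed files. A minimal sketch, assuming the file names used in the commands in this post (vmlinux.container, vmlinuz.container, clear-containers.img) are what your install provides; stage_assets is a hypothetical helper name, not part of Clear Containers:

```shell
# Copy the guest kernel(s) and rootfs image into a scratch directory.
# The file names are assumptions taken from the QEMU commands in this post.
stage_assets() {
    src=${1:-/usr/share/clearcontainers}   # install location named above
    dst=${2:-.}                            # scratch directory to debug from
    mkdir -p "$dst" || return 1
    for f in vmlinux.container vmlinuz.container clear-containers.img; do
        if [ -e "$src/$f" ]; then
            # -L follows symlinks so the copy is a regular file
            cp -L "$src/$f" "$dst/$f" || return 1
        else
            echo "missing: $src/$f" >&2
        fi
    done
}

# Example: stage_assets /usr/share/clearcontainers ./cc-debug
```

Run the QEMU commands from the scratch directory so the relative paths (./vmlinux.container and so on) resolve to the copies.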

For the q35 machine type

./x86_64-softmmu/qemu-system-x86_64 \
     -machine q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit,nofw \
     -m 256,maxmem=512M,slots=2 \
     -smp 2 \
     -nodefaults -rtc base=utc,driftfix=slew \
     -global kvm-pit.lost_tick_policy=discard \
     -kernel ./vmlinux.container  \
     -append "reboot=k panic=1 rw tsc=reliable no_timer_check noreplace-smp root=/dev/pmem0p1 init=/usr/lib/systemd/systemd initcall_debug rootfstype=ext4 rootflags=dax,data=ordered dhcp rcupdate.rcu_expedited=1 clocksource=kvm-clock console=hvc0 single iommu=false pci=lastbus=0 nivablecore=20G debug" \
     -device virtio-serial-pci,id=virtio-serial0 \
     -chardev stdio,id=charconsole0 \
     -device virtconsole,chardev=charconsole0,id=console0 \
     -nographic \
     -object memory-backend-file,id=mem0,share,mem-path=./clear-containers.img,size=235929600 \
     -device nvdimm,memdev=mem0,id=nv0 -no-reboot
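The size=235929600 value passed to memory-backend-file is 225 MiB expressed in bytes. Assuming it is meant to match the byte size of clear-containers.img, a different image would need a different value. A small sketch for deriving it; nvdimm_size is a hypothetical helper name:

```shell
# Print the byte size of an image file, for use as the size= value of
# -object memory-backend-file. Assumes size= should equal the file size.
nvdimm_size() {
    # wc -c reads the byte count; tr strips the padding some wc builds add
    wc -c < "$1" | tr -d ' '
}

# Example: size=$(nvdimm_size ./clear-containers.img)
#          ... -object memory-backend-file,id=mem0,share,mem-path=./clear-containers.img,size=$size
```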

For the pc machine type

Note the use of the compressed kernel (vmlinuz.container) in this case.

./x86_64-softmmu/qemu-system-x86_64 -machine pc,accel=kvm,kernel_irqchip,nvdimm \
-m 256,maxmem=512M,slots=2 -smp 2 -nodefaults -rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=discard \
-kernel ./vmlinuz.container  \
-append "reboot=k panic=1 rw tsc=reliable no_timer_check noreplace-smp root=/dev/pmem0p1 init=/usr/lib/systemd/systemd initcall_debug rootfstype=ext4 rootflags=dax,data=ordered dhcp rcupdate.rcu_expedited=1 clocksource=kvm-clock console=hvc0 single iommu=false pci=lastbus=0 nivablecore=20G debug" \
-device virtio-serial-pci,id=virtio-serial0 -chardev stdio,id=charconsole0 \
-device virtconsole,chardev=charconsole0,id=console0 -nographic \
-object memory-backend-file,id=mem0,share,mem-path=./clear-containers.img,size=235929600 \
-device nvdimm,memdev=mem0,id=nv0 -no-reboot

Tracing QEMU/KVM interactions

If the console log does not provide enough information, then it is time to trace the KVM and QEMU interactions.

Details on how to do that can be found here

https://gist.github.com/mcastelino/b31f0648707b25478eb2a44f94a861fd

QEMU Console debug

If that also does not suffice, then you need to introspect the VM using the QEMU monitor.

The easiest way to reach the monitor is over telnet.

Append the following to the QEMU command line:

-monitor telnet:127.0.0.1:1235,server,nowait

Now you can telnet to access the monitor

telnet 127.0.0.1 1235
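If port 1235 is already taken, the monitor can listen elsewhere. The sketch below builds the option for a chosen port and refuses 1234, which is the port QEMU's -s flag uses for its gdb server. monitor_opt is a hypothetical helper name:

```shell
# Print a -monitor option for the given TCP port (default 1235, as above).
monitor_opt() {
    port=${1:-1235}
    case $port in
        ''|*[!0-9]*)
            echo "port must be numeric" >&2
            return 1 ;;
        1234)
            # -s is shorthand for -gdb tcp::1234, so keep that port free
            echo "port 1234 is the default gdb port used by -s" >&2
            return 1 ;;
    esac
    printf '%s\n' "-monitor telnet:127.0.0.1:${port},server,nowait"
}

# Example: append $(monitor_opt 4444) to the QEMU command line,
#          then connect with: telnet 127.0.0.1 4444
```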

The QEMU monitor provides a variety of info commands that give machine context. The full list can be obtained by running the info command on its own.

The following are normally the most useful:

info pci
info tlb
info mem
info cpus
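x (examine guest virtual memory, for example x/10i $eip to dump 10 instructions at the current instruction pointer)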

GDB

If you are still in a pickle, then it's time to gdb the guest kernel. Ideally you want access to the QEMU console at the same time.

Starting QEMU with gdb enabled

Append

-s 

to the QEMU command line. This starts QEMU with its gdb server listening on TCP port 1234, but does not stop the execution of the kernel.

You can now attach gdb to the running kernel (which will stop execution at the time you attach):

gdb ./vmlinux.container
(gdb) target remote localhost:1234

Note: vmlinux is the uncompressed kernel.

Note: If you start with the option "-s -S", then the kernel will be started in the stopped state.
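For a quick look at where the guest is without an interactive session, gdb can run a fixed list of commands in batch mode. A minimal sketch, assuming the default port 1234 from -s; gdb_snapshot and the DRY_RUN variable are hypothetical names used only for illustration:

```shell
# Attach to the QEMU gdb server, dump registers and the next few
# instructions, then detach. Set DRY_RUN=1 to print the command instead.
gdb_snapshot() {
    kernel=${1:-./vmlinux.container}
    port=${2:-1234}
    # Detach explicitly: quitting gdb while still attached may terminate QEMU.
    set -- gdb -batch \
        -ex "target remote localhost:${port}" \
        -ex "info registers" \
        -ex "x/10i \$pc" \
        -ex "detach" \
        "$kernel"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Example: gdb_snapshot ./vmlinux.container 1234
```

If QEMU was started with "-s -S", the guest stays stopped until gdb issues continue, so an interactive session is the better fit in that case.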

Mapping instructions to the code

The Clear Containers kernel does not have CONFIG_DEBUG_INFO=y in its .config, so gdb cannot map addresses to source lines. For quick and dirty mapping, disassemble the kernel to get a code listing:

objdump -D vmlinux.container

This is helpful for figuring out where an issue may be.
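The full listing is large, so it can help to first find which function an address falls in. The sketch below picks the nearest preceding symbol from sorted nm output. It assumes vmlinux.container still carries a symbol table (not confirmed here) and that the address is given as 16 lowercase hex digits without a 0x prefix, matching nm's format on a 64-bit kernel. nearest_symbol is a hypothetical helper name:

```shell
# Read `nm -n` output on stdin and print the last symbol whose address
# is <= the given address. Fixed-width lowercase hex sorts correctly as
# text, so plain string comparison is enough.
nearest_symbol() {
    awk -v addr="$1" '
        # Appending "" forces string comparison; lines for undefined
        # symbols have only two fields and are skipped.
        NF == 3 && ($1 "") <= (addr "") { sym = $3 }
        END { if (sym != "") print sym }
    '
}

# Example (the address is a placeholder):
#   nm -n vmlinux.container | nearest_symbol ffffffff81000150
```

Once the function is known, objdump's --start-address and --stop-address options can limit the disassembly to that range.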

grahamwhaley commented 6 years ago

Excellent info - it's going to be great to get this (and other info) translated into a 'how to debug' reference .md file. Note for whoever ends up doing the translation - please pull the information out of the gist @mcastelino references into the document.