AlexAltea commented 6 years ago

Opening this issue as follow-up to the discussion in qemu-devel: https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00030.html

Summarizing: Guest debugging is extremely relevant to debugging bootloaders/microkernels, or in my case, kernels that do not include debugging backends. Aside from the intrinsic value of this feature, it's also really beneficial to developers/researchers, as neither WHPX (Windows) nor HVF (macOS) supports guest debugging, and while they remain closed-source this probably won't change.

This week I've started experimenting with guest debugging support on HAXM, but it's still too early to submit any patches (plus, this should probably be coordinated with QEMU somehow). These are the key ideas (inspired by KVM):

Adding a HAX_VCPU_IOCTL_DEBUG ioctl (i.e. QEMU-to-HAXM) with a structure containing:
- Flags to enable/disable debugging and specific features: This is meant to manage the guest state accordingly: e.g. if guest debugging via hardware breakpoints is enabled we should protect the guest drN values in the exit_dr_access handler. Additionally, to know where to redirect #BP, e.g. if software breakpoints are enabled forward to QEMU, otherwise forward to VM.
- Values for debugging registers: dr0, dr1, dr2, dr3, dr6, dr7: They already are a self-contained way of describing hardware breakpoints, so no need to wrap that information in another layer of abstraction.
Adding a HAX_EXIT_DEBUG exit (i.e. HAXM-to-QEMU) with a structure containing:
- Cause: Software breakpoint, hardware breakpoint, singlestep.
- Value of: rip: For software breakpoint events.
- Value of: dr6, dr7: For hardware breakpoint events.
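For illustration, the two structures described above might be laid out roughly as follows. This is only a sketch: every name, flag value and field width here (`debug_request`, `debug_exit`, `DEBUG_USE_SW_BP`, and so on) is hypothetical and not taken from the actual patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control flags; names and bit positions are illustrative. */
enum {
    DEBUG_ENABLE    = 1u << 0, /* master switch for guest debugging */
    DEBUG_STEP      = 1u << 1, /* single-step the guest */
    DEBUG_USE_SW_BP = 1u << 2, /* forward #BP to QEMU instead of the guest */
    DEBUG_USE_HW_BP = 1u << 3  /* load and protect the guest drN values */
};

/* QEMU-to-HAXM: payload of the proposed HAX_VCPU_IOCTL_DEBUG. */
struct debug_request {
    uint32_t control; /* combination of the DEBUG_* flags above */
    uint32_t pad;     /* keeps dr[] 8-byte aligned */
    uint64_t dr[8];   /* indexed by register number; only 0-3, 6, 7 are used */
};

/* Cause values for the proposed HAX_EXIT_DEBUG. */
enum debug_cause { CAUSE_SW_BP, CAUSE_HW_BP, CAUSE_SINGLESTEP };

/* HAXM-to-QEMU: payload of the proposed HAX_EXIT_DEBUG. */
struct debug_exit {
    uint32_t cause; /* one of enum debug_cause */
    uint32_t pad;
    uint64_t rip;   /* meaningful for software breakpoint events */
    uint64_t dr6;   /* meaningful for hardware breakpoint events */
    uint64_t dr7;
};

/* Decide where a guest #BP should be redirected: returns 1 for QEMU,
 * 0 for the VM's own handler. */
int bp_goes_to_qemu(const struct debug_request *req)
{
    assert(req != NULL);
    return (req->control & DEBUG_ENABLE) && (req->control & DEBUG_USE_SW_BP);
}
```

Passing the raw dr0-dr7 values keeps the interface thin, at the cost of QEMU having to know the x86 DR7 encoding.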
Software breakpoints are managed by QEMU as an array of objects containing an address and the original byte at that instruction before patching int 3h (0xCC). Its lifecycle is:
1. GDB: Sends break * command.
2. QEMU: Creates/initializes hax_sw_breakpoint object and patches instruction with int 3h.
3. HAXM: Runs the virtual machine, and returns to QEMU after hitting #BP.
4. QEMU: Returns control to GDB.
5. GDB: Sends continue command.
6. QEMU: Reverts the instruction patch and single-steps.
7. HAXM: Executes one instruction and returns.
8. QEMU: Reapplies the instruction patch and resumes execution.
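The patch/revert part of this lifecycle (steps 2, 6 and 8) can be sketched against a plain byte buffer standing in for guest memory. The `sw_breakpoint` structure and both helpers are hypothetical names for illustration, not the code in the linked branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

/* Hypothetical record: the patched address plus the byte it replaced. */
struct sw_breakpoint {
    size_t  addr;
    uint8_t saved_byte;
    int     active;
};

/* Steps 2 and 8: remember the original byte, then patch in int 3h. */
void sw_bp_insert(struct sw_breakpoint *bp, uint8_t *mem, size_t mem_size,
                  size_t addr)
{
    assert(addr < mem_size);
    bp->addr = addr;
    bp->saved_byte = mem[addr];
    bp->active = 1;
    mem[addr] = INT3_OPCODE;
}

/* Step 6: restore the original byte so the instruction can be stepped. */
void sw_bp_remove(struct sw_breakpoint *bp, uint8_t *mem, size_t mem_size)
{
    assert(bp->active && bp->addr < mem_size);
    mem[bp->addr] = bp->saved_byte;
    bp->active = 0;
}
```

In a real setup the buffer accesses would go through whatever guest-memory accessors QEMU provides; the buffer here only keeps the sketch self-contained.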
Hardware breakpoints are specified by directly writing into the guest drN register through the previously described ioctl.

Current experiments can be found in https://github.com/AlexAltea/orbital-qemu/commit/77d61c71800106ca15d6eb63d29327e95cd546fd (QEMU) and https://github.com/AlexAltea/haxm/tree/debug (HAXM). I'm facing some issues still:

- Hardware breakpoints have no effect.
- Software breakpoints trigger a triple fault, not a #BP.

Disclaimer: I haven't much experience with debuggers, so any feedback will be helpful.

raphaelning commented 6 years ago

Thanks! This is a very nice feature to have, and your summary really helps me understand how breakpoints work.

> I'm facing some issues still:
>
> Hardware breakpoints have no effect. Software breakpoints trigger a triple fault, not a #BP.

#BP doesn't really cause a VM exit by default. There is a VMCS exception bitmap per CPU that controls which CPU exceptions should be handled by the host/hypervisor and which by the guest (see Intel SDM Vol. 3C: 25.2 Other Causes of VM Exits). So I believe you need to (conditionally) set the bits for `EXC_DEBUG` and `EXC_BREAK_POINT` [sic]:

https://github.com/intel/haxm/blob/e23a8dd04cc8c458517b7bd164b429706d7875d5/core/vcpu.c#L1187

and handle them properly in exit_exc_nmi() (core/vcpu.c).
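As a rough sketch of the "conditionally set the bits" part: #DB is exception vector 1 and #BP is vector 3, and each bit in the exception bitmap corresponds to one vector. The helper names and the two boolean parameters below are hypothetical, not HAXM's actual interface.

```c
#include <assert.h>
#include <stdint.h>

#define VECTOR_DB 1 /* #DB, debug exception */
#define VECTOR_BP 3 /* #BP, breakpoint exception (int 3h) */

/* Mark one exception vector as causing a VM exit. */
uint32_t intercept_vector(uint32_t bitmap, unsigned vector)
{
    assert(vector < 32);
    return bitmap | (UINT32_C(1) << vector);
}

/* Add the debug-related bits to an existing exception bitmap.
 * #DB is intercepted whenever debugging is on (hardware breakpoints and
 * single-stepping both raise it); #BP only when QEMU owns software
 * breakpoints, so that a guest debugger's own int 3h still reaches the guest. */
uint32_t debug_exception_bitmap(uint32_t base, int debug_enabled, int use_sw_bp)
{
    if (!debug_enabled)
        return base;
    base = intercept_vector(base, VECTOR_DB);
    if (use_sw_bp)
        base = intercept_vector(base, VECTOR_BP);
    return base;
}
```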

In addition, if you search for "debug" in Intel SDM Vol. 3C, there are a lot more details about how hardware (VT) can help enable guest debugging support. 32.2: Virtualization Support for Debugging Facilities gives a good summary, and I've also made my own reading list:

- 24.4.2 Guest Non-Register State: The pending debug exceptions field in VMCS (GUEST_PENDING_DBE), plus 26.6.3 Delivery of Pending Debug Exceptions after VM Entry.
- 24.7.1 VM-Exit Controls: The save debug controls flag of the VM-exit controls field in VMCS (EXIT_CONTROL_SAVE_DEBUG_CONTROLS).
- 24.8.1 VM-Entry Controls: The load debug controls flag of the VM-entry controls field in VMCS (ENTRY_CONTROL_LOAD_DEBUG_CONTROLS).
- 27.2.1 Table 27-1: Exit Qualification for Debug Exceptions.

Ideally, we should support both the QEMU GDB (for debugging the guest itself) and GDB running in the guest (for debugging an app that runs in guest user space). I need to read more to understand how we can achieve this.

AlexAltea commented 6 years ago

Thank you, indeed I forgot to enable #DB and #BP in the exception bitmap. After doing so, software breakpoints work perfectly, although hardware breakpoints are still getting ignored.

This might be caused by wrong drN values: I've noticed the VMCS structure only offers a GUEST_DR7 member, but what about dr0, dr1, dr2, dr3, dr6? Do I need to load/save host/guest drN's manually? I've noticed the vcpu_set_regs handler allows updating such registers in vcpu->state->_dr*: https://github.com/intel/haxm/blob/e9f8c8908735f7b6e5ffa73ce643459ba5e8546b/core/vcpu.c#L3695-L3700

Similarly, exit_dr_access also allows changing the vcpu->state->_dr* registers.

However, I have not found anywhere in the codebase a mechanism to load such values into the guest debug registers... I'm assuming this is unimplemented: Should we include a dr_dirty flag in vcpu_t and call set_dr* accordingly right before doing a VM-enter?
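To make the dr_dirty question concrete, here is a minimal sketch of the idea. Everything in it is hypothetical: `vcpu_sketch` stands in for vcpu_t, and the `fake_hw_dr` array stands in for the physical registers that the set_dr* helpers would write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of vcpu_t. */
struct vcpu_sketch {
    uint64_t dr[8]; /* shadow copy of the guest debug registers */
    int dr_dirty;   /* set when the shadow differs from the hardware */
};

uint64_t fake_hw_dr[8]; /* stands in for the physical DR0-DR7 */
unsigned hw_writes;     /* counts how often the registers were reloaded */

/* Called from the register-setting paths: update the shadow, defer the load. */
void vcpu_write_dr(struct vcpu_sketch *v, unsigned n, uint64_t value)
{
    assert(n < 8);
    v->dr[n] = value;
    v->dr_dirty = 1;
}

/* Called right before VM entry: reload DR0-DR3 only if something changed.
 * DR7 is not written here because the VMCS GUEST_DR7 field covers it, and
 * DR6 is left out since its handling is still an open question. */
void vcpu_load_dr_before_entry(struct vcpu_sketch *v)
{
    if (!v->dr_dirty)
        return;
    for (unsigned i = 0; i < 4; i++)
        fake_hw_dr[i] = v->dr[i];
    hw_writes++;
    v->dr_dirty = 0;
}
```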

raphaelning commented 6 years ago

> software breakpoints work perfectly

Great, congrats!

Here's my understanding of how hardware breakpoints roughly work:

1. GDB asks QEMU to insert a HW BP (I haven't checked this part).
2. QEMU prepares the data to be written to guest DR{i, 7} registers based on BP information, where i is one of {0, 1, 2, 3}, and DR7 is the Debug Control Register.
3. QEMU passes this data to hypervisor by invoking an ioctl on each vCPU.
4. At the next VM entry of each vCPU, hypervisor loads DR{i, 7} of the current host CPU with the data specified by QEMU.
5. One of the host CPUs hits the HW BP and takes a VM exit (instead of invoking the guest #DB exception handler).
6. Hypervisor returns to QEMU with information about the HW BP being triggered, including the contents of guest DR{6, 7}, where DR6 is the Debug Status Register.
7. Using the information provided by hypervisor, especially guest DR{6, 7}, QEMU identifies the GDB BP that corresponds to the HW BP.
8. QEMU returns control to GDB.
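The "QEMU prepares the data" step comes down to encoding the breakpoint into DR7 next to the address in DR{i}. A sketch of that encoding, following the architectural DR7 layout (enable bits in bits 0-7, a 4-bit R/W plus LEN field per slot starting at bit 16); the function and enum names are made up for illustration, and using the global rather than the local enable bit is an assumption:

```c
#include <assert.h>
#include <stdint.h>

/* R/W field values in DR7. */
enum bp_type { BP_EXEC = 0, BP_WRITE = 1, BP_IO = 2, BP_ACCESS = 3 };
/* LEN field values in DR7 (note the encoding is not monotonic). */
enum bp_len { BP_LEN_1 = 0, BP_LEN_2 = 1, BP_LEN_8 = 2, BP_LEN_4 = 3 };

/* Return dr7 with breakpoint slot i (0-3) enabled for the given type/length. */
uint64_t dr7_enable(uint64_t dr7, unsigned i, enum bp_type type, enum bp_len len)
{
    assert(i < 4);
    /* Instruction breakpoints must use a length of one byte. */
    assert(type != BP_EXEC || len == BP_LEN_1);

    unsigned shift = 16 + 4 * i;             /* start of R/W<i>, LEN<i> */
    dr7 &= ~(UINT64_C(0xF) << shift);        /* clear the old R/W and LEN */
    dr7 |= (uint64_t)type << shift;          /* R/W<i> */
    dr7 |= (uint64_t)len << (shift + 2);     /* LEN<i> */
    dr7 |= UINT64_C(1) << (2 * i + 1);       /* G<i>: global enable */
    return dr7;
}
```

For example, an execution breakpoint in slot 0 only sets G0, while a 4-byte write watchpoint in slot 1 sets G1 together with R/W1 = 01b and LEN1 = 11b.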

Based on this:

- exit_dr_access is probably irrelevant now, since the workflow doesn't depend on guest reading/writing DR* itself. It will be when we want to support running GDB from within the guest.
- Although vcpu_set_regs (i.e. the HAX_VCPU_SET_REGS ioctl) can be used to sync DR from QEMU to HAXM, it also syncs other registers, and therefore is not suitable for step 3 above. (But it may be useful for initializing DR at vCPU reset.) Moreover, for step 3, we don't really need all of DR*, but only DR{i, 7}, so we should use the new, dedicated ioctl.
- For step 4, as you said, VT already helps us load DR7, but we still need to take care of DR{0, 1, 2, 3} (I'm not sure about DR6 yet). Note that:
  - We only need to load DR{i} if the corresponding HW BP is enabled.
  - We don't want the guest DR* to affect the host--what if a malicious user tries to set a breakpoint in HAXM code? I guess if the host DR7 doesn't enable DR{i}, it should be safe to load DR{i} at any time before the VM entry.
    So is it possible for the host DR to be "active" while the user tries to debug the guest? E.g., a QEMU developer wants to debug QEMU GDB itself? If so, we do need to save/restore host DR.
- For step 5, the VM exit handler can obtain guest DR7 from VMCS, and guest DR6 from Exit Qualification for Debug Exceptions.
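For the last point, the exit qualification for debug exceptions reports B0-B3 in bits 0-3, BD in bit 13 and BS in bit 14, i.e. the same positions those flags have in DR6. A sketch of how an exit handler might classify the event from it; the enum and function names are hypothetical, and checking single-step before the breakpoint bits is an arbitrary choice for the case where both are set:

```c
#include <assert.h>
#include <stdint.h>

#define DBG_B_MASK UINT64_C(0xF)       /* B0-B3: breakpoint condition hit */
#define DBG_BD     (UINT64_C(1) << 13) /* debug register access detected */
#define DBG_BS     (UINT64_C(1) << 14) /* single step */

enum dbg_event { DBG_EVENT_NONE, DBG_EVENT_HW_BP, DBG_EVENT_SINGLESTEP };

/* Classify a #DB VM exit from its exit qualification. */
enum dbg_event classify_debug_exit(uint64_t qualification)
{
    if (qualification & DBG_BS)
        return DBG_EVENT_SINGLESTEP;
    if (qualification & DBG_B_MASK)
        return DBG_EVENT_HW_BP;
    return DBG_EVENT_NONE;
}

/* Did breakpoint slot i (0-3) report a hit? */
int bp_slot_hit(uint64_t qualification, unsigned i)
{
    assert(i < 4);
    return (int)((qualification >> i) & 1);
}

/* Index of the lowest breakpoint slot that hit, or -1 if none did. */
int first_hit_index(uint64_t qualification)
{
    for (unsigned i = 0; i < 4; i++)
        if (bp_slot_hit(qualification, i))
            return (int)i;
    return -1;
}
```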

AlexAltea commented 6 years ago

Thanks for your detailed overview on hardware breakpoints, I've added the missing features on https://github.com/AlexAltea/haxm/commit/f2808d8fa83754d2305e2df199e564069b659f39. Hardware breakpoints are now successfully triggered:

Hardware assisted breakpoint 1 at 0xffffffff825805b0

Thread 1 hit Breakpoint 1, 0xffffffff825805b0 in ?? ()
(gdb)

The only thing missing now is single-stepping, which for some reason triggers a triple fault. I'm trying to figure out why this happens, without much success so far. As soon as I'm finished, I'll submit a PR for this feature.

> So is it possible for the host DR to be "active" while the user tries to debug the guest? E.g., a QEMU developer wants to debug QEMU GDB itself? If so, we do need to save/restore host DR.

I think that issue cannot be solved by HAXM. As soon as such a QEMU developer uses hardware breakpoints, the corresponding dr7 bits will be enabled, but before entering the guest VM, the guest dr0-dr3 registers are restored. What if the host instructions executed right afterwards are mapped at the same addresses that any of the guest hardware breakpoints point to?

This could only be fixed at hardware-level by adding GUEST_DR0 to GUEST_DR3 to the VMCS fields. Meanwhile the only approach to debugging QEMU/HAXM, while debugging a virtual machine, is using hardware breakpoints only in either host or guest, and using software breakpoints everywhere else.

> [...] guest DR6 from Exit Qualification for Debug Exceptions.

Thank you, that was helpful, and it does indeed cover the dr6 register.

raphaelning commented 6 years ago

> the only approach to debugging QEMU/HAXM, while debugging a virtual machine, is using hardware breakpoints only in either host or guest, and using software breakpoints everywhere else.

This sounds reasonable. How does GDB choose between inserting a hardware BP and inserting a software BP? If we can't rely on it to make the right decision in the hypothetical scenario, should we check host DR7 before restoring guest DR{0, 1, 2, 3}?
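A sketch of what such a check could look like: DR7 bits 0-7 hold the local/global enable bits for the four slots, so a host DR7 with any of them set means the host is actively using hardware breakpoints. The `dr_plan` enum and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DR7_ENABLE_MASK UINT64_C(0xFF) /* L0/G0 .. L3/G3 */

/* Is breakpoint slot i (0-3) enabled, locally or globally, in this DR7? */
int dr7_slot_enabled(uint64_t dr7, unsigned i)
{
    assert(i < 4);
    return ((dr7 >> (2 * i)) & 3) != 0;
}

enum dr_plan {
    DR_SKIP,               /* guest uses no HW BP: leave DR0-DR3 alone */
    DR_LOAD,               /* host DRs are idle: just load the guest values */
    DR_SAVE_HOST_THEN_LOAD /* host DRs are live: save them, restore on exit */
};

/* Decide how to treat DR0-DR3 before a VM entry. */
enum dr_plan plan_guest_dr_load(uint64_t host_dr7, uint64_t guest_dr7)
{
    if ((guest_dr7 & DR7_ENABLE_MASK) == 0)
        return DR_SKIP;
    if ((host_dr7 & DR7_ENABLE_MASK) != 0)
        return DR_SAVE_HOST_THEN_LOAD;
    return DR_LOAD;
}
```

Whether the save/restore path is worth implementing depends on how likely the "debugging QEMU while debugging the guest" scenario really is.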

> The only thing missing now is single-stepping, which for some reasons triggers a triple-fault. [...] As soon as I'm finished, I'll submit a PR for this feature.

Cool. I just skimmed through your code and spotted a couple of typos--not sure if they are actually related to the triple fault:

https://github.com/AlexAltea/haxm/blob/f2808d8fa83754d2305e2df199e564069b659f39/core/vcpu.c#L1377

https://github.com/AlexAltea/haxm/blob/f2808d8fa83754d2305e2df199e564069b659f39/core/vcpu.c#L1403

In both cases set_dr3() should be changed to set_dr6().

AlexAltea commented 6 years ago

> How does GDB choose between inserting a hardware BP and inserting a software BP?

That's decided by the command entered by the user: break for software breakpoints, hbreak for hardware breakpoints.

> Cool. I just skimmed through your code and spotted a couple of typos--not sure if they are actually related to the triple fault:

Thanks! Seems unrelated to the triple-fault, but worth fixing anyway. :-)

AlexAltea commented 5 years ago

This issue should be closed since #81 was merged.

intel / haxm

Guest debugging support #66