Open jennymankin opened 5 years ago
I'd like to take some more time to understand how would this fit with the kvm-ioctls and higher level crates before doing an in-depth review.
P.S. Please sign your commits. (git commit -s). Can you also add some more details to the commit description?
No problem @andreeaflorescu. If you're interested, I have this poc branch of porting kvm-ioctls to use this Vcpu crate and it is passing the tests: https://github.com/jennymankin/kvm-ioctls/tree/poc-vcpu (apologies for the poc quality/commenting out etc). As you can see the changes are minimal.
I will address your other comments asap.
So we discussed this at the rust-vmm ptg meetup. After talking for a couple hours on the subject of this trait, we don't think the various hypervisors (e.g. hyper-v, kvm) are similar enough to gain much benefit to this trait. Specifically, the VcpuExit structure is so different that any code making use of it would have to specialize for each trait impl, defeating the benefits. We would like to see a concrete example showing benefits of a Vcpu trait, perhaps firecracker or crosvm supporting to different hypervisors using the same code. Without that, my disposition would be to no accept this PR. An alternative would be to pick and choose Vcpu related code to break out into its own crate. For example, a hypervisor agnostic CPUID crate would give lots of benefits without having to agree on a common Vcpu trait.
Hi @zachreizner, thanks for summarizing your discussion at the PTG meetup since I wasn't able to make it. I'm sure over the couple hours of discussion you covered many aspects that led to this conclusion. Here are some of my thoughts.
You are certainly right in that the differences in the VcpuExit structure (due to the underlying vCPU exits exposed by each hypervisor) make it such that any code making use of the run()
function would need to specialize its processing of the exits based on hypervisor. This would need to either be accomplished directly at the layer performing the vCPU.run()
, or might be itself abstracted within a higher-level crate. For example, a hypervisor-agnostic VM
crate might utilize the trait generic (with VMM-specific references providing implementation of those operations). See, for example, the proposed issue to provide additional abstractions of a VM
and a VMM
that makes use of abstracted vCPU
functionality.
Getting crosvm/Firecracker to achieve parity with Hyper-V in addition to KVM is an ambitious goal, and it's true that doing so will require more layers than just swapping in a vCPU implementation of a generic trait. The specifics of what this would look like is something we'd like to look at, and focusing on/POCing the handling of the VcpuExit
is a good suggestion.
Stepping back from these more-ambitious goals, I think the vCPU crate still offers opportunity for abstraction for common VMM-related operations in higher-level crates that utilize common vCPU functionality. The arch
crate comes to mind. In development of the Hyper-V-based libwhp
crate, some of the arch
functionality had to be duplicated, stripped of KVM-specific objects and APIs, and imported as a separate libwhp
-specific crate. The duplication was one of the motivations behind my proposal of the arch
crate here for rust-vmm: it naturally lends itself to a hypervisor-agnostic solution that can be easily imported into different VMM projects. And as we discussed a couple weeks ago on the rust-vmm call, since those APIs accept the Vcpu trait generic as an input parameter, there is "zero cost" to the abstraction due to the static dispatch.
That is one example where the generic abstraction provided by the vCPU crate benefits other hypervisor-agnostic crates; I think it's reasonable to assume others exist. For example, we are also currently researching and developing a Windows loader crate; this makes use of these same vCPU APIs and abstraction implementations to set up the VM.
So independent of our goals to achieve interchangeable VMMs in ambitious projects like crosvm and Firecracker, I think that having generic crates abstracting lower-level functionality provides benefits to smaller-scale projects, like those that might be using rust-vmm crates as building blocks to their own VMMs.
Hi, @zachreizner
I am a little bit confused here. Why do you think different VcpuExit structure will defeat the abstraction benefit? Per my thought, the VcpuExit abstraction should be a set of all hypervisors exit reasons. The concrete Vcpu implementation should handle the exit reasons it cares. The differences are encapsulated into the concrete hypervisor crate, e.g kvm-ioctls. It doesn's affect upper layers. The only disadvantage is there are redundant exit reasons for specific hypervisor. But it should not be a big case I think.
Because we statically set concrete hypervisor in upper layer, there should be zero cost with Hypervisor/Vm/Vcpu abstractions and make the upper layer codes be generic and elegant. This brings much benefit to make the upper layer crates be platform portable. So far, we only have KVM/HyperV requirements. But there may be more and more hypervisors in the future.
Responding to @yisun-git: a union of the VcpuExit enum for all hypevisors would be inappropriate because some of the vcpu exits would actually lead to contradictions depending on the hypervisor one was using. For example, HyperV has an exit for CPUID that the VMM can use to fill in CPUID registers while KVM has no such exit, opting for a VM wide KVM_SET_CPUID
call. If I were writing crosvm/firecracker to use the proposed union VcpuExit, I would need to both the set the CPUID (a no-op on HyperV) and respond to the CPUID exit (an impossibility in KVM) or implement this particular feature in the vmm-vcpu crate, which would make this crate more high-level than intended. Having thought this trough, I can't see it providing benefits.
Responding to @jennymankin: I think we are on the same page in that there is plenty of arch related code that we would like shared, but disagree on the best way to achieve that goal. A Vcpu
trait at first seems obvious, but the more we went through the mental exercise of understanding how it might be used, the more holes we found. Hypervisors are too different to really make the Vcpu
trait work well. A better way to achieve the goal of sharing arch code is for the arch crate itself to export types that are consumed by hypervisor implementations. Using the setup_fpu
example:
// arch crate
struct Fpu { ... }
pub fn setup_fpu() -> Fpu {
Fpu {
fcw: 0x37f,
mxcsr: 0x1f80,
..Default::default()
};
}
// vmm binary using kvm (e.g. crosvm/firecracker)
pub fn setup_vcpu(vcpu: &KvmVcpu) -> Result<()> {
let fpu = arch::setup_fpu();
vcpu.setup_fpu(fpu)
}
// vmm binary using hyperv
pub fn setup_vcpu(vcpu: &HyperVcpu) -> Result<()> {
let fpu = arch::setup_fpu();
vcpu.setup_fpu(fpu)
}
The nice thing about this is it more cleanly separates the logic of setting up the Fpu
data structure in arch
from the concrete business of programming the hypervisor in an actual vmm implementation. This also means that it's not necessary for us to agree on a constellation of traits that each hypervisor must use, but don't really fit.
Hi @zachreizner, thanks for detailing the proposed alternative. I wanted to take some time to think through your suggestion.
What you're suggesting does make sense for a lot of the arch functionality, in which the goal of the function is to configure some stuff (state, registers, etc), and then set it on the vCPU. That can be cleanly ported to have the function return the configured structure back to the vCPU, and it can do the actual setting. In this sense, the arch crate provides glorified structure definitions and struct value setting, for a specific struct.
But I think this design breaks down for more complex functions, which might perform more than one operation on the vCPU (eg, get a value from the vCPU, manipulate it, set the new value). It moves the burden of managing each of these operations to each vCPU implementation itself. (Granted, the current arch crate is pretty small, but we could see more helper arch-related functionality being added in the future.) Not to mention each vCPU implementation gets cluttered with even the simple functions (like the setup_fpu
you cited), having to implement common functionality that could be easily abstracted behind a generic type. I just don't see the benefit of this, when the abstraction comes so easily (especially on KVM because it's already using the same data types). And since (virtual) CPU and architecture are directly related, I don't think there's inelegance to having these architecture-specific vCPU "helpers" take a generic reference to a vCPU.
And from the WHP side, having done the implementation of the trait, the effort isn't in finding similar vCPU functionality between different hypervisors--it's pretty easy to identify the core functionality shared between even these two very different hypervisors (where additional/non-overlapped functionality can be implemented separately). They get and set registers, state, MSRs, etc. The effort is in the translation between data structure types, which in this case, is a burden that falls on the WHP side (since we're standardizing on kvm data structures). This translation effort is present regardless of which approach is used; it's either done ad-hoc for each arch-level implementation function in the vCPU, or it's done once in primitive operations (like get_regs/set_regs) with the higher-level setup functions taking advantage of this translation functionality.
(And I know I'm leaning on the arch crate as example here, but I think it's a good example of a higher-level crate that provides useful functionality by consuming lower-level functionality.)
Secondarily, we (along with Cloudbase) will continue to work through the abstraction and whether or not it provides benefit for full VMM solutions like Firecracker/Crosvm.
This PR is for a simple crate that provides a trait abstraction of a virtual CPU (vCPU).
Each VMM can then provide their own implementation of the vCPU trait (for example, I have a implementation of the
Vcpu
trait for theVcpuFd
struct ofrust-vmm/kvm-ioctls
; when this PR is approved, I can make any last minute changes/tests to that POC and put it up for PR as well).Issue/design here