rust-vmm / community


Crate Addition Request: vmm-vcpu #40

Open jennymankin opened 5 years ago

jennymankin commented 5 years ago

Crate Name

vmm-vcpu

Short Description

A crate to provide a hypervisor-agnostic interface to common Virtual-CPU functionality

Why is this crate relevant to the rust-vmm project?

Regardless of hypervisor, container technologies utilize a virtual CPU, and the functions of a VCPU tend to be shared across the hypervisors used. For example, VCPUs require functions to get and set registers and MSRs, get and set local APIC state, run until the next VM exit, etc. Current container implementations (Firecracker, crosvm, libwhp) use hypervisor-specific VCPUs to accomplish this functionality, but a shared VCPU abstraction would allow for more generic container code. It would also facilitate clean abstractions for other crates; for example, the proposed arch crate relies on a VCPU trait abstraction to provide a hypervisor-agnostic arch crate without losing the performance-optimized primitives that each technology relies on.

Design

Design 1

The vmm-vcpu crate itself is quite simple: it only needs to expose a public Vcpu trait with the functions that comprise common VCPU functionality:

Note: The signatures below are in progress and subject to change. A design goal is to keep them matching, or as close as possible to, the existing signatures found in Firecracker/crosvm in order to minimize code refactoring, so some of these may change further in that direction if they haven't already.

pub trait Vcpu {
    fn set_fpu(&self, fpu: &Fpu) -> Result<()>;
    fn set_msrs(&self, msrs: &MsrEntries) -> Result<()>;
    fn set_sregs(&self, sregs: &SpecialRegisters) -> Result<()>;
    fn run(&mut self) -> Result<VcpuExit>;
    fn get_run_context(&mut self) -> &mut RunContext;
    fn setup_regs(&mut self, ip: u64, sp: u64, si: u64) -> Result<()>;
    fn get_regs(&self) -> Result<VmmRegisters>;
    fn set_regs(&self, regs: &VmmRegisters) -> Result<()>;
    fn get_sregs(&self) -> Result<SpecialRegisters>;
    fn get_lapic(&self) -> Result<LApicState>;
    fn set_lapic(&mut self, klapic: &LApicState) -> Result<()>;
    fn set_cpuid(&self, cpuid_entries: &[CpuIdEntry]) -> Result<()>;
}
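
For illustration, a VMM could then be written generically against this trait. The helper below is purely hypothetical; it assumes a VcpuExit::Hlt variant (as in kvm-ioctls) and a Default implementation for Fpu:

// Hypothetical hypervisor-agnostic VMM code: it compiles against any type
// implementing Vcpu, whether backed by KVM, WHP, or another hypervisor.
fn configure_and_run<T: Vcpu>(vcpu: &mut T, entry: u64, stack: u64) -> Result<()> {
    vcpu.setup_regs(entry, stack, 0)?;
    vcpu.set_fpu(&Fpu::default())?;
    loop {
        match vcpu.run()? {
            // Stop on a halt exit; other exit reasons (I/O, MMIO, ...) would
            // be dispatched to device emulation here.
            VcpuExit::Hlt => return Ok(()),
            _ => continue,
        }
    }
}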

While the data types themselves (VmmRegisters, SpecialRegisters) are exposed via the trait with generic names, under the hood they can be kvm_bindings data structures, which are also exposed from the same crate via public redefinitions:

pub use kvm_bindings::kvm_regs as VmmRegisters;
pub use kvm_bindings::kvm_sregs as SpecialRegisters; 
// ...
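
The Result type used in these signatures is not defined above; it is assumed here to be an io::Error-based alias, matching the kvm-ioctls convention:

use std::io;

// Assumed error convention: failures surface as the io::Error reported by
// the underlying ioctl or platform API call.
pub type Result<T> = std::result::Result<T, io::Error>;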

Per-hypervisor implementations of the Vcpu trait would not reside within the vmm-vcpu crate itself, but would be contained either in other rust-vmm crates or in crates hosted elsewhere. For example, the proposed and currently in-PR kvm-ioctls crate, which already factors the VcpuFd implementation out of its previous home in kvm/src/lib.rs, would require minimal refactoring to utilize the new vmm-vcpu crate: its VCPU functionality would simply move from inherent methods on VcpuFd to an implementation of the trait. For example, from:

pub struct VcpuFd {
    vcpu: File,
    kvm_run_ptr: KvmRunWrapper,
}

impl VcpuFd {
    /// ...
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    pub fn set_fpu(&self, fpu: &kvm_fpu) -> Result<()> {
        let ret = unsafe {
            // Here we trust the kernel not to read past the end of the kvm_fpu struct.
            ioctl_with_ref(self, KVM_SET_FPU(), fpu)
        };
        if ret < 0 {
            return Err(io::Error::last_os_error());
        }
        Ok(())
    }
}

To:

pub struct VcpuFd {
    vcpu: File,
    kvm_run_ptr: KvmRunWrapper,
}

impl Vcpu for VcpuFd {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    fn set_fpu(&self, fpu: &kvm_fpu) -> Result<()> {
        let ret = unsafe {
            // Here we trust the kernel not to read past the end of the kvm_fpu struct.
            ioctl_with_ref(self, KVM_SET_FPU(), fpu)
        };
        if ret < 0 {
            return Err(io::Error::last_os_error());
        }
        Ok(())
    }
    // Other functions of the Vcpu trait
}

Similarly, the Rust Hyper-V crate libwhp would implement its side of the Vcpu trait:

pub struct WhpVcpu {
    partition: Rc<RefCell<PartitionHandle>>,
    index: UINT32,
}

impl Vcpu for WhpVcpu {
    fn set_fpu(&self, fpu: &Fpu) -> Result<()> {
        let reg_names: [WHV_REGISTER_NAME; 4] = [
            WHV_REGISTER_NAME::WHvX64RegisterFpControlStatus,
            WHV_REGISTER_NAME::WHvX64RegisterXmmControlStatus,
            WHV_REGISTER_NAME::WHvX64RegisterXmm0,
            WHV_REGISTER_NAME::WHvX64RegisterFpMmx0,
        ];

        let mut reg_values: [WHV_REGISTER_VALUE; 4] = Default::default();
        reg_values[0].Reg64 = fpu.fcw as UINT64;
        reg_values[1].Reg64 = fpu.mxcsr as UINT64;
        reg_values[2].Fp = WHV_X64_FP_REGISTER {
            AsUINT128: WHV_UINT128 {
                Low64: 0,
                High64: 0,
            },
        };
        reg_values[3].Fp = WHV_X64_FP_REGISTER {
            AsUINT128: WHV_UINT128 {
                Low64: 0,
                High64: 0,
            },
        };
        self.set_registers(&reg_names, &reg_values)
            .map_err(|_| io::Error::last_os_error())?;

        Ok(())
    }
    // Implement other functions of the Vcpu trait
}

Alternative Design

It was also discussed that a VCPU is unlikely to change between compilations (i.e., a single build is likely to contain a KVM VCPU or a Hyper-V VCPU, but not both). Since traits are most useful when multiple implementations may be present in the same build, we internally discussed whether a trait-based crate is over-engineering for the problem, and whether something like conditional compilation of common APIs would be a better approach. A trait-based crate would enforce the API contract more strictly, but perhaps at the cost of some data visibility within the functions, since generic code written against the trait cannot access member variables of the implementing structs. We solicit opinions from the rust-vmm community as to whether a trait-based approach, conditional compilation, or even a hybrid (a trait-based implementation to enforce the contract, with conditional compilation calling the methods directly) makes the most sense for this problem space.
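
For illustration, the conditional-compilation alternative might look roughly like the following, where the feature names and module layout are hypothetical and exactly one backend is selected at build time:

// Hypothetical feature-gated selection of a single concrete Vcpu type;
// consumers use vmm_vcpu::Vcpu directly, with no trait bound involved.
#[cfg(feature = "kvm")]
pub use crate::kvm::KvmVcpu as Vcpu;

#[cfg(feature = "whp")]
pub use crate::whp::WhpVcpu as Vcpu;

#[cfg(all(feature = "kvm", feature = "whp"))]
compile_error!("enable exactly one hypervisor backend feature");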

bonzini commented 5 years ago

I like the idea of exposing a simple hypervisor-agnostic API that can start as just "pub use" and can later migrate to marshaling/unmarshaling between hypervisor-specific structs and generic ones.

I think we should have fine-grained traits and make it possible to compile them out.
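
For illustration only, one shape that fine-grained, compile-out-able traits could take (trait and feature names hypothetical):

// Hypothetical split of the monolithic Vcpu trait into smaller traits,
// each of which can be compiled out via a Cargo feature.
#[cfg(feature = "regs")]
pub trait VcpuRegs {
    fn get_regs(&self) -> Result<VmmRegisters>;
    fn set_regs(&self, regs: &VmmRegisters) -> Result<()>;
}

#[cfg(feature = "lapic")]
pub trait VcpuLapic {
    fn get_lapic(&self) -> Result<LApicState>;
    fn set_lapic(&self, lapic: &LApicState) -> Result<()>;
}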

jennymankin commented 5 years ago

Thank you for the feedback. I like your suggestion on conditional-compilation of fine-grained traits between hypervisor versions.

Regarding your point on "pub use", are you referring to the use of "pub use" to expose data structures in the design document example above? (That is, with kvm_bindings data structures exposed via "pub use", KVM-related code using the data structures directly, and WHP/other hypervisors taking the exposed structures and marshaling them manually?) Or should there be a third, generic struct, so that in addition to WHP marshaling data, the KVM APIs would also marshal data to and from this generic struct into the kvm_bindings structs? (Or some other suggestion that I'm missing?)
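
For concreteness, the second option might look roughly like the following sketch, where the generic struct and the conversion are purely hypothetical:

// Hypothetical hypervisor-neutral register struct; each backend would
// convert to and from its native representation (only a few fields shown).
pub struct StandardRegisters {
    pub rip: u64,
    pub rsp: u64,
    pub rflags: u64,
}

impl From<kvm_bindings::kvm_regs> for StandardRegisters {
    fn from(regs: kvm_bindings::kvm_regs) -> Self {
        StandardRegisters { rip: regs.rip, rsp: regs.rsp, rflags: regs.rflags }
    }
}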

yisun-git commented 5 years ago

As I replied to Paolo on the rust-vmm mailing list, I have a similar idea to yours for implementing a hypervisor-agnostic vCPU framework. My idea is to abstract a "hypervisor" trait and implement it for KVM/Hyper-V/etc. The vCPU calls the common interfaces. Based on that, I will implement a hypervisor-agnostic cpu-model to support customized CPU models.

Since our ideas are similar, I think we can work together to make things better. I noticed you already have the Rust Hyper-V crate libwhp. Do you mainly work on Hyper-V? Do you have a plan to implement KVM too? If not, I can help to implement the KVM part.

jennymankin commented 5 years ago

Hi @yisun-git , I agree, the ideas are similar so it would be great to work together. I am interested in understanding more about your proposed implementation, particularly if it would provide a more natural approach. The modular design I envision would have a VMM crate calling the (abstracted) functions of the VCPU trait (rather than the other way around), but perhaps I am also stuck on what a VMM crate would look like based on existing VMM crates in Firecracker/crosvm (and based on our example Hyper-V implementation for libwhp). As you intuited in your email reply, one of our explicit goals is to minimize the refactoring that would be required for existing VMMs/container solutions like Firecracker/crosvm.

As you develop them, more design details and/or example trait definitions for your VMM/CPU crates would definitely help us understand your approach, as well as evaluate the tradeoffs of both approaches as we move forward.

We do work mainly with Hyper-V, so it would indeed be nice to have your help in implementing the KVM part.

yisun-git commented 5 years ago

Sorry for the late reply; I am just back from vacation.

My thought is that, since our goal is to be hypervisor agnostic, we should abstract a hypervisor trait directly. It provides common functions, e.g. get_msr/set_msr/set_cpuid/etc., and the specific hypervisor implementations (KVM, Hyper-V, etc.) sit behind it. The vCPU then calls the functions of the hypervisor trait. This requires changes in the vmm/vcpu/etc. to replace KVM functions with the hypervisor functions, but I think the changes should not be too many.
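
A minimal sketch of that shape, with all names hypothetical, might be:

// Hypothetical hypervisor-level trait; KVM, Hyper-V, etc. would each
// implement it, and the vCPU code would call these methods instead of
// hypervisor-specific APIs directly.
pub trait Hypervisor {
    fn get_msrs(&self, vcpu_index: u32, msrs: &mut MsrEntries) -> Result<i32>;
    fn set_msrs(&self, vcpu_index: u32, msrs: &MsrEntries) -> Result<()>;
    fn set_cpuid(&self, vcpu_index: u32, cpuid: &CpuId) -> Result<()>;
    // ... further common functions (registers, LAPIC, run loop, etc.)
}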

I am glad to implement the KVM part. Looking forward to working with you. :)

jennymankin commented 5 years ago

Update to design

For the trait definition, I pulled in all the functions currently in the kvm-ioctls crate, since these are key functions and both KVM and the Windows Hypervisor Platform (WHP) provide underlying APIs that can be used to implement each of them. The trait is below; there is not much change from the original proposal, just some refinement now that I've done basic POC work on each function for both WHP and kvm-ioctls (the former being the more work, given that the signatures were almost uniformly taken from kvm-ioctls).

pub trait Vcpu {
    type RunContextType;
    fn get_regs(&self) -> Result<VmmRegisters>;
    fn set_regs(&self, regs: &VmmRegisters) -> Result<()>;
    fn get_sregs(&self) -> Result<SpecialRegisters>;
    fn set_sregs(&self, sregs: &SpecialRegisters) -> Result<()>;
    fn get_fpu(&self) -> Result<Fpu>;
    fn set_fpu(&self, fpu: &Fpu) -> Result<()>;
    fn set_cpuid2(&self, cpuid: &CpuId) -> Result<()>;
    fn get_lapic(&self) -> Result<LApicState>;
    fn set_lapic(&self, klapic: &LApicState) -> Result<()>;
    fn get_msrs(&self, msrs: &mut MsrEntries) -> Result<i32>;
    fn set_msrs(&self, msrs: &MsrEntries) -> Result<()>;
    fn run(&self) -> Result<VcpuExit>;
    fn get_run_context(&self) -> Self::RunContextType;
}

Some questions for the rust-vmm community:

Additional Vcpu functions

In addition to these functions pulled from kvm-ioctls, there are further Vcpu functions that could be useful to include, based on the larger sets provided by WHP and KVM and implemented by libwhp and crosvm respectively (the Firecracker Vcpu functions are already fully represented). There are two options:

  1. Include these in the Vcpu trait, and any hypervisor not implementing one of them could opt out using (for example) the unimplemented!() macro; e.g. get/set MP state, get/set xsave state, etc. (see the sketch after this list).
  2. Keep only this smaller subset for now; any Vcpu that wants to implement additional functions can do so alongside its impl Vcpu trait implementation.
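
A minimal sketch of the opt-out pattern in option 1, assuming for illustration that a get_xsave function and an Xsave alias were added to the trait and that a given backend chooses not to implement it:

impl Vcpu for SomeVcpu {
    // ... the functions this hypervisor does back with platform APIs ...

    // A function the backend does not support opts out explicitly; callers
    // hitting it get a clear panic rather than silent misbehavior.
    fn get_xsave(&self) -> Result<Xsave> {
        unimplemented!();
    }
}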

Arch-independence

All of the functions above (except for run/get_run_context) are conditionally-compiled for x86/x64 in their previous forms (crosvm, Firecracker, kvm-ioctls). The same conditional compilation can be applied here in the traits. It doesn't leave a lot of ARM functionality implemented, but I assume that can be added later (and similarly blocked with conditional compilation) as needed for that use case.
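
Concretely, the same gating used in kvm-ioctls could be applied to the trait methods, for example (showing only a few methods):

pub trait Vcpu {
    // x86/x86_64-only functionality is compiled out on other architectures.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    fn get_lapic(&self) -> Result<LApicState>;

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    fn set_lapic(&self, klapic: &LApicState) -> Result<()>;

    // run/get_run_context remain architecture-independent.
    fn run(&self) -> Result<VcpuExit>;
}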

Home of reference implementations

Where should reference implementations of these traits live? Obviously kvm-ioctls is a good location for the KVM-based implementation of these traits; as shown in the design above, there is little code refactoring required to make this change. Currently our libwhp implementation for Hyper-V (which provides Rust bindings/examples) lives in a different GitHub space. It probably makes sense to keep that library there and make the changes there to implement these traits, unless the rust-vmm community here and the other libwhp developers think the implementation belongs in the rust-vmm repo instead.


@yisun-git, I am particularly interested in your feedback on the design here, and whether this is the set of common functions you also had in mind, or whether you'd like to see more. From your last comment, it sounds like we are on the same page as to what this Vcpu trait should provide, though you're calling it a VMM trait and I'm limiting it here to a Vcpu (where the Vcpu trait could then be used as a building block for a later VMM crate). I'm still trying to understand whether we envision using the traits in the same way. Given my description of the changes to kvm-ioctls and WHP in the original issue statement above, as well as my description of the use of such a Vcpu trait in my proposal for the arch crate here, is this what you also had in mind, or do you see the trait implementations being done differently?

andreeaflorescu commented 5 years ago

LGTM.

yisun-git commented 5 years ago

@jennymankin, I just left a message for you on Slack. I added an issue on the vmm-vcpu crate to present my ideas. Please check it when you are free. :)