rust-vmm / community

rust-vmm community content

Crate Addition Request: extend vmm-vcpu to Hypervisor crate #50

Open yisun-git opened 5 years ago

yisun-git commented 5 years ago

Crate Name

Hypervisor

Short Description

vmm-vcpu has made vCPU handling hypervisor agnostic, but more work is needed to make the whole of rust-vmm hypervisor agnostic. This is a proposal to extend vmm-vcpu into a Hypervisor crate so that rust-vmm as a whole becomes hypervisor agnostic. There is an existing issue discussing this: https://github.com/rust-vmm/vmm-vcpu/issues/5.

To give this a larger audience, I created this new issue here per Jenny's suggestion.

The Hypervisor crate abstracts the interfaces of different hypervisors (e.g. KVM ioctls) to provide unified interfaces to the upper layer. A concrete hypervisor (e.g. KVM/Hyper-V) implements the traits to provide its hypervisor-specific functions.

The upper layer (e.g. the Vmm) creates a Hypervisor instance bound to the running hypervisor, then calls the running hypervisor's interfaces through that instance. This keeps the upper layer hypervisor agnostic.

Why is this crate relevant to the rust-vmm project?

rust-vmm should work with all hypervisors, e.g. KVM/Hyper-V/etc. A hypervisor abstraction crate is therefore necessary to encapsulate the hypervisor-specific operations, so that the upper layer's implementation can stay simple and hypervisor agnostic.

Design

Relationships of crates: (see attached image)

Compilation arguments: the concrete hypervisor instance is created for Hypervisor users (e.g. the Vmm) through a compilation argument, because only one hypervisor is running in a cloud scenario.
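The compile-time selection could look like the following minimal sketch. The `create_hypervisor` function, the `hyperv` feature flag, and the stub types are all illustrative, not part of the proposal:

```rust
// Illustrative sketch: pick the concrete hypervisor at compile time
// via a Cargo feature, so the rest of the VMM only sees the trait.
trait Hypervisor {
    fn name(&self) -> &'static str;
}

struct Kvm;
impl Hypervisor for Kvm {
    fn name(&self) -> &'static str { "KVM" }
}

struct HyperV;
impl Hypervisor for HyperV {
    fn name(&self) -> &'static str { "Hyper-V" }
}

// With `--features hyperv`, the Hyper-V implementation is compiled in.
#[cfg(feature = "hyperv")]
fn create_hypervisor() -> Box<dyn Hypervisor> {
    Box::new(HyperV)
}

// Default: KVM.
#[cfg(not(feature = "hyperv"))]
fn create_hypervisor() -> Box<dyn Hypervisor> {
    Box::new(Kvm)
}

fn main() {
    // The caller never names the concrete type.
    println!("{}", create_hypervisor().name());
}
```

Because exactly one implementation is selected per build, no runtime dispatch between hypervisors is needed.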

Hypervisor crate: this crate itself is simple; it exposes three public traits, Hypervisor, Vm and Vcpu, and is used by KVM/Hyper-V/etc. The interfaces defined below illustrate the mechanism. They are taken from Firecracker and are rather KVM specific, so we may change them per requirements.

Note: the Vcpu part refers to [1] and [2] with some changes.

pub trait Hypervisor {
    fn create_vm(&self) -> Box<dyn Vm>;
    fn get_api_version(&self) -> i32;
    fn check_extension(&self, c: Cap) -> bool;
    fn get_vcpu_mmap_size(&self) -> Result<usize>;
    fn get_supported_cpuid(&self, max_entries_count: usize) -> Result<CpuId>;
}

pub trait Vm {
    fn create_vcpu(&self, id: u8) -> Result<Box<dyn Vcpu>>;
    fn set_user_memory_region(&self,
                              slot: u32,
                              guest_phys_addr: u64,
                              memory_size: u64,
                              userspace_addr: u64,
                              flags: u32) -> Result<()>;
    fn set_tss_address(&self, offset: usize) -> Result<()>;
    fn create_irq_chip(&self) -> Result<()>;
    fn create_pit2(&self, pit_config: PitConfig) -> Result<()>;
    fn register_irqfd(&self, evt: &EventFd, gsi: u32) -> Result<()>;
}

pub trait Vcpu {
    fn get_regs(&self) -> Result<VmmRegs>;
    fn set_regs(&self, regs: &VmmRegs) -> Result<()>;
    fn get_sregs(&self) -> Result<SpecialRegisters>;
    fn set_sregs(&self, sregs: &SpecialRegisters) -> Result<()>;
    fn get_fpu(&self) -> Result<Fpu>;
    fn set_fpu(&self, fpu: &Fpu) -> Result<()>;
    fn set_cpuid2(&self, cpuid: &CpuId) -> Result<()>;
    fn get_lapic(&self) -> Result<LApicState>;
    fn set_lapic(&self, klapic: &LApicState) -> Result<()>;
    fn get_msrs(&self, msrs: &mut MsrEntries) -> Result<i32>;
    fn set_msrs(&self, msrs: &MsrEntries) -> Result<()>;
    fn run(&self) -> Result<VcpuExit>;
}

[1] While the data types themselves (VmmRegs, SpecialRegisters, etc.) are exposed via the trait under generic names, under the hood they can be kvm_bindings data structures, which are also exposed from the same crate via public redefinitions:

pub use kvm_bindings::kvm_regs as VmmRegs;
pub use kvm_bindings::kvm_sregs as SpecialRegisters;
// ...

Sample code showing how it works

Kvm crate: below is sample code from the Kvm crate showing how to implement the above traits.

pub struct Kvm {
    kvm: File,
}

impl Hypervisor for Kvm {
    fn create_vm(&self) -> Box<dyn Vm> {
        let ret = unsafe { ioctl(&self.kvm, KVM_CREATE_VM()) };
        let vm_file = unsafe { File::from_raw_fd(ret) };
        Box::new(KvmVmFd { vm: vm_file, ...})
    }

    ...
}

struct KvmVmFd {
    vm: File,
    ...
}

impl Vm for KvmVmFd {
    fn create_irq_chip(&self) -> Result<()> {
        let ret = unsafe { ioctl(&self.vm, KVM_CREATE_IRQCHIP()) };
        ...
    }

    fn create_vcpu(&self, id: u8) -> Result<Box<dyn Vcpu>> {
        let vcpu_fd = unsafe { ioctl_with_val(&self.vm,
                                              KVM_CREATE_VCPU(),
                                              id as c_ulong) };
        ...
        let vcpu = unsafe { File::from_raw_fd(vcpu_fd) };
        ...
        Ok(Box::new(KvmVcpuFd { vcpu, ... }))
    }

    ...
}

pub struct KvmVcpuFd {
    vcpu: File,
    ...
}

impl Vcpu for KvmVcpuFd {
    ...
}

Vmm crate: below is sample code from the Vmm crate showing how to work with the Hypervisor crate.

struct Vmm {
    hyp: Box<dyn Hypervisor>,
    ...
}

impl Vmm {
    fn new(h: Box<dyn Hypervisor>, ...) -> Self {
        Vmm { hyp: h, ... }
    }
    ...
}

pub struct GuestVm {
    fd: Box<dyn Vm>,
    ...
}

impl GuestVm {
    pub fn new(hyp: Box<dyn Hypervisor>) -> Result<Self> {
        let vm_fd = hyp.create_vm();
        ...
        let cpuid = hyp.get_supported_cpuid(MAX_CPUID_ENTRIES);
        ...
        Ok(GuestVm {
            fd: vm_fd,
            supported_cpuid: cpuid,
            guest_mem: None,
        })
    }
    ...
}

pub struct GuestVcpu {
    fd: Box<dyn Vcpu>,
    ...
}

impl GuestVcpu {
    pub fn new(id: u8, vm: &GuestVm) -> Result<Self> {
        let vcpu = vm.fd.create_vcpu(id)?;
        Ok(GuestVcpu { fd: vcpu, ... })
    }
    ...
}

When the Vmm starts, it creates the concrete hypervisor instance according to the compilation argument, sets it on the Vmm, and starts the flow: create guest VM -> create guest vCPUs -> run.
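The whole flow can be sketched end to end with stub implementations. Everything here (the `Fake*` types in particular) is illustrative only, standing in for the real KVM-backed implementations above:

```rust
// Minimal end-to-end sketch of the proposed flow:
// pick a hypervisor, create a VM, create vCPUs, run them.
trait Hypervisor {
    fn create_vm(&self) -> Box<dyn Vm>;
}
trait Vm {
    fn create_vcpu(&self, id: u8) -> Box<dyn Vcpu>;
}
trait Vcpu {
    fn run(&self) -> bool;
}

// Stub "hypervisor" standing in for the real KVM implementation.
struct FakeKvm;
impl Hypervisor for FakeKvm {
    fn create_vm(&self) -> Box<dyn Vm> { Box::new(FakeVm) }
}

struct FakeVm;
impl Vm for FakeVm {
    fn create_vcpu(&self, _id: u8) -> Box<dyn Vcpu> { Box::new(FakeVcpu) }
}

struct FakeVcpu;
impl Vcpu for FakeVcpu {
    // Pretend the vCPU ran and exited cleanly.
    fn run(&self) -> bool { true }
}

fn main() {
    // In the proposal, the concrete type is chosen by a compilation argument.
    let hyp: Box<dyn Hypervisor> = Box::new(FakeKvm);
    let vm = hyp.create_vm();
    let vcpus: Vec<_> = (0u8..2).map(|id| vm.create_vcpu(id)).collect();
    assert!(vcpus.iter().all(|v| v.run()));
    println!("ran {} vcpus", vcpus.len());
}
```

The Vmm only ever touches the three traits, which is the hypervisor-agnostic property the proposal is after.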

References: [1] rust-vmm/community#40 [2] https://github.com/rust-vmm/vmm-vcpu

jennymankin commented 5 years ago

Bringing over our discussion from https://github.com/rust-vmm/vmm-vcpu/issues/5 so that we can resume it with this audience:


jennymankin commented 5 hours ago

Some thoughts:

pub trait Vcpu { 
    // …
}
pub trait Vm {
    fn create_vcpu(&self, id: u8) -> impl Vcpu;
    // …
}

yisun-git commented 3 hours ago Hi, @jennymankin,

Thanks for your comments! The suggestion to convert dynamic dispatch to static dispatch is good. Let me have a try!
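A static-dispatch version might look like the following rough sketch using an associated type instead of a boxed trait object. The `Fake*` types are placeholders, not part of any crate:

```rust
// Sketch: static dispatch via an associated type, so callers are
// monomorphized over the concrete vCPU type and no Box<dyn ...> is needed.
trait Vcpu {
    fn run(&self) -> Result<(), ()>;
}

trait Vm {
    // Each Vm implementation names its concrete vCPU type.
    type V: Vcpu;
    fn create_vcpu(&self, id: u8) -> Self::V;
}

struct FakeVcpu(u8);
impl Vcpu for FakeVcpu {
    fn run(&self) -> Result<(), ()> { Ok(()) }
}

struct FakeVm;
impl Vm for FakeVm {
    type V = FakeVcpu;
    fn create_vcpu(&self, id: u8) -> FakeVcpu { FakeVcpu(id) }
}

fn main() {
    let vm = FakeVm;
    let vcpu = vm.create_vcpu(0);
    assert!(vcpu.run().is_ok());
    println!("vcpu {} ran", vcpu.0);
}
```

The trade-off is the usual one: static dispatch avoids vtable indirection but makes every user of `Vm` generic over the implementation.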

For separate crates, there are dependencies, I think; i.e. Hypervisor depends on the Vm trait, and Vm depends on the Vcpu trait. Is it possible that a concrete VMM (e.g. Firecracker/crosvm) only implements some of these traits but not all of them? So I am not sure whether we should separate the hypervisor/vm/vcpu traits. What do you think?

Thanks a lot for your open mind on this issue! I will raise it to the community to see if others have any comments.

jennymankin commented a minute ago Hi @yisun-git, yup, you are correct that the Hypervisor crate would depend on the Vm crate, and the Vm crate on the Vcpu crate. But I can still see VMM implementations implementing the lower-level traits without the higher-level ones. For example, as a first pass I would convert the libwhp Hyper-V crate to use only the Vcpu trait, since its reference hypervisor code is implemented differently from all the functionality provided by Firecracker/crosvm. Other projects might also find the lower-level crates useful as building blocks, without pulling in the whole thing.


Thanks!

andreeaflorescu commented 5 years ago

Do we have any update on this one? Did you discuss it during the last sync meeting?

My concern with this crate is that the vCPU interface for Hyper-V and KVM as well as x86 and arm doesn't really have many common functions that can be shared. I would like to understand how this is going to be used by other crates. During PTG we tried to come up with crates that would benefit from this interface. One example we took was the cpuid crate which would offer functionality for setting the guest cpu model. The only functions that could be used from the vCPU interface would be set_cpuid and get_cpuid (maybe). But these are available only on x86 I believe. So my question would be: can we have a better abstraction here? Instead of using a vCPU trait can we instead have a Cpuid trait that can be implemented by various hypervisors? The Cpuid trait would then offer an interface to get and set the cpuid in a platform and hypervisor specific way.
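A rough sketch of what such a Cpuid trait could look like. The entry type, method names, and the stub implementation below are purely illustrative, not taken from any existing crate:

```rust
// Sketch: a narrow Cpuid trait that each hypervisor could implement in
// its own platform-specific way, instead of a full vCPU trait.
#[derive(Clone, Debug, PartialEq)]
struct CpuidEntry {
    function: u32,
    index: u32,
    eax: u32,
    ebx: u32,
    ecx: u32,
    edx: u32,
}

trait Cpuid {
    fn get_cpuid(&self) -> Vec<CpuidEntry>;
    fn set_cpuid(&mut self, entries: &[CpuidEntry]) -> Result<(), String>;
}

// Stub implementation standing in for a real hypervisor backend.
struct FakeHypervisor {
    entries: Vec<CpuidEntry>,
}

impl Cpuid for FakeHypervisor {
    fn get_cpuid(&self) -> Vec<CpuidEntry> {
        self.entries.clone()
    }
    fn set_cpuid(&mut self, entries: &[CpuidEntry]) -> Result<(), String> {
        self.entries = entries.to_vec();
        Ok(())
    }
}

fn main() {
    let mut hv = FakeHypervisor { entries: Vec::new() };
    let leaf0 = CpuidEntry { function: 0, index: 0, eax: 0xd, ebx: 0, ecx: 0, edx: 0 };
    hv.set_cpuid(&[leaf0.clone()]).unwrap();
    assert_eq!(hv.get_cpuid(), vec![leaf0]);
    println!("cpuid entries: {}", hv.get_cpuid().len());
}
```

A cpuid crate could then set the guest CPU model through this trait without knowing whether the backend applies it per vCPU (KVM) or per VM (WHP).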

yisun-git commented 5 years ago

Hi @andreeaflorescu,

I discussed this issue with Jenny here and in the vmm-vcpu issue. As many people did not attend the last sync meeting, we did not discuss it there.

I have completed prototype code based on Firecracker for this issue. The Hypervisor crate uses the vmm-vcpu crate as part of it. If you'd like to see how other crates use the Hypervisor crate, I can upload the code. But I did not implement the Hyper-V part.

Your suggestion is very good. In fact, I am thinking of abstracting things at a smaller granularity to make them more suitable for different hypervisors, but I need to know some details about the Hyper-V implementation to do a better abstraction. Can you provide some reference code or documentation? BTW, I don't think x86 and arm differ much here, because KVM or Hyper-V should provide the same ioctls on both platforms.

One more thing: to address Zach's comment about VcpuExit, I think we have to implement a hypervisor-specific vcpu_exit_handling() in the vmm crate. But the rest of the code should be common, without hypervisor-specific changes. Even with this inelegant change, I still think the Hypervisor crate can benefit the whole project, because most of the code (arch/vmm/cpuid/etc.) will be hypervisor agnostic. Some trade-offs are needed, as in many other projects.
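One possible shape for such a hypervisor-specific exit handler. This is my own sketch, not code from the prototype; the exit variants are a simplified stand-in for a real VcpuExit enum:

```rust
// Sketch: a per-hypervisor exit handler that the common run loop calls.
// Each hypervisor backend would provide its own version of this function.
enum VcpuExit {
    IoIn(u16),   // guest read from an I/O port
    IoOut(u16),  // guest wrote to an I/O port
    Hlt,         // guest halted
    Unknown,
}

// Returns Ok(true) to keep running the vCPU, Ok(false) to stop.
fn handle_vcpu_exit(exit: VcpuExit) -> Result<bool, String> {
    match exit {
        VcpuExit::IoIn(port) => {
            // Device emulation would service the read here.
            let _ = port;
            Ok(true)
        }
        VcpuExit::IoOut(port) => {
            let _ = port;
            Ok(true)
        }
        VcpuExit::Hlt => Ok(false),
        VcpuExit::Unknown => Err("unhandled vcpu exit".to_string()),
    }
}

fn main() {
    assert!(handle_vcpu_exit(VcpuExit::IoIn(0x3f8)).unwrap());
    assert!(!handle_vcpu_exit(VcpuExit::Hlt).unwrap());
    println!("exit handling ok");
}
```

The common run loop stays hypervisor agnostic; only this function's body differs per backend.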

jennymankin commented 5 years ago

Hi @andreeaflorescu,

There's been some discussion on the PR itself as to how a vCPU abstraction would be used in other crates; for example, I've argued that it's quite a clean abstraction to use in crates like the architecture-specific arch crate. We are also working on prototyping crosvm to use Hyper-V, which will give us a sense of how useful the abstraction might be as part of larger VMM implementations.

There might be something that can be done for a Cpuid trait, although I'll need to think about it further. Cpuid is actually handled quite differently on Hyper-V and KVM. Whereas (as you know) on KVM each CPUID result for a given function/index can be set at the vCPU level, on WHP it must be configured when the VM is configured, before any vCPUs are created for that VM (so the CPUID results set during VM configuration are the same for all vCPUs on that VM). Additionally, on WHP individual vCPU results can be intercepted and modified (since CPUID causes a vCPU exit). But anyway, there still might be something useful there; I'll continue to think about it.

@yisun-git I'd also be interested in seeing the Firecracker prototype for the crate(s) proposed here. As for Hyper-V/Windows Hypervisor Platform details, the libwhp project implements the Rust bindings and higher-level functionality APIs, as well as a fully fleshed-out example. I've also extended this crate to implement the traits of the vCPU crate in a POC branch. The documentation from Microsoft also provides some overview of WHP, but is pretty sparse and not very informative. So I'd be happy to discuss it in more detail sometime if you have more questions about it.

yisun-git commented 5 years ago

Hi @andreeaflorescu, @jennymankin

I just uploaded the draft code implementing the Hypervisor crate, which includes Jenny's Vcpu change. I planned to refine this draft but have not had time to do it, so some of the code is messy. Sorry for that.

The code is located at: https://github.com/yisun-git/Hypervisor_on_FC/tree/dbg_yisun1_hypervisor