[Platforms] Add support for CPUID customization

yisun-git commented 5 years ago

Although Firecracker supports CPUID configuration and provides templates, the templates only cover the EC2 C3 and EC2 T2 instance types. Per discussion with Andreea, adding more templates, especially customized ones, does not seem easy.

But a customized CPU model is important for the following reasons.

Avoid CPU hardware vulnerabilities.
Keep stable guest ABI.
Hard requirement for live migration.

So I propose a generic framework with standard interfaces and a flexible mechanism to support customized CPU models.

andreeaflorescu commented 5 years ago

I am much in favour of this approach. Although I do have to say that we will probably not have the time to work on this any time soon. If you want to pick this up I can help along the way.

Do you have any design in mind? I am thinking about the following questions:

How does the user pass the CPU Template? Is it going to be a huge JSON sent as part of the machine-config API call? Or do you want to pass it as a configuration file?
What would a template look like? Whitelist vs blacklist? Depending on the approach it might be a relatively complicated problem to solve. One thing to keep in mind is that we will most likely not have any emulated CPU features in Firecracker. So if a template asks to enable a feature that is neither supported by the hardware nor emulated by KVM, the guest might report an unsupported feature as supported.

Anyway, I find this an interesting problem to solve!

46bit commented 5 years ago

Does this limit Firecracker to EC2 instances? If so, why such a limitation–is the emulated CPUID required for the guest to understand what instructions are available?

yisun-git commented 5 years ago

I am much in favour of this approach. Although I do have to say that we will probably not have the time to work on this any time soon. If you want to pick this up I can help along the way.

Thank you! I'd like to have a try.

Do you have any design in mind?

Yes, I have got some ideas.

First, I want to start by refining FC's cpuid module into a generic framework that handles x86 CPU model resources such as CPUID and MSRs. (Other architectures are not considered yet.) From what I have observed, FC's current cpuid module could be improved in the following areas.

Some settings are hard-coded.
It is coupled to KVM rather than hypervisor agnostic.
Setting up a template takes too much code. Is there a simpler way?
Only 2 AWS instance types are supported. More CPU models are needed.
CPUID configuration is not flexible. A dynamic setting method is needed.

So I plan to do the following.

Define a structure (or trait?) to cover all the resources. Different x86 models create different instances of the structure.
Then, users can set what they expect by adding a new instance in code or providing a new configuration file. The CPUID configuration should be derived from that input without any hard-coding.
Furthermore, a command-line parameter can be added so that users can dynamically add or remove features.
To be hypervisor agnostic, define a new structure that provides common ioctl interfaces. The CPU model calls these interfaces and the hypervisor module implements them.
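A minimal sketch of how the structure and the hypervisor-agnostic interface described above could fit together. Every name here (`CpuidRegs`, `CpuidTable`, `HypervisorCpu`, `CpuModel`, `MockHypervisor`) is hypothetical and chosen for illustration; none of it is Firecracker's or rust-vmm's actual API. The mock backend only records what it is given, where a real backend would issue the hypervisor's own calls.

```rust
use std::collections::BTreeMap;

/// The four output registers of one CPUID leaf/subleaf.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct CpuidRegs {
    pub eax: u32,
    pub ebx: u32,
    pub ecx: u32,
    pub edx: u32,
}

/// Maps (leaf, subleaf) to the register values the guest should see.
pub type CpuidTable = BTreeMap<(u32, u32), CpuidRegs>;

/// Hypothetical hypervisor-agnostic interface: the CPU model only calls
/// these methods, and each hypervisor module provides its own implementation.
pub trait HypervisorCpu {
    /// CPUID entries the host and hypervisor can expose to a guest.
    fn supported_cpuid(&self) -> CpuidTable;
    /// Program the vCPU with the given CPUID table.
    fn set_cpuid(&mut self, table: &CpuidTable) -> Result<(), String>;
}

/// Hypothetical CPU model: a name plus the CPUID settings it wants.
/// It could be built in code or loaded from a configuration file.
pub struct CpuModel {
    pub name: String,
    pub cpuid: CpuidTable,
}

impl CpuModel {
    /// Apply this model through whatever backend is supplied.
    pub fn apply(&self, hv: &mut dyn HypervisorCpu) -> Result<(), String> {
        hv.set_cpuid(&self.cpuid)
    }
}

/// Stand-in backend for illustration; it stores the table instead of
/// talking to a real hypervisor.
pub struct MockHypervisor {
    pub host: CpuidTable,
    pub applied: Option<CpuidTable>,
}

impl HypervisorCpu for MockHypervisor {
    fn supported_cpuid(&self) -> CpuidTable {
        self.host.clone()
    }

    fn set_cpuid(&mut self, table: &CpuidTable) -> Result<(), String> {
        self.applied = Some(table.clone());
        Ok(())
    }
}
```

With this split, adding a new CPU model means constructing another `CpuModel` value, and supporting another hypervisor means writing another `HypervisorCpu` implementation.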

Second, I want to extend the above CPU model to handle CPU states, address space, interrupts, etc. The goal here is also to be generic and hypervisor agnostic.

Last, I'd like to implement CPU hotplug and live migration capabilities.

I am thinking about the following questions:

How does the user pass the CPU Template? Is it going to be a huge JSON sent as part of the machine-config API call? Or do you want to pass it as a configuration file?

Personally, I prefer a configuration file or a template in code. For a template in code, I expect adding one to take only a few lines, e.g. a model string and the CPUID settings.

What would a template look like? Whitelist vs blacklist?

I think there may be different ways. So far, I have two ideas.

One template per model, e.g. one template for Skylake, one for Broadwell, etc. The pro is that each model is clearly specified. The con is that there is a lot of redundant content.
Provide a full, ordered CPU feature list. Each model has a simple configuration file or in-code setting that shows which features it wants, something like a whitelist and blacklist.
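The second idea can be sketched as below, assuming a single ordered feature list shared by all models and a per-model selection that either allows or denies entries from it. The feature names and the `Selection` and `resolve` identifiers are hypothetical, and the list is deliberately tiny.

```rust
/// Hypothetical ordered list of every feature the framework knows about.
pub const ALL_FEATURES: &[&str] = &["sse4_2", "avx", "avx2", "avx512f", "rdrand"];

/// How a model picks its features from the full list.
pub enum Selection<'a> {
    /// Whitelist: only the named features are enabled.
    Allow(&'a [&'a str]),
    /// Blacklist: everything except the named features is enabled.
    Deny(&'a [&'a str]),
}

/// Resolve a selection into the enabled features, kept in the order of
/// `ALL_FEATURES`. Names that are not in the full list are ignored.
pub fn resolve(selection: &Selection<'_>) -> Vec<&'static str> {
    ALL_FEATURES
        .iter()
        .copied()
        .filter(|f| match selection {
            Selection::Allow(list) => list.contains(f),
            Selection::Deny(list) => !list.contains(f),
        })
        .collect()
}
```

A model definition then shrinks to one short list, for example `Selection::Deny(&["avx512f"])`, which avoids the redundancy of a full template per model.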

Depending on the approach it might be a relatively complicated problem to solve. One thing to keep in mind is that we will most likely not have any emulated CPU features in Firecracker. So if a template asks to enable a feature that is neither supported by the hardware nor emulated by KVM, the guest might report an unsupported feature as supported.

Yes, I see. There will be a CPUID filter. We can compare the user settings with the features enabled on the host. Only the common features are reported to the guest.
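A minimal sketch of such a filter for a single CPUID register, assuming features are compared as bitmasks. The bit positions are those of CPUID leaf 0x1, register ECX; the function name `filter_features` is hypothetical.

```rust
/// Feature bits in CPUID leaf 0x1, register ECX.
pub const SSE4_2: u32 = 1 << 20;
pub const AVX: u32 = 1 << 28;
pub const RDRAND: u32 = 1 << 30;

/// Compare the features a user requested with what the host offers.
/// Returns `(granted, rejected)`: `granted` holds the bits common to both
/// and is what the guest would see, `rejected` holds requested bits that
/// the host lacks, so the caller can warn about them or fail.
pub fn filter_features(requested: u32, host: u32) -> (u32, u32) {
    let granted = requested & host;
    let rejected = requested & !host;
    (granted, rejected)
}
```

Returning the rejected bits separately lets the caller decide whether an unsupported request is a hard error or is silently dropped.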

Anyway, I find this an interesting problem to solve!

yisun-git commented 5 years ago

Does this limit Firecracker to EC2 instances? If so, why such a limitation–is the emulated CPUID required for the guest to understand what instructions are available?

FC's current cpuid module only supports the EC2 C3 and EC2 T2 instance types. My plan is to remove this limitation. I guess the reason for it is that FC needed to support AWS first. :)

andreeaflorescu commented 5 years ago

Does this limit Firecracker to EC2 instances? If so, why such a limitation–is the emulated CPUID required for the guest to understand what instructions are available?

Firecracker can be run on any bare metal instance and even with nested virtualization. We only implemented two templates that are indeed specific to EC2 instances, but this doesn't mean that there is any limitation in running Firecracker on any machine that has access to /dev/kvm. By default, Firecracker microVMs will report the CPUID of the host with minor changes such as CPU topology, cache topology and some disabled features (like the PMU).
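To illustrate the default behaviour described above, here is a rough sketch that starts from the host's CPUID entries and hides the PMU. It assumes the PMU is hidden by zeroing leaf 0xA (architectural performance monitoring); the comment only says the PMU is disabled, not how. The `CpuidEntry` type and `default_guest_cpuid` function are hypothetical, and the topology adjustments are left out.

```rust
/// One CPUID entry as reported by the host.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CpuidEntry {
    pub leaf: u32,
    pub subleaf: u32,
    pub eax: u32,
    pub ebx: u32,
    pub ecx: u32,
    pub edx: u32,
}

/// CPUID leaf 0xA describes architectural performance monitoring.
const LEAF_PERF_MON: u32 = 0xA;

/// Build the guest view: copy the host entries unchanged, except that
/// the performance monitoring leaf is zeroed (an assumed way to hide the PMU).
pub fn default_guest_cpuid(host: &[CpuidEntry]) -> Vec<CpuidEntry> {
    host.iter()
        .map(|entry| {
            let mut guest = *entry;
            if guest.leaf == LEAF_PERF_MON {
                guest.eax = 0;
                guest.ebx = 0;
                guest.ecx = 0;
                guest.edx = 0;
            }
            guest
        })
        .collect()
}
```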

The purpose of this issue as I see it is to enhance the Firecracker cpuid crate to support other custom templates. Ideally these templates are not hardcoded in the code, but rather can be passed as configuration files through Firecracker API.

yisun-git commented 5 years ago

To make the CPU model hypervisor agnostic, we first need to do some work to make Firecracker itself hypervisor agnostic. So I proposed a hypervisor-agnostic solution in rust-vmm, linked below.

https://github.com/rust-vmm/vmm-vcpu/issues/5

After this is done, we can implement other cpu-model functions described above.

raduweiss commented 5 years ago

Added this to the roadmap, under the "Researching" column.

xmarcalx commented 2 years ago

Removing it from the roadmap because we do not have any task related to it in the foreseeable future.

roypat commented 12 months ago

Since 1.4, Firecracker has supported CPUID customization. See https://github.com/firecracker-microvm/firecracker/blob/main/docs/cpu_templates/cpu-templates.md#custom-cpu-templates for documentation.

[Platforms] Add support for CPUID customization #998