Closed: shawtao closed this issue 3 years ago
Thank you for bringing your security concerns to our attention! We will investigate these immediately and follow up with you within 5 business days to provide a status.
Quick update: we're still working through this, will get back on this thread in a few days.
Thank you for reporting this issue. Please note that Firecracker customers should not report potential security issues via GitHub. Instead, please follow our security disclosure policy [3] to submit such reports confidentially. With this in mind, we’ve confirmed that this behavior does not represent a security issue within Firecracker. Additionally, AWS Lambda, AWS Fargate and Firecracker on Arm64 are not affected by this issue. More information is below.
On x86 CPUs, Firecracker microVMs use KVM PIT emulated devices which create a kernel thread, kvm-pit, used for injecting timer interrupts. The kvm-pit kernel thread work results in host CPU usage which is by default not constrained by the Jailer/Firecracker cgroup. That kernel thread is created when needed by kvm, and by default is part of the root cgroup. Its CPU overhead is limited by default in kvm to 5000 events per second [1]. In our measurements on EC2 .metal hosts, we found no overhead under normal usage, and found a maximum overhead of up to approximately 3% of one CPU core per microVM at the 5000 event per second limit.
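For readers who want to relate the 5000 events per second figure to a timer period, here is a small arithmetic sketch. It assumes the limit is expressed as a minimum period in microseconds (the `min_timer_period_us` parameter named in reference [2]), so 5000 events per second corresponds to 200 µs. The function names are made up for illustration, and the 3% per-microVM figure is simply the maximum we measured above, not a guarantee.

```python
MICROSECONDS_PER_SECOND = 1_000_000


def events_per_second_to_period_us(events_per_second: int) -> int:
    """Smallest period (in microseconds) that allows at most this many events per second."""
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    # Ceiling division: round the period up so the rate never exceeds the cap.
    return -(-MICROSECONDS_PER_SECOND // events_per_second)


def period_us_to_events_per_second(period_us: int) -> float:
    """Maximum event rate permitted by a given minimum period."""
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    return MICROSECONDS_PER_SECOND / period_us


def worst_case_cores(microvm_count: int, per_vm_core_fraction: float = 0.03) -> float:
    """Rough upper bound on host cores consumed by kvm-pit threads.

    per_vm_core_fraction defaults to the ~3% of one core per microVM
    quoted in this thread; treat it as an illustrative assumption.
    """
    return microvm_count * per_vm_core_fraction
```

For example, `events_per_second_to_period_us(5000)` gives 200, and halving the event cap to 2500 would mean a 400 µs minimum period.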
We are aware of 5 options to constrain the CPU overhead that can be consumed by kvm-pit kernel threads on x86 CPUs:
a. Use an external agent to move the kvm-pit/ kernel thread into the microVM’s cgroup (e.g., the cgroup created by the Jailer). This cannot be done by Firecracker since the thread is created by the Linux kernel after guest start, at which point Firecracker is de-privileged.
b. Configure the kvm limit to a lower value [2]. This is a system-wide configuration available to users without Firecracker or Jailer changes. However, the same limit applies to APIC timer events, and users will need to test the impact on workloads in order to apply this mitigation.
c. Implement PIT emulation in Firecracker.
d. Apply a rate limit to the PIT interrupt frequency within Firecracker.
e. Disable the PIT emulation altogether, some time after the guest workload starts.
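As a rough illustration of what an external agent for option [a] might look like, here is a minimal sketch. It rests on a few assumptions: the kernel thread's `comm` is `kvm-pit/` followed by the PID of the Firecracker process, the agent runs with enough privilege to write to the target cgroup, and the kernel permits moving that thread. The function names and the `proc_root` parameter are hypothetical and exist only to keep the sketch testable.

```python
from pathlib import Path
from typing import List


def find_kvm_pit_threads(vmm_pid: int, proc_root: Path = Path("/proc")) -> List[int]:
    """Return PIDs whose comm is 'kvm-pit/<vmm_pid>' (assumed naming scheme)."""
    wanted = f"kvm-pit/{vmm_pid}"
    matches = []
    for entry in proc_root.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # the task exited while we were scanning
        if comm == wanted:
            matches.append(int(entry.name))
    return sorted(matches)


def move_to_cgroup(pid: int, cgroup_dir: Path) -> None:
    """Move a task by writing its PID to the cgroup's cgroup.procs file."""
    (cgroup_dir / "cgroup.procs").write_text(f"{pid}\n")


def constrain_kvm_pit(vmm_pid: int, cgroup_dir: Path,
                      proc_root: Path = Path("/proc")) -> List[int]:
    """Move every matching kvm-pit thread into cgroup_dir; return the PIDs moved."""
    moved = []
    for pid in find_kvm_pit_threads(vmm_pid, proc_root):
        move_to_cgroup(pid, cgroup_dir)
        moved.append(pid)
    return moved
```

The agent would call `constrain_kvm_pit` with the Firecracker PID and the path of the cgroup created by the Jailer, some time after the guest has started, since the thread does not exist before then.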
We recommend options [a] or [b] for users that want to avoid this potential overhead. We don’t think option [e] (recommended in the “desired solution” section above) is appropriate; since Firecracker cannot introspect guest workloads, we cannot guarantee that using this option will prevent additional effects on guest workloads.
To ensure wider awareness of these options, we will shortly add this topic and recommendation to our documentation. Please let us know if you have any other questions or concerns.
[1] https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt
[2] To modify the kvm limit for interrupts that can be injected in a second:
To make this change persistent across boots, append the option to /etc/modprobe.d/kvm.conf:
echo "options kvm min_timer_period_us=" >> /etc/modprobe.d/kvm.conf
[3] https://github.com/firecracker-microvm/firecracker/blob/main/SECURITY.md
Feature Request
As far as I know, apart from the PIT timer used while the guest kernel boots, Firecracker uses kvm-clock as the clocksource and lapic as the clockevent after the kernel has started, and does not use the PIT timer. (I don't know if I'm missing some scenarios where the PIT timer must be used.) But Firecracker still exposes ports 0x40~0x43 to the guest for creating a PIT timer, which may bring potential security issues.
The potential security issue
Description
The KVM module creates a kvm-pit kernel thread to inject the PIT timer interrupt. When a root user in the guest creates a periodic PIT timer by writing to ports 0x40~0x43, this triggers periodic interrupt injection by the kvm-pit thread. If the period set by the user is very short, the kvm-pit thread and the Firecracker process itself generate a certain amount of CPU load.
Impact
Although the KVM module has a variable min_period to limit the PIT timer period, I created a periodic PIT timer with a period of min_period, and this caused the kvm-pit kernel thread to continuously take up to 6% CPU and the firecracker process to take up to 80% CPU, even though I was doing nothing else in the guest.
Therefore, a malicious root user in the guest may use the PIT timer to generate out-of-band workload and affect the performance of the host system.
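To make the effect of the minimum period concrete, here is a simplified arithmetic model, not the KVM code itself. It assumes the nominal 8254 PIT input clock of about 1.193182 MHz, a default minimum period of 200 µs (which matches the 5000 events per second limit mentioned earlier in this thread), and that a requested period shorter than the minimum is simply raised to the minimum. All names are illustrative.

```python
PIT_INPUT_HZ = 1_193_182      # nominal 8254 input clock (assumption for this model)
DEFAULT_MIN_PERIOD_US = 200   # assumed default, i.e. 5000 events per second


def requested_period_us(reload_value: int) -> float:
    """Period implied by a 16-bit PIT reload value, in microseconds."""
    if not 0 <= reload_value <= 0xFFFF:
        raise ValueError("reload_value must fit in 16 bits")
    # The 8254 treats a reload value of 0 as 65536.
    count = reload_value or 65536
    return count * 1_000_000 / PIT_INPUT_HZ


def effective_rate_hz(reload_value: int,
                      min_period_us: float = DEFAULT_MIN_PERIOD_US) -> float:
    """Interrupt rate after raising too-short periods to the minimum."""
    period_us = max(requested_period_us(reload_value), min_period_us)
    return 1_000_000 / period_us
```

Under these assumptions, a very small reload value is capped at 5000 interrupts per second, while the largest divisor (reload value 0) gives roughly 18.2 interrupts per second.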
Environment
Firecracker version.
Host and kernel version
Rootfs used: Not relevant
Architecture:
x86_64
Additional
Although this issue has a very limited impact, since there is no longer a need to use PIT timers in Firecracker after the kernel starts, why not just disable the creation of PIT timers?
Describe the desired solution
There is a simple way to forbid the creation of PIT timers: the function create_pit_timer in the kvm module.
In lines 329~331, the PIT timer can't be created if the flag is set to KVM_PIT_FLAGS_HPET_LEGACY. QEMU uses this flag to forbid the creation of PIT timers once it enables HPET emulation.
Although Firecracker does not provide HPET emulation, is it also possible to use ioctl to set this flag at some point after the guest kernel has booted, thus disabling the creation of PIT timers?
Describe possible alternatives
Additional context
I don't know if I'm missing some scenarios where the PIT timer has to be used. If the PIT timer has to be used, is the workload it causes within acceptable limits in Firecracker?
Checks