Solo5 / solo5

A sandboxed execution environment for unikernels
ISC License

virtio: implement ACPI power-off #499

Open mbacarella opened 2 years ago

mbacarella commented 2 years ago

Currently, when unikernels terminate they just kind of hang forever on GCE instances instead of powering off.

Additionally, when a GCE instance receives a stop command, it sends an ACPI power-off signal to the instance. On Solo5 it appears this is ignored, so the GCE hypervisor waits up to 90 seconds before actually shutting down the instance.

Not sure if this requires a full ACPI implementation that matches the underlying hardware running the server, which would be a huge ordeal, or if GCE provides a more virtualized form that can be harmlessly targeted on non-GCE hosts.

mbacarella commented 2 years ago

According to this GCE doc, we would need to add support for the controller PCI Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)

Booting Linux on GCE f1-micro, the output of lspci is:

00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 03)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:03.0 Non-VGA unclassified device: Red Hat, Inc Virtio SCSI
00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
00:05.0 Unclassified device [00ff]: Red Hat, Inc Virtio RNG

Which confirms the document.

So, perhaps it is not too hard to add some code that detects that 82371AB device and installs support for ACPI power events.
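As a rough illustration of the detection step, here is a minimal sketch of the legacy PCI configuration address encoding (I/O ports 0xCF8/0xCFC) and of matching the ID register of the 00:01.3 device from the lspci output above. The helper names are hypothetical and not part of Solo5. The vendor/device pair 8086:7113 is what the public PCI ID database lists for the 82371AB/EB/MB PIIX4 ACPI function; it was not verified on GCE here.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CONFIG_ADDRESS   0xCF8u  /* write the address word to this port */
#define PCI_CONFIG_DATA      0xCFCu  /* then read the 32-bit register from this port */

#define PIIX4_VENDOR_ID      0x8086u /* Intel */
#define PIIX4_ACPI_DEVICE_ID 0x7113u /* assumed ID of the PIIX4 ACPI function */

/* Hypothetical helper: build the address word for one config register.
 * Layout: enable bit 31, bus 23:16, device 15:11, function 10:8, offset 7:2. */
static uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t fn,
                                   uint8_t off)
{
    assert(dev < 32 && fn < 8);
    return 0x80000000u | ((uint32_t)bus << 16) | ((uint32_t)dev << 11) |
           ((uint32_t)fn << 8) | (off & 0xFCu);
}

/* Hypothetical helper: config register 0 holds the vendor ID in its low
 * 16 bits and the device ID in its high 16 bits. */
static int is_piix4_acpi(uint32_t id_reg)
{
    uint16_t vendor = (uint16_t)(id_reg & 0xFFFFu);
    uint16_t device = (uint16_t)(id_reg >> 16);
    return vendor == PIIX4_VENDOR_ID && device == PIIX4_ACPI_DEVICE_ID;
}
```

Inside the bindings, the address word would be written to PCI_CONFIG_ADDRESS with a 32-bit port write and the register read back from PCI_CONFIG_DATA. Note that the function number (3 for 00:01.3) has to be part of the probe, since function 0 of that slot is the ISA bridge.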

hannesm commented 2 years ago

This sounds like a worthwhile feature request. If anyone is interested in contributing such a driver, the current virtio drivers (net/block) are in bindings/virtio. The underlying design question is whether such a power-off event should be forwarded to the unikernel for a potentially clean shutdown (thus extending the solo5 API, and figuring out whether other bindings & tenders provide a similar interface), or whether it should call solo5_exit directly. Certainly calling solo5_exit could be done for a start.

mbacarella commented 2 years ago

I've looked into this a bit.

This may not be too daunting an undertaking given that the ACPICA code now exists and is meant to be copy/pasted right into OS kernels.

https://wiki.osdev.org/ACPICA

Allowing the unikernel to initiate the power-off when it exits sounds like a good first step, with a stretch goal of having the platform notify the unikernel in some way that a power-off was requested.

ACPI is controversial, however. It's technically a Turing-complete bytecode interpreter that has been criticized as a security risk due to its opacity and complexity. Linux and FreeBSD seem to have gotten over it, since they implement ACPI now. It may not be something unikernel users would appreciate, however, as the whole point of unikernels is to omit all of that bloated bug/attack surface.

Actually, does this even require being part of Solo5? Perhaps ACPI support for PM would work better as a Mirage library, so that the unikernel user can opt-into the feature if they want it?

hannesm commented 2 years ago

> better as a Mirage library

Sure, although we'd need to pass through the device / memory address then.

Currently I'm thinking: couldn't ACPI power-off for the Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03) (scoped very tightly) be very simple -- i.e. looking for a single io-port? It would be great to not have a full-blown ACPI implementation as a dependency.

mbacarella commented 2 years ago

It seems simple enough according to this snippet in the Linux kernel.

https://github.com/torvalds/linux/blob/master/drivers/power/reset/piix4-poweroff.c

Assuming that really is how to power off on a system with a detected PIIX4, it should be the way to go.

I attempted to adapt this code but got stuck figuring out how to get the "IO base address". Detecting the device requires modifying Solo5's pci_enumerate to probe PCI "functions" as well as bus and dev numbers. That's an easy change and the device does show up, but I got lost in the weeds afterwards trying to figure out how to actually address it.

Probably someone more familiar with x86 architecture could make short work of it.
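For what it's worth, here is a minimal sketch of how the "IO base address" might be derived on x86, assuming the register layout from Intel's PIIX4 datasheet as recalled here: a PMBA register at config offset 0x40 of function 3, with the I/O base in bits 15:6. This differs from the Linux driver linked above, which gets the base from a PCI resource. The SUS_EN and soft-off values mirror the constants that driver uses; the helper names are hypothetical, and none of this was tested on GCE.

```c
#include <assert.h>
#include <stdint.h>

#define PIIX4_PMBA_OFFSET  0x40u       /* assumed: PM base address register, function 3 */
#define PIIX4_PMCNTRL      0x04u       /* PM control register, relative to the PM I/O base */
#define PIIX4_SUS_EN       (1u << 13)  /* enable entry to the selected suspend state */
#define PIIX4_SUS_TYP_SOFF (0u << 10)  /* suspend type 0: soft off */

/* Hypothetical helper: extract the I/O base from the raw PMBA value.
 * Bits 15:6 hold the base; the low bits carry flags, not address bits. */
static uint16_t piix4_pm_io_base(uint32_t pmba)
{
    return (uint16_t)(pmba & 0xFFC0u);
}

/* Hypothetical helper: I/O port of the PM control register. */
static uint16_t piix4_pmcntrl_port(uint16_t pm_base)
{
    assert((pm_base & 0x3Fu) == 0);  /* the PM register block is 64-byte aligned */
    return (uint16_t)(pm_base + PIIX4_PMCNTRL);
}

/* Value to write (16-bit port write) to request soft off. */
static uint16_t piix4_soft_off_value(void)
{
    return (uint16_t)(PIIX4_SUS_EN | PIIX4_SUS_TYP_SOFF);
}
```

The Linux driver additionally clears the power button status bit first and, after a short delay, generates a PCI special cycle; whether a hypervisor's emulated PIIX4 needs either step is an open question here. This also assumes the firmware has already programmed and enabled the PM I/O range before the unikernel starts.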

Kensan commented 2 years ago

I would strongly suggest finding a simpler way than integrating (parts of) ACPI.

According to the section Stopping and starting a VM of the GCE documentation, when one initiates a VM shutdown via the management interface, an ACPI shutdown is raised in the VM. This lets me conclude that Linux is supposed to perform an orderly shutdown, which ultimately leads to this part in the Linux kernel. You could try the BOOT_KBD variant, which is basically a few I/O port accesses to 0x64.
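A minimal sketch of what the BOOT_KBD-style sequence amounts to: poll the keyboard controller status port until its input buffer is empty, then write the "pulse reset line" command. The port I/O is passed in as hypothetical callbacks so the logic can be exercised on a host; the fake_* functions are stand-ins for real inb/outb and are not Solo5 APIs. Note that this command resets the machine rather than powering it off, so whether GCE treats it as an acceptable shutdown is an assumption that would need testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KBC_PORT            0x64u  /* status on read, command on write */
#define KBC_STATUS_IBF      0x02u  /* input buffer full: controller is busy */
#define KBC_CMD_PULSE_RESET 0xFEu  /* pulse the CPU reset line */

typedef uint8_t (*inb_fn)(uint16_t port);
typedef void (*outb_fn)(uint16_t port, uint8_t value);

/* Returns 1 if the reset command was issued, 0 if the controller stayed
 * busy for max_polls status reads. */
static int kbc_pulse_reset(inb_fn in, outb_fn out, unsigned max_polls)
{
    assert(in != NULL && out != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        if (!(in(KBC_PORT) & KBC_STATUS_IBF)) {
            out(KBC_PORT, KBC_CMD_PULSE_RESET);
            return 1;
        }
    }
    return 0;
}

/* Host-side fakes: report "busy" three times, then "ready", and record
 * the last write instead of touching hardware. */
static unsigned fake_busy_polls = 3;
static uint16_t fake_last_port;
static uint8_t fake_last_value;

static uint8_t fake_inb(uint16_t port)
{
    (void)port;
    if (fake_busy_polls > 0) {
        fake_busy_polls--;
        return KBC_STATUS_IBF;
    }
    return 0;
}

static void fake_outb(uint16_t port, uint8_t value)
{
    fake_last_port = port;
    fake_last_value = value;
}
```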

mbacarella commented 2 years ago

> I would strongly suggest finding a simpler way than integrating (parts of) ACPI.
>
> According to the section Stopping and starting a VM of the GCE documentation, when one initiates a VM shutdown via the management interface, an ACPI shutdown is raised in the VM. This lets me conclude that Linux is supposed to perform an orderly shutdown, which ultimately leads to this part in the Linux kernel. You could try the BOOT_KBD variant, which is basically a few I/O port accesses to 0x64.

I don't know enough to comment on the validity of this approach, but I would like to note here that the code you linked will actually call into the pm_power_off function defined in piix4-poweroff.c, assuming the device was detected and configured. (It calls pm_power_off from native_machine_power_off).

I posted my attempt to adapt that in #500

In a separate vein, there's also a very hacky way to attempt an ACPI shutdown without implementing an ACPI interpreter, as described here. https://forum.osdev.org/viewtopic.php?t=16990 It's possible that hack will work on GCE.

I tried adapting that code, but it triggers too many pointer-size and pointer-to-integer cast warnings (treated as errors) to compile. A naive attempt to fix them partially succeeds: it finds the ACPI RSDP header but segfaults before finding the FACP header. I've posted that at #501
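For reference, a minimal sketch of the RSDP lookup step written so that it avoids pointer/integer casts entirely: it scans a byte buffer on 16-byte boundaries for the "RSD PTR " signature, validates the ACPI 1.0 checksum over the first 20 bytes, and reads the 32-bit RSDT physical address byte by byte. The function names are hypothetical. This does not explain the segfault above; one possible cause, stated purely as an assumption, is that the physical address read from the table is not mapped in the unikernel's address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSDP_V1_LEN 20u  /* ACPI 1.0 RSDP: signature, checksum, OEM ID, revision, RSDT address */

/* The first 20 bytes of a valid RSDP sum to zero modulo 256. */
static int rsdp_checksum_ok(const uint8_t *p)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < RSDP_V1_LEN; i++)
        sum = (uint8_t)(sum + p[i]);
    return sum == 0;
}

/* Returns the byte offset of the first valid RSDP in mem, or -1 if none.
 * The RSDP always sits on a 16-byte boundary. */
static long find_rsdp(const uint8_t *mem, size_t len)
{
    assert(mem != NULL);
    for (size_t off = 0; off + RSDP_V1_LEN <= len; off += 16) {
        if (memcmp(mem + off, "RSD PTR ", 8) == 0 &&
            rsdp_checksum_ok(mem + off))
            return (long)off;
    }
    return -1;
}

/* The RSDT physical address is a 32-bit little-endian field at offset 16.
 * Assembling it from bytes avoids alignment and pointer-size issues. */
static uint32_t rsdp_rsdt_address(const uint8_t *rsdp)
{
    return (uint32_t)rsdp[16] | ((uint32_t)rsdp[17] << 8) |
           ((uint32_t)rsdp[18] << 16) | ((uint32_t)rsdp[19] << 24);
}
```

On a 64-bit target, the returned 32-bit physical address would be widened through uintptr_t before being turned into a pointer, and only dereferenced if that range is actually mapped.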

kit-ty-kate commented 1 year ago

I've noticed this issue as well and it's fairly annoying. Is there a workaround by any chance? I wasn't able to understand the conclusion of the discussion above.