krisnova opened this issue 1 year ago
Is this about the connection between auraed on the host system and auraed on the guest (MicroVM)? There are two options:
With Firecracker, an Ethernet connection is probably the default connection between the host and a MicroVM. It was also the only option in the beginning, when AWS open sourced Firecracker. For an Ethernet connection, you need a tap device on the host. Via the Firecracker API, this tap device is connected to a virtio-net device attached to the MicroVM, where it appears as a NIC.
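For reference, the tap/virtio-net wiring described above looks roughly like this (a sketch, assuming root privileges on the host; the API socket path and device names are assumptions, not project conventions):

```shell
# Create a tap device on the host and bring it up.
ip tuntap add dev tap0 mode tap
ip link set dev tap0 up

# Attach the tap device to the MicroVM as a virtio-net NIC via the
# Firecracker API (PUT /network-interfaces/{iface_id}), assuming the
# Firecracker API socket lives at /tmp/firecracker.socket.
curl --unix-socket /tmp/firecracker.socket \
  -X PUT 'http://localhost/network-interfaces/eth0' \
  -H 'Content-Type: application/json' \
  -d '{
        "iface_id": "eth0",
        "host_dev_name": "tap0"
      }'
```

This must happen before the VM is started; the guest then sees a regular virtio-net NIC.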
A socket connection between host and MicroVM was added to Firecracker later on. It's probably not used by AWS in production. This works with virtio-vsock and provides a socket connection where one end resides on the host and the other end is within the VM. There are security concerns regarding this type of connection. I am no expert and cannot explain the exact problem or possible attack vector; I just wouldn't use it.
So, I would vote for using an Ethernet connection between the host and the VM. To connect to the nested auraed we can use IPv6 link-local addressing (fe80::/10 addresses). Those addresses are not routed, so a connection can only be established on the local link. We can decide on well-known link-local addresses to be assigned to auraed and nested auraed (e.g. fe80::1/64), or use stateless autoconfiguration and monitor the IPv6 neighbor discovery packets to find the addresses.
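A sketch of the well-known-address variant (interface names and the fe80::1/fe80::2 pair are assumptions; note that link-local addresses are ambiguous across interfaces, so connections from the host always need a zone identifier like %tap0):

```shell
# Host side: assign the agreed-upon link-local address to the tap device.
ip -6 addr add fe80::1/64 dev tap0

# Guest side (inside the MicroVM): assign the peer address to its NIC.
ip -6 addr add fe80::2/64 dev eth0

# The zone id (%tap0) selects which link the address refers to.
ping -6 -c 1 fe80::2%tap0

# Alternative: with stateless autoconfiguration, discover the guest's
# address by inspecting the host's IPv6 neighbor cache for that link.
ip -6 neigh show dev tap0
```

The nested auraed would then listen on its link-local address, and the host auraed dials fe80::2%tap0 (or whatever address neighbor discovery reveals).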
When you have a DPU that is controlling the host system, you will definitely have an Ethernet connection between the DPU and the host system. Some DPUs also provide an additional proprietary communication channel for control plane traffic. But the one thing you will always have is an Ethernet connection.
From a security perspective I think Ethernet connections are fine. Lots of people invest lots of time to harden the Linux kernel's Ethernet stack.
In my experiments with cloud-hypervisor I have also been using the tap/virtio approach over IPv6 link-local. Cloud-hypervisor additionally supports vhost-user-net (Firecracker doesn't), so that is something we could look at: https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/docs/device_model.md#vhost-user-net
My implementation in #506 uses the default virtio-net (instead of the vhost-user-net linked above) and it works great over link-local. I'm not as familiar with the additional networking requirements, but I believe we'd need some extra configuration to allow workloads within the VM to communicate externally.
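For that external connectivity, one common approach (a sketch, not a decision; the interface names and the fd00::/64 unique-local subnet are assumptions) is to give the guest a routable address alongside its link-local one and NAT its traffic on the host:

```shell
# Enable IPv6 forwarding on the host.
sysctl -w net.ipv6.conf.all.forwarding=1

# Give the tap device a routable (here: unique-local) address;
# the guest gets the peer address and a default route through it.
ip -6 addr add fd00::1/64 dev tap0
# (inside the guest) ip -6 addr add fd00::2/64 dev eth0
# (inside the guest) ip -6 route add default via fd00::1

# Masquerade guest traffic leaving the host's uplink (eth0 assumed).
nft add table ip6 nat
nft add chain ip6 nat postrouting '{ type nat hook postrouting priority 100 ; }'
nft add rule ip6 nat postrouting oifname "eth0" ip6 saddr fd00::/64 masquerade
```

Host-to-auraed control traffic could stay on link-local while only workload traffic uses the routable prefix.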
Given the recursive nature of hosting nested auraed instances in virtual guests, nested cells, and container isolation zones, we need to start understanding the connection details between the host and the guest. This issue suggests the following strategy:
By default we use IPv6 between the host and the guest, regardless of what the host is using externally. We take a strong stance against a vsock implementation and against exposing shared memory or file descriptor access to a guest.