ciao-project / ciao

Ciao - Cloud Integrated Advanced Orchestrator
Apache License 2.0

Networking: Adding support for SRIOV #1637

Open mcastelino opened 6 years ago

mcastelino commented 6 years ago

This issue attempts to capture the requirements, issues, and considerations for supporting SRIOV-based networking in ciao.

Current Status

ciao networking today is based on point-to-multipoint GRE tunnels connected to CNCIs. In this networking model each tenant gets a single overlay network (across a mesh of CNCIs). This single network spans containers as well as VMs, and all the networking properties are predetermined at the time the instance is created. This includes hostname, IP address, subnet, and routing. The only property that can be determined after instance creation is the external IP address assigned to the instance.

Adding SRIOV support

Adding SRIOV support from the point of view of libsnet is simple. Both the docker network plugin and the VNIC creation logic can be enhanced to support adding an additional SRIOV network interface to a container or a VM.

However, there are two issues.

Secondary network support

Today the controller can only manage a single network per tenant. The controller needs to be enhanced to support multiple networks per tenant, and to allow only some instances to have secondary networks.
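As a rough illustration of that requirement, the data model could look something like the sketch below. This is not existing controller code: `Tenant`, `Network`, `Instance`, and `NetworksFor` are hypothetical names chosen for the example, and the real controller types may be organised quite differently.

```go
package main

import "fmt"

// NetworkKind distinguishes the existing overlay from an SRIOV network.
type NetworkKind int

const (
	Overlay NetworkKind = iota // the single per-tenant overlay ciao manages today
	SRIOV                      // a secondary network backed by SRIOV
)

// Network is a hypothetical record for one tenant network.
type Network struct {
	ID   string
	Kind NetworkKind
}

// Tenant holds the primary overlay plus any number of secondary networks.
type Tenant struct {
	ID        string
	Primary   Network
	Secondary map[string]Network // keyed by network ID
}

// Instance names the secondary networks it opts into; most instances name none.
type Instance struct {
	ID        string
	Secondary []string
}

// NetworksFor returns the primary network followed by the secondary networks
// the instance asked for. It fails if the tenant does not own a requested network.
func (t *Tenant) NetworksFor(inst Instance) ([]Network, error) {
	nets := []Network{t.Primary}
	for _, id := range inst.Secondary {
		n, ok := t.Secondary[id]
		if !ok {
			return nil, fmt.Errorf("tenant %s has no secondary network %q", t.ID, id)
		}
		nets = append(nets, n)
	}
	return nets, nil
}

func main() {
	tenant := &Tenant{
		ID:        "tenant-a",
		Primary:   Network{ID: "overlay", Kind: Overlay},
		Secondary: map[string]Network{"sriov0": {ID: "sriov0", Kind: SRIOV}},
	}
	nets, err := tenant.NetworksFor(Instance{ID: "vm-1", Secondary: []string{"sriov0"}})
	fmt.Println(len(nets), err)
}
```

Keeping the primary network as a separate field means instances that request nothing behave exactly as they do today.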

IPAM

The second issue is that, in the case of SRIOV networks, the IPAM is normally DHCP. This means that IPAM is effectively delegated to the underlay, and that the IP address is determined at launch time rather than at instance creation. Unless the controller can control the DHCP server, the IP address of the SRIOV interface will only be known after the instance has launched.

In the case of VMs, it may not be possible for the ciao launcher to determine the IP address that was assigned to the VM (or whether one was assigned at all).

In the case of containers, the launcher can determine the IP address once the IPAM plugin has run and propagate the values back.
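One possible way for the launcher to read the address back is to inspect the interface once the IPAM plugin has finished. The sketch below is only an assumption about how that could be done; it does not describe the current launcher. `firstIPv4` is a hypothetical helper, and `main` walks the host's own interfaces purely so the example runs anywhere.

```go
package main

import (
	"fmt"
	"net"
)

// firstIPv4 returns the first non-loopback IPv4 address found in a list of
// CIDR strings such as those produced by net.Addr.String() on an interface.
func firstIPv4(cidrs []string) (string, bool) {
	for _, c := range cidrs {
		ip, _, err := net.ParseCIDR(c)
		if err != nil {
			continue // skip anything that is not in address/prefix form
		}
		if ip4 := ip.To4(); ip4 != nil && !ip4.IsLoopback() {
			return ip4.String(), true
		}
	}
	return "", false
}

func main() {
	ifaces, err := net.Interfaces()
	if err != nil {
		fmt.Println("cannot list interfaces:", err)
		return
	}
	for _, ifc := range ifaces {
		addrs, err := ifc.Addrs()
		if err != nil {
			continue
		}
		var cidrs []string
		for _, a := range addrs {
			cidrs = append(cidrs, a.String())
		}
		// An interface with no usable address yet is simply not reported.
		if ip, ok := firstIPv4(cidrs); ok {
			fmt.Printf("%s: %s\n", ifc.Name, ip)
		}
	}
}
```

The boolean result lets the caller distinguish "no address assigned yet" from a real address, which is the case the launcher would have to handle before reporting back.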

Migration

Ciao supports seamless migration of both containers and VMs across hosts. For this property to be preserved with SRIOV networks, the DHCP server needs to preserve the MAC address to IP address binding across migrations.
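The binding requirement can be illustrated with a minimal in-memory sketch, assuming the instance keeps the same MAC address when it moves. `LeaseTable` and its methods are hypothetical and do not correspond to any existing ciao or DHCP server API; a real server would also persist the table and handle lease expiry.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPoolExhausted is returned when no free address is left.
var ErrPoolExhausted = errors.New("address pool exhausted")

// LeaseTable is a hypothetical binding store keyed by MAC address.
type LeaseTable struct {
	free  []string          // addresses not yet handed out
	byMAC map[string]string // MAC address -> bound IP address
}

// NewLeaseTable copies the pool so the caller's slice is never modified.
func NewLeaseTable(pool []string) *LeaseTable {
	return &LeaseTable{
		free:  append([]string(nil), pool...),
		byMAC: make(map[string]string),
	}
}

// Lease returns the address already bound to mac, or binds the next free one.
// Because the lookup depends only on the MAC, the answer is the same no matter
// which host the request arrives from.
func (l *LeaseTable) Lease(mac string) (string, error) {
	if ip, ok := l.byMAC[mac]; ok {
		return ip, nil
	}
	if len(l.free) == 0 {
		return "", ErrPoolExhausted
	}
	ip := l.free[0]
	l.free = l.free[1:]
	l.byMAC[mac] = ip
	return ip, nil
}

func main() {
	lt := NewLeaseTable([]string{"192.0.2.10", "192.0.2.11"})
	before, _ := lt.Lease("02:00:00:00:00:01") // request from the source host
	after, _ := lt.Lease("02:00:00:00:00:01")  // request after migration
	fmt.Println(before, after)
}
```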

Potential solutions

The CNCI today runs a DHCP and DNS server on behalf of the tenant network. The CNCI could also be connected to the SRIOV network, provided each tenant's SRIOV network is placed in its own VLAN. This means that ciao can continue to manage the secondary SRIOV network.
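To make the per-tenant VLAN idea concrete, here is a minimal sketch of how VLAN IDs might be handed out. `VLANAllocator` and `VLANFor` are hypothetical names, the starting ID is an arbitrary choice for the example, and nothing here reflects an existing ciao component. The only fixed constraint is the 12-bit 802.1Q VLAN ID space, in which 0 and 4095 are reserved.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	minVLAN = 2    // assumption: skip VLAN 1, which many switches use as the default VLAN
	maxVLAN = 4094 // highest usable 802.1Q VLAN ID; 4095 is reserved
)

// VLANAllocator is a hypothetical mapping from tenant ID to a dedicated VLAN.
type VLANAllocator struct {
	next     int
	byTenant map[string]int
}

// NewVLANAllocator starts handing out IDs from minVLAN.
func NewVLANAllocator() *VLANAllocator {
	return &VLANAllocator{next: minVLAN, byTenant: make(map[string]int)}
}

// VLANFor returns the tenant's VLAN ID, assigning the next unused ID the
// first time a tenant is seen. IDs are never reused in this sketch.
func (a *VLANAllocator) VLANFor(tenant string) (int, error) {
	if id, ok := a.byTenant[tenant]; ok {
		return id, nil
	}
	if a.next > maxVLAN {
		return 0, errors.New("no free VLAN IDs")
	}
	id := a.next
	a.next++
	a.byTenant[tenant] = id
	return id, nil
}

func main() {
	alloc := NewVLANAllocator()
	for _, tenant := range []string{"tenant-a", "tenant-b", "tenant-a"} {
		id, err := alloc.VLANFor(tenant)
		fmt.Println(tenant, id, err)
	}
}
```

The ID space caps the number of tenants with their own SRIOV VLAN at roughly four thousand per underlay, which may matter when sizing a cluster.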

If the SRIOV network spans multiple tenants, then we can consider creating a CNCI just to manage DHCP and DNS for the entire cluster.

mcastelino commented 6 years ago

/cc @markdryan

jdandrea commented 6 years ago

Thanks for starting this thread!

Regarding multiple networks, I wonder if Multus would help in that regard? Or would there need to be a functional equivalent of Multus that works in CIAO-space?