Open markdryan opened 7 years ago
@tpepper Is this something that the scheduler already does? If not, would it be much work to do?
Currently the scheduler does not track this. If it needs to be tracked, it won't be hard. I would see a network node's network segment as yet another trackable resource that its launcher would report, and we can schedule based on it if a workload start request asks for a particular segment.
Do we support multiple network segments today?
Not sure. @mcastelino Any idea?
@markdryan @tpepper Yes, we do support multiple network segments. That is why the Ciao configuration supports multiple "Compute Networks". The networking layer will scan the machine and attach to the first one it sees.
So we do not support a single machine connected to multiple active network segments that serve the same function (management/compute). We just pick the first one. https://github.com/01org/ciao/blob/master/networking/libsnnet/cn.go#L190
Here cn.ComputeAddr & cn.ComputeLink will provide the details of the segment you attached to.
On the same lines we have https://github.com/01org/ciao/blob/master/networking/libsnnet/cnci.go#L96
In case of the CNCI it will always be on the same Network Compute segment as the NN (due to our use of macvtap).
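The "attach to the first one it sees" behaviour can be sketched roughly as follows. This is a minimal illustration, not the libsnnet code: pickComputeAddr and its arguments are hypothetical, and it assumes the scan walks the machine's addresses in order and stops at the first one that falls inside any configured compute network.

```go
package main

import (
	"fmt"
	"net"
)

// pickComputeAddr is a hypothetical helper (not part of libsnnet).
// It returns the first local address that lies inside any of the
// configured compute networks; later matches are ignored.
func pickComputeAddr(computeNets []string, localAddrs []string) (string, error) {
	var nets []*net.IPNet
	for _, c := range computeNets {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			return "", fmt.Errorf("invalid compute net %q: %v", c, err)
		}
		nets = append(nets, n)
	}
	for _, a := range localAddrs {
		ip := net.ParseIP(a)
		if ip == nil {
			continue // skip anything that is not a valid IP
		}
		for _, n := range nets {
			if n.Contains(ip) {
				return a, nil // first match wins
			}
		}
	}
	return "", fmt.Errorf("no local address on a configured compute network")
}

func main() {
	// Example values are made up for illustration.
	addr, err := pickComputeAddr(
		[]string{"192.168.0.0/24", "10.10.0.0/16"},
		[]string{"172.16.5.4", "10.10.3.7", "192.168.0.20"},
	)
	fmt.Println(addr, err)
}
```

With this approach a machine that has addresses on two configured compute networks only ever uses the first one found, which is the limitation described above.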
@markdryan @tpepper To elaborate a little more on what I mean by "we do not support multiple active segments": when we create tunnels, we pick the IP of the first Compute Segment we see as the Tunnel Src IP. https://github.com/01org/ciao/blob/master/networking/libsnnet/cn.go#L747 Hence, if the machine has multiple active segments, the migration will not succeed unless both sides of the tunnel are set up again.
@mcastelino I think from Mark's initial description in the issue "Compute Net" refers to the cluster configured "compute_net". That and "mgmt_net" are configurables under "launcher", as per the example in https://github.com/01org/ciao/blob/master/configuration/README.md. I believe today there is only one compute_net for the cluster and each compute and network node is required to be on it.
@tpepper As specified in the spec
compute_net: list [The launcher compute network(s)]
is a list of compute networks.
The compute network the CN or NN is on (one of the valid networks listed in the configuration) is reported back to the launcher. I do not know whether that information is carried back to the scheduler, but I assume it can be sent to the controller as part of the stats message https://github.com/01org/ciao/blob/master/payloads/stats.go#L100
I never knew that was a list. As it stands today, the scheduler / SSNTP server doesn't care, as any frames dealing with this field are simply passed through. I will extend the scheduler to record and track it, and to place the CNCI on a correct node if a net is requested in the start frame.
The scheduler needs to ensure that the CNCI is rescheduled to an NN in the same network segment. This can be done by checking the Compute Net IP of the Physical Node: it should choose an NN whose IP is on the same Compute Subnet.
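A rough sketch of that subnet check, assuming the scheduler has recorded the compute IP each network node reported. The node type and pickNetworkNode function are hypothetical names for illustration and do not exist in the ciao scheduler.

```go
package main

import (
	"fmt"
	"net"
)

// node is a hypothetical record of what a network node's launcher reported.
type node struct {
	ID        string
	ComputeIP string
}

// pickNetworkNode returns the ID of the first candidate whose compute IP
// lies inside the requested compute subnet (given in CIDR notation).
func pickNetworkNode(subnetCIDR string, candidates []node) (string, error) {
	_, subnet, err := net.ParseCIDR(subnetCIDR)
	if err != nil {
		return "", fmt.Errorf("invalid compute subnet %q: %v", subnetCIDR, err)
	}
	for _, n := range candidates {
		ip := net.ParseIP(n.ComputeIP)
		if ip != nil && subnet.Contains(ip) {
			return n.ID, nil
		}
	}
	return "", fmt.Errorf("no network node on compute subnet %s", subnetCIDR)
}

func main() {
	// Example values are made up for illustration.
	nodes := []node{
		{ID: "nn-a", ComputeIP: "10.20.0.5"},
		{ID: "nn-b", ComputeIP: "10.10.0.9"},
	}
	id, err := pickNetworkNode("10.10.0.0/16", nodes)
	fmt.Println(id, err)
}
```

Returning an error when nothing matches, rather than falling back to any available NN, keeps the CNCI from landing on a segment where its tunnels could not be re-established.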