chdxD1 opened this issue 4 months ago
I assume something like this will be needed (a `healthcheck` package, I think). Other ideas are: a `network-operator-master` pod; leader election could be used to elect a leader that performs the role of the central pod. The problem would be how to revert changes on a previously configured pod if the leader configures itself with an invalid config.

@p-strusiewiczsurmacki-mobica I have a few comments on the steps:
1. VRFs and L2VNIs are currently not rolled out on control-plane nodes, so it might make sense. However, it might be meaningful to have a separate set of controllers (with integrated leader election) that act as a "control plane" for the configuration.
2. I am not so sure about the gRPC endpoint. There are multiple ways to do it, of course; one is gRPC, the other would be decoupling it by using custom resources for the individual node configuration as well. The pod running on the node would watch for changes of its local `NodeConfiguration` and report the rollout status in the status fields. If a node disappears (is deleted from the cluster), the controller should also clean up leftover `NodeConfigurations`. The config could look like this (a Go sketch of the same resource follows after this list):
```yaml
apiVersion: network.schiff.telekom.de/v1alpha1
kind: NodeConfiguration
metadata:
  name: <node-name>
spec:
  vrfs: {}
  l2vnis: {}
  [...]
```
3. The network operator can save the working configuration on the local disk.
4. Yes, mostly implemented there; it might be a good idea to check e.g. the API server as well.
5. Yes.
6. See remarks regarding the status field in 2.
7. Yes.
8. Yes.
9. Yes.
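To make the `NodeConfiguration` idea from point 2 concrete, here is a minimal sketch as Kubebuilder-style Go types. Every field name and status value below is an illustrative assumption, not the actual network-operator API:

```go
// Minimal sketch of the NodeConfiguration resource from point 2 as
// Kubebuilder-style Go types. Every field name and status value here is
// an illustrative assumption, not the actual network-operator API.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NodeConfigurationSpec carries the full configuration rendered for one
// node; the object itself is named after the node it targets.
type NodeConfigurationSpec struct {
	VRFs   []string `json:"vrfs,omitempty"`   // placeholder for real VRF structs
	L2VNIs []string `json:"l2vnis,omitempty"` // placeholder for real L2VNI structs
}

// NodeConfigurationStatus is written by the operator pod on the node and
// is what the central controller waits for during a rollout.
type NodeConfigurationStatus struct {
	ConfigStatus string `json:"configStatus,omitempty"` // e.g. "valid" or "invalid"
	Message      string `json:"message,omitempty"`
}

// NodeConfiguration is the per-node resource from the YAML example above.
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
type NodeConfiguration struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   NodeConfigurationSpec   `json:"spec,omitempty"`
	Status NodeConfigurationStatus `json:"status,omitempty"`
}
```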
Regarding the other ideas:
@chdxD1 Just 2 more questions before I try to incorporate your comments into this:

1. Should the `NodeConfiguration` resource be created by the user, or should it be created by the controller out of the `vrfrouteconfigurations`, `layer2networkconfigurations` and `routingtables` resources? I assume the latter, but I just want to be sure.
2. Assuming `NodeConfiguration` will be created by the controller: let's say the user deploys a `vrfrouteconfigurations` resource with a selector that selects e.g. 3 nodes. The controller creates a `NodeConfiguration` for all 3 nodes; 2 of those nodes report success, but one fails. Should we revert the changes on the successful nodes?

EDIT: I've made this diagram quickly:
1. `network-operator-config-controller` pods watch the `vrfrouteconfigurations`, `layer2networkconfigurations` and `routingtables` resources.
2. The controller creates a `NodeConfiguration` for the first node, `worker-1`.
3. The network-operator on the worker watches its `NodeConfiguration` object.
4a. It gets the new config if available.
4b. After configuring the node it sets the `status` of the `NodeConfiguration` object to `valid` or `invalid`.
5. The controller watches the `NodeConfiguration` object. Once the `status` of the `NodeConfiguration` object is set, the node was processed, so the next `NodeConfiguration` object (for `worker-2`) is created.
6. The controller deletes `NodeConfiguration` objects if necessary (e.g. when a node was removed from the cluster).
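A rough Go sketch of this loop (steps 2-5), with stand-in types and function fields instead of the real Kubernetes plumbing; none of the names below come from the actual codebase:

```go
// Rough sketch of the diagram above: deploy one NodeConfiguration at a
// time and only move on once the node has reported a status. All types
// and helpers here are stand-ins, not the actual network-operator code.
package rollout

import (
	"context"
	"fmt"
)

// NodeConfig is a stand-in for the rendered NodeConfiguration resource.
type NodeConfig struct {
	Node string
	Spec string
}

// ConfigController is a stand-in for the central controller. The function
// fields abstract away the Kubernetes client plumbing.
type ConfigController struct {
	Render         func(node string) NodeConfig
	CreateOrUpdate func(ctx context.Context, cfg NodeConfig) error
	WaitForStatus  func(ctx context.Context, node string) (string, error)
}

// Rollout walks the nodes one at a time (steps 2-5) and stops as soon as
// a node reports an invalid configuration.
func (c *ConfigController) Rollout(ctx context.Context, nodes []string) error {
	for _, node := range nodes {
		cfg := c.Render(node) // built from vrfrouteconfigurations etc. (step 1)
		if err := c.CreateOrUpdate(ctx, cfg); err != nil {
			return err
		}
		// Step 5: block until the node's operator has set the status field.
		status, err := c.WaitForStatus(ctx, node)
		if err != nil {
			return err
		}
		if status == "invalid" {
			// Stop the rollout; whether to also revert the nodes that have
			// already succeeded is the open question from the comment above.
			return fmt.Errorf("node %s reported an invalid configuration", node)
		}
	}
	return nil
}
```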
@chdxD1 I've started working on this on Monday. I've created a draft PR so you can take a quick look if you have some spare time and tell me if I'm going in a good direction with this.
It is implemented mostly as described in my previous comment. Right now I need to make the `network-operator-config-controller` wait before it deploys the next config (so, basically, to do that gradually), and add some checks to be sure that the `NodeConfig` was really changed and should be deployed at all.
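For the changed-or-not check, one simple approach would be hashing the rendered spec and comparing digests; a sketch under that assumption (the real controller could just as well use a semantic deep-equal on the API objects):

```go
// One simple way to detect whether a rendered NodeConfig actually changed:
// hash the rendered spec and compare it with the hash of what the node is
// already running. This is a sketch, not the actual implementation.
package rollout

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// SpecHash returns a stable digest of a rendered spec (json.Marshal sorts
// map keys, so equal data yields equal digests).
func SpecHash(spec any) (string, error) {
	raw, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(raw)), nil
}

// Changed reports whether the freshly rendered spec differs from the one
// that is currently deployed on the node.
func Changed(deployed, rendered any) (bool, error) {
	a, err := SpecHash(deployed)
	if err != nil {
		return false, err
	}
	b, err := SpecHash(rendered)
	if err != nil {
		return false, err
	}
	return a != b, nil
}
```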
I have an additional question as well: what should we do with the mounted config file? Should it stay the way it is now, or should it also be part of the `NodeConfig`?
Regarding the mounted config file: it might become a CRD instead of a ConfigMap; however, it will be the same for all nodes.
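If it does become a CRD, a cluster-scoped object would match the "same for all nodes" property; a minimal sketch with assumed type and field names:

```go
// Sketch of the mounted config file lifted into a cluster-scoped CRD so
// that a single object serves all nodes. Type and field names are
// assumptions for illustration only.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// +kubebuilder:object:root=true
// +kubebuilder:resource:scope=Cluster
type OperatorConfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec OperatorConfigSpec `json:"spec,omitempty"`
}

// OperatorConfigSpec would hold whatever currently lives in the mounted
// file; a single placeholder field stands in for the real content.
type OperatorConfigSpec struct {
	Raw string `json:"raw,omitempty"` // placeholder
}
```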
OK, I think I've got most of it by now.
Before updating the config, the node configuration controller gets all currently deployed configs, and if any node fails, it reverts the changes on all the nodes. Currently it just stores that in memory, but I am wondering whether those should be saved as Kubernetes objects, in case the leader changes amidst the config update, so that the new leader can revert the changes if required.
I have just one more issue: the controller creates a `NodeConfig` using `vrfrouteconfigurations`, `layer2networkconfigurations` and `routingtables`. Now, if a `NodeConfig` is invalid it will get reverted to the last known working config, but the controller will still try to use the existing `vrfrouteconfigurations`, `layer2networkconfigurations` and `routingtables`, which will result in the invalid config being created and deployed over and over again.
The only workaround I can think of is to store the last invalid configs as objects named e.g. `<nodename>-invalid` (as I currently store the current configs under `<nodename>`), and then check each new config against it; if the new config results in one or more invalid configs, cease its deployment. Eventually those configs could be stored in a new CRD named `InvalidNodeConfig` for better clarity.
It's not really clean, but it should work, I think. What's your opinion?
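A small Go sketch of that guard, with a hypothetical lookup standing in for fetching the `<nodename>-invalid` object:

```go
// Sketch of the proposed guard against redeploying a known-bad config:
// before deploying, compare the freshly rendered spec with the last spec
// that failed on that node (stored e.g. as a "<nodename>-invalid" object
// or an InvalidNodeConfig CRD). Names and the lookup are hypothetical.
package rollout

// InvalidLookup fetches the last known invalid spec for a node; found is
// false if the node never rejected a config.
type InvalidLookup func(node string) (spec string, found bool, err error)

// ShouldDeploy returns false when the rendered spec is identical to the
// last spec marked invalid on that node, breaking the "render, deploy,
// fail, revert, render again" loop described above.
func ShouldDeploy(node, renderedSpec string, lookup InvalidLookup) (bool, error) {
	invalidSpec, found, err := lookup(node)
	if err != nil {
		return false, err
	}
	if !found {
		return true, nil
	}
	return renderedSpec != invalidSpec, nil
}
```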
In the current design the network-operators run independently from each other. Custom resources are read from the cluster and configured on the local node.
If a user applies a faulty configuration, this could render all nodes inoperable at roughly the same time. There is no rollback mechanism in place that reverts faulty configuration, and no gradual rollout of configuration.
I propose a 2-step approach:

1. Each network-operator on the node should be individually capable of rolling back the configuration to a previous, working state.
2. A central controller renders each node configuration individually (and in a gradual fashion), with each node reporting the rollout result in a `.status` part; this allows the controller to respond to failed node configurations by stopping any further rollout until the input resources are changed again.

Because the connection to the API server might be impacted by a faulty rollout, step two depends on step one.
For step two: the controller (or the K8s API server, via ownership relations) should also clean up node configurations (which can be written as a dedicated CRD) when a node leaves the cluster. As we are heavy users of Cluster API, this is required to not clog the API server with unnecessary resources.
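For step one, a minimal sketch of the node-local persistence a rollback could build on; the path and format are assumptions:

```go
// Sketch of step one: the operator on each node keeps the last config
// that passed its health checks on local disk, so it can roll back even
// when the API server is unreachable. Path and format are assumptions.
package rollout

import (
	"os"
)

const knownGoodPath = "/var/lib/network-operator/last-known-good.json"

// SaveKnownGood persists a config that passed all health checks. It
// writes to a temp file first and renames, so a crash mid-write cannot
// corrupt the backup.
func SaveKnownGood(raw []byte) error {
	tmp := knownGoodPath + ".tmp"
	if err := os.WriteFile(tmp, raw, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, knownGoodPath)
}

// LoadKnownGood returns the config to reapply after a failed rollout.
func LoadKnownGood() ([]byte, error) {
	return os.ReadFile(knownGoodPath)
}
```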