Closed ymartin-ovh closed 10 months ago
@ymartin-ovh, cali interfaces are our virtual interfaces that we create to manage the networking for a container. I imagine that you are seeing some time for them to come up for an init container because init containers are created then removed so we should be creating a cali interface for it so that it has connectivity, and then removing it when the init container is removed.
On our cluster deployment, we use Calico network policies to manage network flows. Can you tell me whether all network policies must be activated / enabled on a new pod deployment before the network interface is set to UP?
I don't think that network policies should be controlling the status of the network interfaces. Is there a specific reason that you've come to that conclusion? I would have thought that if we are looking at an init container, we would see some log messages like this since the interface should be removed when the container dies. @caseydavenport anything else you think we should check?
because init containers are created then removed so we should be creating a cali interface for it so that it has connectivity, and then removing it when the init container is removed.
One thing I'd say - the interface shouldn't be removed after the init container finishes. The netns is shared between the init containers and other containers.
The Calico CNI plugin doesn't set the new interface to oper UP until it has finished configuring it. This is expected, and the logs you're seeing indicate normal operation of Felix, which sees the interface as soon as it is created, before the CNI plugin has finished configuring it (you can see they are INFO and DEBUG level and not WARN or ERROR).
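If you want to watch the oper state yourself from the node, here is a minimal sketch that polls the kernel's standard sysfs file `/sys/class/net/<iface>/operstate`. This is not how Felix or the CNI plugin do it; the function names `read_operstate` and `wait_for_oper_up`, the timeout, and the poll interval are made up for illustration.

```python
import time
from pathlib import Path


def read_operstate(iface: str, sysfs_root: str = "/sys/class/net") -> str:
    """Return the kernel's operstate string for iface, or "missing" if it does not exist."""
    path = Path(sysfs_root) / iface / "operstate"
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        # The interface has not been created yet, or was already removed.
        return "missing"


def wait_for_oper_up(
    iface: str,
    timeout_s: float = 5.0,
    poll_s: float = 0.1,
    sysfs_root: str = "/sys/class/net",
) -> bool:
    """Poll until iface reports "up"; return False if timeout_s elapses first."""
    deadline = time.monotonic() + timeout_s
    while True:
        if read_operstate(iface, sysfs_root) == "up":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

For example, `wait_for_oper_up("cali89cc739dd58")` run on the node would report whether that interface reached oper UP within five seconds. The `sysfs_root` parameter is only there so the helper can be pointed at a test directory.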
Can you tell me whether all network policies must be activated / enabled on a new pod deployment before the network interface is set to UP?
To answer this specifically - no, we don't wait for all network policies to be enabled before setting oper UP. However, new interfaces will have no network connectivity until policies are applied, for a different reason: the default rules drop traffic until Felix finishes its programming.
@caseydavenport / @mgleung
Thanks for the feedback and details you gave me. I asked about interface state and network policy dependency because in my logs between 12:30:14 and 12:30:17, I see a lot of logs like:
[DEBUG][62] felix/rules.go 169: Hashed rule action=...
For now, I don't understand why the interface, which should be UP according to the first few log lines, is still in the down state:
2023-09-29 12:30:13.074 [DEBUG][62] felix/iface_monitor.go 350: Interface changed state ifIndex=181359 ifaceName="cali89cc739dd58" newState="up" oldState="down"
2023-09-29 12:30:14.530 [INFO][62] felix/endpoint_mgr.go 1283: Skipping configuration of interface because it is oper down. ifaceName="cali89cc739dd58"
logs about interface: cali89cc739dd58.log
[DEBUG][62] felix/rules.go 169: Hashed rule action=...
These logs just indicate the rules being programmed by Felix. They are at DEBUG level.
Felix is waiting until the CNI plugin marks the interface as UP before programming it.
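To line up the events for one interface, a small parser over the Felix log lines can help. This is a rough sketch that assumes every line has the same layout as the two lines quoted in this thread (timestamp, `[LEVEL][pid]`, source file, line number, message with `key="value"` fields). The names `parse_felix_line` and `events_for` are hypothetical, not part of Calico.

```python
import re
from datetime import datetime

# Layout assumed from the log lines quoted in this thread.
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) '
    r'\[(?P<level>\w+)\]\[\d+\] (?P<source>\S+) (?P<lineno>\d+): (?P<msg>.*)$'
)
# key=value or key="value" pairs inside the message.
FIELD_RE = re.compile(r'(\w+)="?([^"\s]+)"?')

SAMPLE = [
    '2023-09-29 12:30:13.074 [DEBUG][62] felix/iface_monitor.go 350: Interface changed state '
    'ifIndex=181359 ifaceName="cali89cc739dd58" newState="up" oldState="down"',
    '2023-09-29 12:30:14.530 [INFO][62] felix/endpoint_mgr.go 1283: Skipping configuration of '
    'interface because it is oper down. ifaceName="cali89cc739dd58"',
]


def parse_felix_line(line):
    """Parse one log line into a dict, or return None if it does not match the assumed layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "ts": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f"),
        "level": m["level"],
        "source": m["source"],
        "msg": m["msg"],
        "fields": dict(FIELD_RE.findall(m["msg"])),
    }


def events_for(lines, iface):
    """Return parsed events mentioning iface, sorted by timestamp."""
    parsed = (parse_felix_line(line) for line in lines)
    hits = [e for e in parsed if e and e["fields"].get("ifaceName") == iface]
    return sorted(hits, key=lambda e: e["ts"])
```

Running `events_for(SAMPLE, "cali89cc739dd58")` on the two lines above gives the state-change event first and the "Skipping configuration" message 1.456 seconds later, which makes the ordering easier to inspect across a full log file.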
@ymartin-ovh Closing this issue. Feel free to reach out if you have further questions.
I sometimes face connection issues on init containers. Looking at the traces I've got, I don't understand why cali* interfaces take time to come UP.
Logs filtered on cali89cc739dd58:
Those lines puzzle me:
On our cluster deployment, we use Calico network policies to manage network flows. Can you tell me whether all network policies must be activated / enabled on a new pod deployment before the network interface is set to UP?
Your Environment