Closed Pavani-Panakanti closed 1 month ago
The logic and the code behind lock looks good to me.
Another comment. Looking at the code itself, I do not see ANNOTATE_POD_IP
is to true is leading to the race, but irrespective the lock to avoid race seems alright.
We have to run the cyclonus test to ensure that we aren't introducing any regression with this change.
Thanks for the review @orsenthil. I added logs in the description. In logs before the change we can see the race condition where attach probes is happening twice. In logs after change, we can see the lock being taken by one thread and second one is waiting on the lock till first one completes
Regarding ANNOTATE_POD_IP, I added some details in the description. In strict mode IPAMD calls network agent to attach probes. NPA attaches the probes and returns success to IPAMD and IPAMD will mark pod as ready and running. The pod will be discovered by np controller only after pod enters ready state
When ANNOTATE_POD_IP=true, pod will be discovered by NP controller even before pod enters ready state. Cx usually set this to true for fast NP reconciliation. So in this case we are hitting the race condition. NPA from IPAMD call is in the process of attaching the ebpfprobes. Now NP controller reconcile request comes, it will check if probes are already attached. NP agent in other thread is in the process of attaching the probes and did not complete updating all internal datastructures. When reconcile request thread checks if probes are attached -> result is no and it will try to attach the probes. Both go routines are now happening parallely and end result is internal data structures not being updated correctly
Completed running cyclonus tests on the change. All tests passed
Issue #, if available:
When ANNOTATE_POD_IP = false and NETWORK_POLICY_ENFORCING_MODE=strict --> all works fine
ipamd will send a request to NPA to attach probes to the new pod. NP agent returns success after attaching the probes. Then ipamd will mark pod as ready and running. Once the pod is ready network policy controller will discover the pod and add it to the policy endpoints. When NPA reconciles the policy endpoints, it notices the probes are already attached and just updates the maps accordingly
When ANNOTATE_POD_IP = true and NETWORK_POLICY_ENFORCING_MODE=strict --> issue
With ANNOTATE_POD_IP = true, pod can be discovered and added to policy endpoint by controller even before pod is ready and running. So in this case, we are hitting a race condition where we are trying to attach probes to pod from two different go routines. One which started from ipamd request and other which started from reconcile policy endpoint request. Both routines are trying to attach the probes at same time and as a result the internal data structures are not being updated correctly. Due to this network policies are not being applied to the right map ids
Description of changes:
Have a lock per pod which will be used while attaching the probes to the pod. While attaching probes, take the lock and attach the probes and release the lock after updating all the data structures. In this case even if two go routines come at the same time, one will wait for the lock and start only after the first one has completed attaching the probes. So when second one checks if probes are already attached, it will be true and does nothing
Simulated the race condition with some sleeps. Logs before the change
Logs after the change
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.