aws / aws-network-policy-agent

Apache License 2.0
45 stars 29 forks source link

Fix race condition in strict mode #306

Closed Pavani-Panakanti closed 1 month ago

Pavani-Panakanti commented 2 months ago

Issue #, if available: Fixing a race condition where shared ebpf maps are being deleted in strict mode

Description of changes: Added new map to store progFD to pods mapping which will help us check if progFD's are being shared between pods

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

achevuru commented 1 month ago

Generic Q: How is the BPF prog delete going through when the said prog is still (loaded &) attached to another active pod interface on the same node?

Pavani-Panakanti commented 1 month ago

Generic Q: How is the BPF prog delete going through when the said prog is still (loaded &) attached to another active pod interface on the same node?

We are assuming it is not being shared by any other active pod by verifying policy endpoint list. But when a new pod is created and not yet reconciled in strict mode, we are attaching the probes and updating the maps but pod is not yet present in policy endpoint list https://github.com/aws/aws-network-policy-agent/blob/main/pkg/rpc/rpc_handler.go#L74

{"level":"info","ts":"2024-09-17T16:14:51.158Z","caller":"ebpf/bpf_client.go:724","msg":"This can be deleted, not needed anymore..."}
{"level":"info","ts":"2024-09-17T16:14:51.158Z","caller":"maps/loader.go:505","msg":"Delete map entry done with fd : 0 and err errno 0"}
jayanthvn commented 1 month ago

Nice to hear from you @achevuru :)

To summarize with Strict mode when we get pod 2 of the same replica while we have pod 1 on the node. We check there is pod 1 and reuse the FDs and update the local data structures i.e, pod -> FDs mapping. Now the issue happens if there is a pod 1 delete at the same time and the PE reconcile is out of order i.e, we get pod 1 delete in the first reconciliation loop and then pod 2 add in the second reconciliation loop. In the first reconciliation loop we don't do additional check (shared replicas) and mark the pod 1 and bpf file to be deleted. Now when second reconciliation is received we just check the local data structure i.e, the mapping and skip loading bpf prog/map..pod 2 ENI will have a probe with an invalid FD..

Pavani-Panakanti commented 1 month ago

We can do the cleanup post merge. We should also have a way to dump all these internal structures maybe via cli to debug any issues..

Agree. Logging these structs will help us in debugging. I will add a task and follow up on this