Describe the bug
The daemon fails to attach the XDP program to an interface. More likely, an XDP program was not unloaded at some earlier stage and is still attached.
To Reproduce
Steps to reproduce the behaviour:
Unknown so far. It occurred during the e2e tests in #173, but I couldn't reproduce it afterwards.
I opened this issue to track my investigation.
Logs (I failed to capture the earlier logs from when it first occurred):
1.6655759345163863e+09 INFO setup Version {"version.Version": "361d7226-dirty"}
I1012 11:58:55.567113 172749 request.go:682] Waited for 1.039010097s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/k8s.cni.cncf.io/v1?timeout=32s
1.6655759368196218e+09 INFO controller-runtime.metrics Metrics server is starting to listen {"addr": "127.0.0.1:39301"}
1.665575936819802e+09 INFO setup starting manager
1.6655759368200016e+09 INFO Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:39301"}
1.665575936820043e+09 INFO Starting server {"kind": "health probe", "addr": "127.0.0.1:39300"}
1.6655759368201237e+09 INFO Starting EventSource {"controller": "ingressnodefirewallnodestate", "controllerGroup": "ingressnodefirewall.openshift.io", "controllerKind": "IngressNodeFirewallNodeState", "source": "kind source: *v1alpha1.IngressNodeFirewallNodeState"}
1.6655759368201387e+09 INFO Starting Controller {"controller": "ingressnodefirewallnodestate", "controllerGroup": "ingressnodefirewall.openshift.io", "controllerKind": "IngressNodeFirewallNodeState"}
1.665575936921046e+09 INFO Starting workers {"controller": "ingressnodefirewallnodestate", "controllerGroup": "ingressnodefirewall.openshift.io", "controllerKind": "IngressNodeFirewallNodeState", "worker count": 1}
1.6655759568332734e+09 INFO controllers.IngressNodeFirewall Reconciling resource and programming bpf {"name": "worker-0.ostest.test.metalkube.org", "namespace": "openshift-ingress-node-firewall"}
1.6655759568333015e+09 INFO controllers.IngressNodeFirewall.syncIngressNodeFirewallResources Running sync operation {"ifaceIngressRules": {"genev_sys_6081":[{"sourceCIDRs":["10.129.2.43/32"],"rules":[{"order":1,"protocolConfig":{"protocol":"TCP","tcp":{"ports":"80"}},"action":"Deny"},{"order":2,"protocolConfig":{"protocol":"UDP","udp":{"ports":"80"}},"action":"Deny"}]},{"sourceCIDRs":["fd01:0:0:6::2b/128"],"rules":[{"order":1,"protocolConfig":{"protocol":"TCP","tcp":{"ports":"80"}},"action":"Deny"},{"order":2,"protocolConfig":{"protocol":"UDP","udp":{"ports":"80"}},"action":"Deny"}]}]}, "isDelete": false}
1.6655759568334153e+09 INFO controllers.IngressNodeFirewall Creating a new eBPF firewall node controller
I1012 11:59:16.879072 172749 ingress_node_firewall_loader.go:327] Loading interfaces from pinned dir into memory
2022/10/12 11:59:16 Listening for events..
1.6655759568793685e+09 INFO controllers.IngressNodeFirewall Comparing currently managed interfaces against list of XDP interfaces on system {"e.managedInterfaces": {}}
1.6655759568797479e+09 INFO controllers.IngressNodeFirewall Attaching firewall interface {"intf": "genev_sys_6081"}
1.6655759568798752e+09 ERROR controllers.IngressNodeFirewall Fail to attach ingress firewall prog {"error": "could not attach XDP program: create link: device or resource busy", "errorCauses": [{"error": "could not attach XDP program: create link: device or resource busy"}]}
github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer.(*ebpfSingleton).attachNewInterfaces.func2
/go/src/github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer/ebpfsyncer.go:187
k8s.io/client-go/util/retry.OnError.func1
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/client-go/util/retry/util.go:51
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:222
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:235
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:228
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:423
k8s.io/client-go/util/retry.OnError
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/client-go/util/retry/util.go:50
github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer.(*ebpfSingleton).attachNewInterfaces
/go/src/github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer/ebpfsyncer.go:179
github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer.(*ebpfSingleton).SyncInterfaceIngressRules
/go/src/github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer/ebpfsyncer.go:102
github.com/openshift/ingress-node-firewall/controllers.(*IngressNodeFirewallNodeStateReconciler).reconcileResource
/go/src/github.com/openshift/ingress-node-firewall/controllers/ingressnodefirewallnodestate_controller.go:94
github.com/openshift/ingress-node-firewall/controllers.(*IngressNodeFirewallNodeStateReconciler).Reconcile
/go/src/github.com/openshift/ingress-node-firewall/controllers/ingressnodefirewallnodestate_controller.go:77
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/src/github.com/openshift/ingress-node-firewall/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:121
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/src/github.com/openshift/ingress-node-firewall/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:320
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/src/github.com/openshift/ingress-node-firewall/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/src/github.com/openshift/ingress-node-firewall/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234
1.6655759568910933e+09 INFO controllers.IngressNodeFirewall Attaching firewall interface {"intf": "genev_sys_6081"}
1.6655759568913753e+09 ERROR controllers.IngressNodeFirewall Fail to attach ingress firewall prog {"error": "could not attach XDP program: create link: device or resource busy", "errorCauses": [{"error": "could not attach XDP program: create link: device or resource busy"}]}
github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer.(*ebpfSingleton).attachNewInterfaces.func2
/go/src/github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer/ebpfsyncer.go:187
k8s.io/client-go/util/retry.OnError.func1
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/client-go/util/retry/util.go:51
k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:222
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:235
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:228
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:423
k8s.io/client-go/util/retry.OnError
/go/src/github.com/openshift/ingress-node-firewall/vendor/k8s.io/client-go/util/retry/util.go:50
github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer.(*ebpfSingleton).attachNewInterfaces
/go/src/github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer/ebpfsyncer.go:179
github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer.(*ebpfSingleton).SyncInterfaceIngressRules
/go/src/github.com/openshift/ingress-node-firewall/pkg/ebpfsyncer/ebpfsyncer.go:102
github.com/openshift/ingress-node-firewall/controllers.(*IngressNodeFirewallNodeStateReconciler).reconcileResource
/go/src/github.com/openshift/ingress-node-firewall/controllers/ingressnodefirewallnodestate_controller.go:94
github.com/openshift/ingress-node-firewall/controllers.(*IngressNodeFirewallNodeStateReconciler).Reconcile
/go/src/github.com/openshift/ingress-node-firewall/controllers/ingressnodefirewallnodestate_controller.go:77
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/src/github.com/openshift/ingress-node-firewall/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:121
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/src/github.com/openshift/ingress-node-firewall/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:320
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/src/github.com/openshift/ingress-node-firewall/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/src/github.com/openshift/ingress-node-firewall/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234
.....
The same attach attempt and error keep repeating.