Azure / azure-container-networking

Azure Container Networking Solutions for Linux and Windows Containers
MIT License
378 stars 241 forks source link

[backport] fix: [NPM] [Linux] improve iptables version detection and cleanup #3110

Closed huntergregory closed 1 week ago

huntergregory commented 1 week ago

Backport #3090 and add a fix per cd4cfcf7416b184b56fa7ca1aa374d5b719d1044

cd4cfcf7416b184b56fa7ca1aa374d5b719d1044 causes NPM to crash if it fails to detect which iptables version kube-proxy is using (whether it fails since the kube chains don't exist or for any other reason).

Without cd4cfcf7416b184b56fa7ca1aa374d5b719d1044, #3090 would introduce an issue where NPM could use nft when it should use legacy (if iptables -nL failed for whatever reason, or if kube-proxy somehow hadn't installed its chains yet).

huntergregory commented 1 week ago

/azp run Azure Container Networking PR

azure-pipelines[bot] commented 1 week ago
Azure Pipelines successfully started running 1 pipeline(s).
huntergregory commented 1 week ago

Manual test of crash logic:

I1107 02:53:59.943767       1 chain-management_linux.go:253] first attempt detecting iptables version. looking for hint/canary chain in iptables-nft
I1107 02:53:59.943774       1 chain-management_linux.go:523] executing iptables command [iptables-nft] with args [-w 60 -L FAKE-KUBE-IPTABLES-HINT -t mangle -n]
I1107 02:53:59.946813       1 chain-management_linux.go:523] executing iptables command [iptables-nft] with args [-w 60 -L FAKE-KUBE-KUBELET-CANARY -t mangle -n]
2024/11/07 02:53:59 [1] error: There was an error running command: [iptables-nft -w 60 -L FAKE-KUBE-IPTABLES-HINT -t mangle -n] Stderr: [exit status 1, # Warning: iptables-legacy tables present, use iptables-legacy to see them
iptables: No chain/target/match by that name.]
2024/11/07 02:53:59 [1] error: There was an error running command: [iptables-nft -w 60 -L FAKE-KUBE-KUBELET-CANARY -t mangle -n] Stderr: [exit status 1, # Warning: iptables-legacy tables present, use iptables-legacy to see them
iptables: No chain/target/match by that name.]
I1107 02:53:59.948626       1 chain-management_linux.go:259] second attempt detecting iptables version. looking for hint/canary chain in iptables-legacy
I1107 02:53:59.948632       1 chain-management_linux.go:523] executing iptables command [iptables] with args [-w 60 -L FAKE-KUBE-IPTABLES-HINT -t mangle -n]
2024/11/07 02:53:59 [1] error: There was an error running command: [iptables -w 60 -L FAKE-KUBE-IPTABLES-HINT -t mangle -n] Stderr: [exit status 1, iptables: No chain/target/match by that name.]
I1107 02:53:59.952128       1 chain-management_linux.go:523] executing iptables command [iptables] with args [-w 60 -L FAKE-KUBE-KUBELET-CANARY -t mangle -n]
E1107 02:53:59.955719       1 dataplane.go:118] Failed to reset dataplane: Operation [BootupDataplane] failed with error code [999], full cmd [], full error failed to reset policy dataplane: Operation [BootupPolicyManager] failed with error code [999], full cmd [], full error failed to bootup policy manager: failed to detect iptables version: unable to locate which iptables version kube proxy is using
Error: failed to create dataplane with error Operation [BootupDataplane] failed with error code [999], full cmd [], full error failed to reset policy dataplane: Operation [BootupPolicyManager] failed with error code [999], full cmd [], full error failed to bootup policy manager: failed to detect iptables version: unable to locate which iptables version kube proxy is using
Usage:
  azure-npm start [flags]

Flags:
  -h, --help                help for start
      --kubeconfig string   path to kubeconfig

Error: failed to create dataplane with error Operation [BootupDataplane] failed with error code [999], full cmd [], full error failed to reset policy dataplane: Operation [BootupPolicyManager] failed with error code [999], full cmd [], full error failed to bootup policy manager: failed to detect iptables version: unable to locate which iptables version kube proxy is using
2024/11/07 02:53:59 [1] error: There was an error running command: [iptables -w 60 -L FAKE-KUBE-KUBELET-CANARY -t mangle -n] Stderr: [exit status 1, iptables: No chain/target/match by that name.]
2024/11/07 02:53:59 [1] error: failed to bootup policy manager: failed to detect iptables version: unable to locate which iptables version kube proxy is using
2024/11/07 02:53:59 [1] error: failed to create dataplane with error Operation [BootupDataplane] failed with error code [999], full cmd [], full error failed to reset policy dataplane: Operation [BootupPolicyManager] failed with error code [999], full cmd [], full error failed to bootup policy manager: failed to detect iptables version: unable to locate which iptables version kube proxy is using
huntergregory commented 1 week ago

Other detection/cleanup logic works for this NPM image built on release/v1.5 branch