cilium / cilium-cli

CLI to install, manage & troubleshoot Kubernetes clusters running Cilium
https://cilium.io
Apache License 2.0

connectivity: Introduce BGP CP connectivity tests #2649

Closed rastislavs closed 2 months ago

rastislavs commented 3 months ago

Introduces BGP Control Plane connectivity tests, which work with an FRR router instance running in the host network namespace on the node-without-cilium.

The aim of this PR is mostly to introduce the BGP + FRR testing infrastructure; more testing scenarios need to be added in follow-up PRs. At the moment, we are testing:

Example run with debug enabled:

```
[=] [cilium-test] Test [bgp-control-plane-v1] [80/84]
[.] Action [bgp-control-plane-v1/bgpv1-advertisements/curl-echo-pod-ipv4-0: cilium-test/echo-external-node-89864b5bd-4lzt4 (172.22.0.4) -> cilium-test/echo-same-node-7f896b84-sbqv4 (10.244.1.188:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.244.1.188:8080]
.
[.] Action [bgp-control-plane-v1/bgpv1-advertisements/curl-echo-pod-ipv4-1: cilium-test/echo-external-node-89864b5bd-4lzt4 (172.22.0.4) -> cilium-test/echo-other-node-58999bbffd-qsrq8 (10.244.0.254:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.244.0.254:8080]
[.] Action [bgp-control-plane-v1/bgpv1-advertisements/curl-echo-service-ipv4-0: cilium-test/echo-external-node-89864b5bd-4lzt4 (172.22.0.4) -> cilium-test/echo-same-node (10.96.136.144:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.96.136.144:8080]
..
[.] Action [bgp-control-plane-v1/bgpv1-advertisements/curl-echo-service-ipv4-1: cilium-test/echo-external-node-89864b5bd-4lzt4 (172.22.0.4) -> cilium-test/echo-other-node (10.96.194.250:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.96.194.250:8080]
.
[.] Action [bgp-control-plane-v1/bgpv1-advertisements/curl-echo-pod-ipv6-0: cilium-test/echo-external-node-89864b5bd-4lzt4 (fc00:c111::4) -> cilium-test/echo-other-node-58999bbffd-qsrq8 (fd00:10:244::9227:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://[fd00:10:244::9227]:8080]
.
[.] Action [bgp-control-plane-v1/bgpv1-advertisements/curl-echo-pod-ipv6-1: cilium-test/echo-external-node-89864b5bd-4lzt4 (fc00:c111::4) -> cilium-test/echo-same-node-7f896b84-sbqv4 (fd00:10:244:1::1a8:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://[fd00:10:244:1::1a8]:8080]
[.] Action [bgp-control-plane-v1/bgpv1-advertisements/curl-echo-service-ipv6-0: cilium-test/echo-external-node-89864b5bd-4lzt4 (fc00:c111::4) -> cilium-test/echo-same-node (fd00:10:96::4567:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://[fd00:10:96::4567]:8080]
..
[.] Action [bgp-control-plane-v1/bgpv1-advertisements/curl-echo-service-ipv6-1: cilium-test/echo-external-node-89864b5bd-4lzt4 (fc00:c111::4) -> cilium-test/echo-other-node (fd00:10:96::4763:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://[fd00:10:96::4763]:8080]
.
🐛 Finalizing Test bgp-control-plane-v1
[-] Scenario [bgp-control-plane-v2/bgpv2-advertisements]
[=] [cilium-test] Test [bgp-control-plane-v2] [81/84]
[.] Action [bgp-control-plane-v2/bgpv2-advertisements/curl-echo-pod-ipv4-0: cilium-test/echo-external-node-89864b5bd-4lzt4 (172.22.0.4) -> cilium-test/echo-other-node-58999bbffd-qsrq8 (10.244.0.254:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.244.0.254:8080]
.
[.] Action [bgp-control-plane-v2/bgpv2-advertisements/curl-echo-pod-ipv4-1: cilium-test/echo-external-node-89864b5bd-4lzt4 (172.22.0.4) -> cilium-test/echo-same-node-7f896b84-sbqv4 (10.244.1.188:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.244.1.188:8080]
.
[.] Action [bgp-control-plane-v2/bgpv2-advertisements/curl-echo-service-ipv4-0: cilium-test/echo-external-node-89864b5bd-4lzt4 (172.22.0.4) -> cilium-test/echo-other-node (10.96.194.250:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.96.194.250:8080]
.
[.] Action [bgp-control-plane-v2/bgpv2-advertisements/curl-echo-service-ipv4-1: cilium-test/echo-external-node-89864b5bd-4lzt4 (172.22.0.4) -> cilium-test/echo-same-node (10.96.136.144:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://10.96.136.144:8080]
.
[.] Action [bgp-control-plane-v2/bgpv2-advertisements/curl-echo-pod-ipv6-0: cilium-test/echo-external-node-89864b5bd-4lzt4 (fc00:c111::4) -> cilium-test/echo-other-node-58999bbffd-qsrq8 (fd00:10:244::9227:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://[fd00:10:244::9227]:8080]
.
[.] Action [bgp-control-plane-v2/bgpv2-advertisements/curl-echo-pod-ipv6-1: cilium-test/echo-external-node-89864b5bd-4lzt4 (fc00:c111::4) -> cilium-test/echo-same-node-7f896b84-sbqv4 (fd00:10:244:1::1a8:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://[fd00:10:244:1::1a8]:8080]
.
[.] Action [bgp-control-plane-v2/bgpv2-advertisements/curl-echo-service-ipv6-0: cilium-test/echo-external-node-89864b5bd-4lzt4 (fc00:c111::4) -> cilium-test/echo-same-node (fd00:10:96::4567:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://[fd00:10:96::4567]:8080]
.
[.] Action [bgp-control-plane-v2/bgpv2-advertisements/curl-echo-service-ipv6-1: cilium-test/echo-external-node-89864b5bd-4lzt4 (fc00:c111::4) -> cilium-test/echo-other-node (fd00:10:96::4763:8080)]
🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://[fd00:10:96::4763]:8080]
.
🐛 Finalizing Test bgp-control-plane-v2
```
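For readers who want to reproduce one of the probes from the log by hand, here is a minimal Go sketch of how such a curl argument list could be assembled. The flags and URL shapes are copied from the debug output above; the helper name `curlArgs` and its signature are hypothetical and are not taken from the cilium-cli code base. The main point it illustrates is that IPv6 targets need brackets in the URL, which `net.JoinHostPort` handles.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// curlArgs is a hypothetical helper (not part of cilium-cli) that builds
// the curl argument list seen in the debug log for a given target IP and port.
func curlArgs(ip string, port int) []string {
	// net.JoinHostPort wraps IPv6 literals in brackets, as URLs require.
	url := "http://" + net.JoinHostPort(ip, strconv.Itoa(port))
	return []string{
		"curl",
		"-w", "%{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code}",
		"--silent", "--fail", "--show-error",
		"--output", "/dev/null",
		"--connect-timeout", "2",
		"--max-time", "10",
		url,
	}
}

func main() {
	// One IPv4 and one IPv6 pod address taken from the example run.
	for _, ip := range []string{"10.244.1.188", "fd00:10:244::9227"} {
		args := curlArgs(ip, 8080)
		fmt.Println(args[len(args)-1]) // print only the resulting URL
	}
}
```

Running it prints the two URLs in the same form as they appear in the log, one plain and one bracketed.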

Example e2e job: https://github.com/cilium/cilium/actions/runs/9778585998/job/26995747615

rastislavs commented 2 months ago

@YutaroHayakawa

> Is it possible to collect the FRR's running state on test failure? Otherwise, it's a bit hard to investigate the failure.

I added a `DumpFRRBGPState` function that dumps the BGP state into the log if the test fails.