cilium / cilium

eBPF-based Networking, Security, and Observability
https://cilium.io
Apache License 2.0
19.83k stars 2.91k forks source link

Cosmetic: BGPv1 Control Plane accepts BGP Paths that have no effect #32826

Closed dswaffordcw closed 3 months ago

dswaffordcw commented 3 months ago

Is there an existing issue for this?

What happened?

This is a cosmetic bug report, for the most part.

Cilium's BGPv1 Control Plane accepts all BGP Paths advertised to it. However, these paths do not program the datapath. It is my understanding that they are simply ignored.

This leads me to ask -- why does Cilium accept them?

While troubleshooting Cilium's BGP stack using cilium bgp routes, I find it confusing to see any received routes/Paths:

$  cilium bgp routes --node bgp-cplane-dev-v4-worker
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node                       VRouter   Prefix             NextHop    Age         Attrs
bgp-cplane-dev-v4-worker   65002     100.100.100.0/24   10.0.2.1   2m6s        [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     100.100.110.0/24   10.0.2.1   19s         [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     100.100.120.0/24   10.0.2.1   18s         [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     100.100.130.0/24   10.0.2.1   16s         [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     100.100.140.0/24   10.0.2.1   14s         [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     100.64.12.1/32     0.0.0.0    173h9m55s   [{Origin: i} {Nexthop: 0.0.0.0}]
                           65002     200.200.200.0/24   10.0.2.1   2m0s        [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     200.200.210.0/24   10.0.2.1   9s          [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     200.200.220.0/24   10.0.2.1   7s          [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     200.200.230.0/24   10.0.2.1   5s          [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]
                           65002     200.200.240.0/24   10.0.2.1   4s          [{Origin: i} {AsPath: 65000} {Nexthop: 10.0.2.1} {Med: 0}]

For this example, I am running Cilium's BGP development setup (make kind-bgp-v4) with the changes below applied to the FRR container:

bash-5.1# ip route add 100.100.100.0/24 dev lo
bash-5.1# ip route add 100.100.110.0/24 dev lo
bash-5.1# ip route add 100.100.120.0/24 dev lo
bash-5.1# ip route add 100.100.130.0/24 dev lo
bash-5.1# ip route add 100.100.140.0/24 dev lo
bash-5.1# ip route add 200.200.200.0/24 dev lo
bash-5.1# ip route add 200.200.210.0/24 dev lo
bash-5.1# ip route add 200.200.220.0/24 dev lo
bash-5.1# ip route add 200.200.230.0/24 dev lo
bash-5.1# ip route add 200.200.240.0/24 dev lo

vtysh
router0# conf t
router0(config)# router bgp 65000
router0(config-router)# network 100.100.100.0/24
router0(config-router)# network 100.100.110.0/24
router0(config-router)# network 100.100.120.0/24
router0(config-router)# network 100.100.130.0/24
router0(config-router)# network 100.100.140.0/24
router0(config-router)# network 200.200.200.0/24
router0(config-router)# network 200.200.210.0/24
router0(config-router)# network 200.200.220.0/24
router0(config-router)# network 200.200.230.0/24
router0(config-router)# network 200.200.240.0/24

Cilium Version

cilium-cli: v0.16.7-40-g9316d0ac compiled with go1.22.2 on linux/amd64 cilium image (default): v1.15.5 cilium image (stable): v1.15.5 cilium image (running): 1.16.0-dev

Kernel Version

$ uname -a Linux hostname 6.5.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 12 10:22:43 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.15-enhanced-describe-dirty", GitCommit:"ac2e2baa7d4039cc4c68f2e869e4edbe2d60b305", GitTreeState:"dirty", BuildDate:"2023-03-02T00:33:46Z", GoVersion:"go1.20.1", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"30", GitVersion:"v1.30.0", GitCommit:"7c48c2bd72b9bf5c44d21d7338cc7bea77d0ad2a", GitTreeState:"clean", BuildDate:"2024-05-13T22:00:36Z", GoVersion:"go1.22.2", Compiler:"gc", Platform:"linux/amd64"}

Regression

No

Sysdump

No response

Relevant log output

No response

Anything else?

No response

Cilium Users Document

Code of Conduct

dswaffordcw commented 3 months ago

Note, I say "Cosmetic for the most part", because I've seen GoBGP be unstable when receiving upwards of 1 million routes (full Internet tables). I know that's not an expected use case, but by rejecting all received Paths, we reduce that potential failure point.

dswaffordcw commented 3 months ago

@harsimran-pabla I'd like to put a PR for this. I haven't started though. Let me know if you think it's a good change to make.

harsimran-pabla commented 3 months ago

@harsimran-pabla I'd like to put a PR for this. I haven't started though. Let me know if you think it's a good change to make.

Yes, please go ahead if you have a clear idea about the implementation.