projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

BGP listen port is not changed. (calico v3.16.3) #4103

Closed suiny closed 3 years ago

suiny commented 4 years ago

This is the issue I asked about on Slack (https://calicousers.slack.com/archives/CPTH1KS00/p1602833003052900), and it is related to #4098.

  1. Calico has been updated to version 3.16.3
    [root@master ~]# docker images|grep calico
    calico/node                                        v3.16.3             f0d3b0d0e32c        6 days ago          164MB
    calico/pod2daemon-flexvol                          v3.16.3             a0b97353aa18        6 days ago          22.9MB
    calico/cni                                         v3.16.3             fe49caa20c30        6 days ago          133MB
    [root@master ~]# calicoctl version
    Client Version:    v3.16.3
    Git commit:        7d066703
    Cluster Version:   v3.15.2
    Cluster Type:      kubespray,bgp,kubeadm,k8s
  2. I wanted to apply a BGPConfiguration spec that changes the BGP listen port, but it did not take effect even after restarting the calico-node pod.
    [root@master ~]# calicoctl get BGPConfiguration default -o yaml --export
    apiVersion: projectcalico.org/v3
    kind: BGPConfiguration
    metadata:
      creationTimestamp: "2020-09-09T10:08:42Z"
      name: default
      resourceVersion: "11573515"
      uid: 0ef5e0ef-db19-4afa-a906-1a7c95d7aaab
    spec:
      asNumber: 64512
      listenPort: 11178
      logSeverityScreen: Info
      nodeToNodeMeshEnabled: true
    [root@master ~]# ps -ef|grep bird
    root      53074  53022  0 17:36 ?        00:00:00 runsv bird6
    root      53075  53022  0 17:36 ?        00:00:00 runsv bird
    root      53214  53074  0 17:36 ?        00:00:00 bird6 -R -s /var/run/calico/bird6.ctl -d -c /etc/calico/confd/config/bird6.cfg
    root      53215  53075  0 17:36 ?        00:00:00 bird -R -s /var/run/calico/bird.ctl -d -c /etc/calico/confd/config/bird.cfg
    root      60956  55828  0 17:43 pts/0    00:00:00 grep --color=auto bird
    [root@master ~]# netstat -nap |grep LISTEN|grep bird
    tcp        0      0 0.0.0.0:179             0.0.0.0:*               LISTEN      53215/bird          
    unix  2      [ ACC ]     STREAM     LISTENING     228609567 53215/bird           /var/run/calico/bird.ctl
    unix  2      [ ACC ]     STREAM     LISTENING     228610782 53214/bird6          /var/run/calico/bird6.ctl

So I tried to apply it manually, but it didn't work either.

[root@worker /]# echo 'listen bgp port 11179' >> /etc/calico/confd/config/bird.cfg
[root@worker /]# grep 11179 /etc/calico/confd/config/bird.cfg
listen bgp port 11179
[root@worker /]# sv hup bird
[root@worker /]# exit
exit
command terminated with exit code 127
[root@master ~]# netstat -nap|grep LISTEN|grep bird
tcp        0      0 0.0.0.0:179             0.0.0.0:*               LISTEN      121692/bird    

Expected Behavior

Change BGP listen port. (179->11179)

Current Behavior

BGP listen port is not changed.
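
For anyone repeating this check, the listening port can be pulled out of the netstat output rather than read by eye. This is a minimal sketch, not part of Calico: `bird_listen_ports` is a hypothetical helper name, and it assumes the `netstat -nap` column layout shown above (state in column 6, `PID/program` in column 7).

```shell
# Print the TCP port(s) that bird or bird6 is listening on.
# Reads `netstat -nap`-style output on stdin.
# bird_listen_ports is a hypothetical helper, not a Calico tool.
bird_listen_ports() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" && $7 ~ /\/bird6?$/ {
        # Local address looks like 0.0.0.0:179 or :::179; keep the last part.
        n = split($4, parts, ":")
        print parts[n]
    }' | sort -un
}

# Typical use on a node (root is needed to see process names):
#   netstat -nap | bird_listen_ports
```

If the change had taken effect, this would print the configured port instead of 179.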

suiny commented 4 years ago

Can anyone answer? :(

song-jiang commented 3 years ago

@neiljerram Could you look into this?

nelljerram commented 3 years ago

@song-jiang @suiny Sure, I've started to take a look. Please be patient though, as I don't see a really obvious cause here!

nelljerram commented 3 years ago

@suiny Sorry for such a long hiatus. Is this issue still live for you?

So I tried to apply it manually, but it didn't work either.

I think you need a semicolon at the end of listen bgp port 11179.

I recommend birdcl configure instead of sv hup bird, as birdcl will tell you if the new configuration is valid.
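
The semicolon point can be checked mechanically before reloading. The sketch below is only a heuristic text check, not a BIRD parser: `check_listen_stmt` is a hypothetical helper name, and it assumes each `listen bgp` statement sits on a single line.

```shell
# Report `listen bgp` statements that lack the terminating semicolon.
# check_listen_stmt is a hypothetical helper. It exits non-zero if such a
# line is found, or if the file contains no `listen bgp` statement at all.
check_listen_stmt() {
    cfg=$1
    awk '
        BEGIN { found = 0; bad = 0 }
        # Drop trailing comments and whitespace before inspecting the line.
        { sub(/#.*/, ""); sub(/[ \t]+$/, "") }
        /^[ \t]*listen[ \t]+bgp/ {
            found = 1
            if ($0 !~ /;$/) {
                printf "line %d: missing semicolon: %s\n", NR, $0
                bad = 1
            }
        }
        END {
            if (!found) { print "no listen bgp statement found"; bad = 1 }
            exit bad
        }
    ' "$cfg"
}

# Inside the calico-node container, using the paths from the ps output above:
#   check_listen_stmt /etc/calico/confd/config/bird.cfg
#   birdcl -s /var/run/calico/bird.ctl configure
```

The socket path passed to `birdcl -s` is taken from the `bird -s` argument in the process listing; BIRD's own response to `configure` remains the authoritative check.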

calicoctl get BGPConfiguration default

When you use BGPConfiguration to set the listen port, can you use cat /etc/calico/confd/config/bird.cfg to show us the BIRD config that you have?

suiny commented 3 years ago

@neiljerram Thanks for the reply. I tried what you suggested, but the listen port (179 -> 11179) still doesn't change.

# Generated by confd
include "bird_aggr.cfg";
include "bird_ipam.cfg";

router id 10.0.0.1;

# Configure synchronization between routing tables and kernel.
protocol kernel {
  learn;             # Learn all alien routes from the kernel
  persist;           # Don't remove routes on bird shutdown
  scan time 2;       # Scan kernel routing table every 2 seconds
  import all;
  export filter calico_kernel_programming; # Default is export none
  graceful restart;  # Turn on graceful restart to reduce potential flaps in
                     # routes when reloading BIRD configuration. With a full
                     # automatic mesh, there is no way to prevent BGP from
                     # flapping since multiple nodes update their BGP
                     # configuration at the same time, GR is not guaranteed to
                     # work correctly in this scenario.
}

# Watch interface up/down events.
protocol device {
  debug { states };
  scan time 2;    # Scan interfaces every 2 seconds
}

protocol direct {
  debug { states };
  interface -"cali*", -"kube-ipvs*", "*"; # Exclude cali* and kube-ipvs* but
                                          # include everything else. In
                                          # IPVS-mode, kube-proxy creates a
                                          # kube-ipvs0 interface. We exclude
                                          # kube-ipvs0 because this interface
                                          # gets an address for every in use
                                          # cluster IP. We use static routes
                                          # for when we legitimately want to
                                          # export cluster IPs.
}

# Template for all BGP clients
template bgp bgp_template {
  debug { states };
  description "Connection to BGP peer";
  local as 64512;
  multihop;
  gateway recursive; # This should be the default, but just in case.
  import all;        # Import all routes, since we don't know what the upstream
                     # topology is and therefore have to trust the ToR/RR.
  export filter calico_export_to_bgp_peers;  # Only want to export routes for workloads.
  source address 10.0.0.1;  # The local address we use for the TCP connection
  add paths on;
  graceful restart;  # See comment in kernel section about graceful restart.
  connect delay time 2;
  connect retry time 5;
  error wait time 5,30;
}

# ------------- Node-to-node mesh -------------

# For peer /host/worker/ip_addr_v4
# Skipping ourselves (10.0.0.1)

# For peer /host/worker02/ip_addr_v4
protocol bgp Mesh_10_0_1_1 from bgp_template {
  neighbor 10.0.1.1 as 64512;
}

# For peer /host/worker03/ip_addr_v4
protocol bgp Mesh_10_0_2_1 from bgp_template {
  neighbor 10.0.2.1 as 64512;
  passive on; # Mesh is unidirectional, peer will connect to us.
}

# ------------- Global peers -------------
# No global peers configured.

# ------------- Node-specific peers -------------
# No node-specific peers configured.

listen bgp port 11179;

Calico image version:

[root@worker ~]# docker images |grep calico
calico/node                                        v3.16.3             f0d3b0d0e32c        8 months ago        164MB
calico/pod2daemon-flexvol                          v3.16.3             a0b97353aa18        8 months ago        22.9MB
calico/cni                                         v3.16.3             fe49caa20c30        8 months ago        133MB

suiny commented 3 years ago

I confirmed that on v3.18.4 the BGP port does change (179 -> 11179) as specified in the YAML file.

nelljerram commented 3 years ago

@suiny Thanks. Reviewing this whole issue, I have a guess at the problem.

[root@master ~]# calicoctl version
Client Version:    v3.16.3
Git commit:        7d066703
Cluster Version:   v3.15.2
Cluster Type:      kubespray,bgp,kubeadm,k8s

Note v3.15.2 - It looks like you haven't actually upgraded the running Calico components on each node. That would explain why the BGP port configuration was not reflected in your bird.cfg (until you added it manually).
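
The same comparison can be scripted. This is a rough sketch that assumes the `calicoctl version` output format quoted in this thread; `version_skew` is a hypothetical helper name, and a mismatch is only a hint that the running components may be older than the client, not proof.

```shell
# Compare the client and cluster versions reported by `calicoctl version`.
# Reads that command's output on stdin. version_skew is a hypothetical helper.
# Exit status: 0 = versions match, 1 = mismatch, 2 = output not understood.
version_skew() {
    awk -F': *' '
        { sub(/^[ \t]+/, "") }    # tolerate indented (pasted) output
        $1 == "Client Version"  { client = $2;  sub(/[ \t\r]+$/, "", client) }
        $1 == "Cluster Version" { cluster = $2; sub(/[ \t\r]+$/, "", cluster) }
        END {
            if (client == "" || cluster == "") {
                print "could not parse versions"
                exit 2
            }
            if (client != cluster) {
                printf "mismatch: client %s, cluster %s\n", client, cluster
                exit 1
            }
            printf "match: %s\n", client
        }
    '
}

# Typical use:
#   calicoctl version | version_skew
```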

suiny commented 3 years ago

That's strange. I upgraded Calico to 3.16.3 (following this link), but the cluster version does not reflect it. However, the images in use are 3.16.3.

[root@master ~]# calicoctl version
Client Version:    v3.16.3
Git commit:        7d066703
Cluster Version:   v3.15.2
Cluster Type:      kubespray,bgp,kubeadm,k8s
[root@master ~]# docker images |grep calico
calico/node                                        v3.16.3             f0d3b0d0e32c        8 months ago        164MB
calico/pod2daemon-flexvol                          v3.16.3             a0b97353aa18        8 months ago        22.9MB
calico/cni                                         v3.16.3             fe49caa20c30        8 months ago        133MB
nelljerram commented 3 years ago

The docker image might not be the same as what the Kubernetes runtime has available, or is actually using. What do you get for kubectl get ds calico-node -n calico-system -o yaml | grep image: ? (Or you might need kube-system instead of calico-system.)
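
A plain `grep image:` on the YAML also matches managedFields entries such as `f:image: {}`. As a small sketch, the filter below keeps only real `image:` values and de-duplicates them; `unique_images` is a hypothetical helper name, and it assumes each image is written as a single unquoted `image: <name>` line.

```shell
# List the distinct container images found in `kubectl ... -o yaml` output,
# ignoring managedFields keys such as `f:image: {}`.
# unique_images is a hypothetical helper that reads YAML on stdin.
unique_images() {
    awk '
        { sub(/^[ \t-]+/, "") }              # drop indentation and list dashes
        $1 == "image:" && NF == 2 { print $2 }
    ' | sort -u
}

# Typical use (namespace may be calico-system or kube-system):
#   kubectl get ds calico-node -n kube-system -o yaml | unique_images
```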

suiny commented 3 years ago

@neiljerram

I used the same image (v3.16.3).

[root@master ~]# kubectl get pods -n kube-system calico-node-89pm4 -oyaml | grep image:
            f:image: {}
            f:image: {}
            f:image: {}
            f:image: {}
    image: calico/node:v3.16.3
    image: calico/cni:v3.16.3
    image: calico/cni:v3.16.3
  - image: calico/pod2daemon-flexvol:v3.16.3
    image: calico/node:v3.16.3
    image: calico/cni:v3.16.3
    image: calico/cni:v3.16.3
    image: calico/pod2daemon-flexvol:v3.16.3

nelljerram commented 3 years ago

Looks good, thanks. Did you change that just now, or has it been like that for a long time?

suiny commented 3 years ago

It's been like that for a long time. The BGP port didn't change when using the v3.16.3 images, and I think that was a bug. I recently created a new cluster with v3.18.4, and there the BGP port changed. It's good that the bug is resolved.