osrg / gobgp

BGP implemented in the Go Programming Language
https://osrg.github.io/gobgp/
Apache License 2.0
3.64k stars 694 forks source link

deletePath not sending withdraw when connected to RR #1804

Closed astone282 closed 6 years ago

astone282 commented 6 years ago

Hello,

Doing some experimentation. I have gobgp configured (54.54.54.54) to talk to a vendor route reflector (35.121.146.13). Using the gobgp CLI or Java GRPC, when I addPath (ipv4), goBgp correctly sends the route and it is received on the RR.

The RR then echo's back the route with the RR cluster ID back to gobgp. I can see this route come in via monitor and also in the global rib dump.

When I try to deletePath, via CLI or Java GRPC, i do NOT see the [DELROUTE] log message. The entry is deleted from the gobgp but a withdraw message is not received on the RR. Therefore the route is not withdrawn.

Note: if i disable route-reflector on the other node, gobgp can successful add and withdraw the route. My guess is gobgp is not withdrawing the route it generated due to a collision with the reflected route

Here's a step through of what's happening:


# Add the route in gobgp

# ./gobgp global rib -a ipv4 add 5.5.5.5/24 nexthop 10.10.10.10
# gobgp monitor outputs the new route :

[ROUTE] 5.5.5.0/24 via 10.10.10.10 aspath [] attrs [{Origin: ?}]
# RR receives the update

116 2018/08/15 17:50:13.443 UTC MINOR: DEBUG #2001 Base Peer 1: 172.10.54.2
"Peer 1: 172.10.54.2: UPDATE
Peer 1: 172.10.54.2 - Received BGP UPDATE:
    Withdrawn Length = 0
    Total Path Attr Length = 21
    Flag: 0x40 Type: 1 Len: 1 Origin: 2
    Flag: 0x40 Type: 2 Len: 0 AS Path:
    Flag: 0x40 Type: 3 Len: 4 Nexthop: 10.10.10.10
    Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
    NLRI: Length = 4
        5.5.5.0/24
"
# RR sends a bgp update with the cluster is

117 2018/08/15 17:50:36.951 UTC MINOR: DEBUG #2001 Base Peer 1: 172.10.54.2
"Peer 1: 172.10.54.2: UPDATE
Peer 1: 172.10.54.2 - Send BGP UPDATE:
    Withdrawn Length = 0
    Total Path Attr Length = 35
    Flag: 0x40 Type: 1 Len: 1 Origin: 2
    Flag: 0x40 Type: 2 Len: 0 AS Path:
    Flag: 0x40 Type: 3 Len: 4 Nexthop: 10.10.10.10
    Flag: 0x40 Type: 5 Len: 4 Local Preference: 100
    Flag: 0x80 Type: 9 Len: 4 Originator ID: 54.54.54.54
    Flag: 0x80 Type: 10 Len: 4 Cluster ID:
        35.121.146.13
    NLRI: Length = 4
        5.5.5.0/24
"
# note: gobgp monitor does not have any new output

# gobgp rib output sees both paths
# ./gobgp global rib
   Network              Next Hop             AS_PATH              Age        Attrs
*> 5.5.5.0/24           10.10.10.10                               00:02:51   [{Origin: ?}]
*  5.5.5.0/24           10.10.10.10                               00:02:27   [{Origin: ?} {LocalPref: 100} {Originator: 54.54.54.54} {ClusterList: [35.121.146.13]}]
# Sending a delete, log on the RR is empty. Log on gobgp does not show [DELROUTE] but shows [ROUTE] of the entry from the RR

./gobgp global rib -a ipv4 del 5.5.5.5/24 nexthop 10.10.10.10

[ROUTE] 5.5.5.0/24 via 10.10.10.10 aspath [] attrs [{Origin: ?} {LocalPref: 100} {Originator: 54.54.54.54} {ClusterList: [35.121.146.13]}]
# Attempting to delete the route again is met with:

{"Key":"5.5.5.0/24","Path":{"nlri":{"prefix":"5.5.5.0/24"},"attrs":[{"type":1,"value":2},{"type":3,"nexthop":"10.10.10.10"}],"age":1534355705,"withdrawal":true,"validation":"none"},"Topic":"Table","level":"warning","msg":"No matching path for withdraw found, may be path was not installed into table","time":"2018-08-15T13:55:05-04:00"}
astone282 commented 6 years ago

Forgot to mention this is with v1.33.

Thanks!

yicwang commented 6 years ago

Just want to add more comments on this, and I am hitting about the same behavior. I guess the reason is, when GoBGP receives the echo from RR, it programs it RIB anyway even it should be ignored. The moment when you delete the route, the "echo route" immediately becomes the best, so GoBGP won't consider a route is "deleted", as the "echo route" is the best route now.

According to RFC4456, section 8: "A router that recognizes the ORIGINATOR_ID attribute SHOULD ignore a route received with its BGP Identifier as the ORIGINATOR_ID." Seems GoBGP is doing that.

Interestingly, I do see this part of code in the source code, when GoBGP selects the best route. However, my point is, this route should never be programmed into RIB, draft code fore review:

1807

fujita commented 6 years ago

@yicwang thanks for the comment. Other BGP implementations works in the same way (i.e., echo routes never be inserted into the RIB)?

yicwang commented 6 years ago

@fujita To be honest, I am not a BGP expert. We are using GoBGP EVPN to handle our VxLAN control plane protocol, and testing against a Cisco ASR9K. Just did a quick test, when GoBGP advertised the route with OriginatorID as the ASR9K's BGP update-source interface IP, I am not able to see it from the ASR9K rib table. Personally I believe it should be safe to not program the route into RIB.

PS: Just update the PR to address your review comments.

fujita commented 6 years ago

@yicwang pushed, thanks a lot.

fujita commented 6 years ago

@astone282 before investigating a bug, RR is expected to echo's back the route? if so, GoBGP with RR setup doesn't so you reported another bug.

astone282 commented 6 years ago

Hi @fujita @yicwang thanks for looking at this + the additional comments + PR.

@fujita I'm not a BGP expert either, but this is what the RFC 4456 section 6 describes: :

" When an RR receives a route from an IBGP peer, it selects the best path based on its path selection rule. After the best path is selected, it must do the following depending on the type of peer it is receiving the best path from

  1) A route from a Non-Client IBGP peer:

     Reflect to all the Clients.

  2) A route from a Client peer:

     Reflect to all the Non-Client peers and also to the Client
     peers.  (Hence the Client peers are not required to be fully
     meshed.)

"

Therefore I would interpret "All the clients" to include peer which originated the route. To prevent the loops, ORIGINATOR_ID should contain the original advertiser within that AS and CLUSTER_LIST must contain / be pre-pended with the RR cluster ID.

Hope this helps Andrew

yicwang commented 6 years ago

@fujita I think what aston282 originally reported maybe still is a bug, but it should no longer happen after my PR. I did the same test before on my EVPN use case. With the route reflected from RR, GoBGP RIB will have duplicated entries for the same information. One is the originated from local with the original information, one is echoed from RR with the same information, but additional attribute of "OriginatorID==GoBGP IP, and ClusterList==[RR IP]".

The bug I believe astone282 complains is, when we delete the original route, GoBGP never sends out the route withdraw message out, as it believes the echoed route is going to be the best now. I confirmed that too by doing tcpdump, and that was how I found out the problem and proposed the fix. So @astone282, I guess we can close this issue, as you shouldn't see the behavior any more with my PR. Try and see?

Correct me when wrong.

Regards, Yichen

astone282 commented 6 years ago

Yep, agreed @yicwang . Thanks! will grab master sometime next week, build and give it a go. Thanks again.

ffilippopoulos commented 6 years ago

@yicwang I have a similar case that is not covered by this PR. In my scenario I have 2 (or more) route reflectors that have the same clients but with with different cluster ids. These 2 rrs are connected (iBGP peers) so I end up having 2 identical entries for every route in the RIB. When one of the original routes gets deleted the route reflectors never exchange any withdraw message between each other and still maintain 1 entry for the route that is now dead. More details here: https://github.com/osrg/gobgp/issues/1842. I think that this is a valid bgp setup. Any suggestions? The PR linked above does not solve this one since it prevents echo messages from being added into the RIB which is different from my case.

yicwang commented 6 years ago

@ffilippopoulos I believe the reason why you need 3 GoBGP instances are for HA purpose. Can you try to set all three 3 GoBGP to have the same Router ID? They are essentially having the same information, which should logically make sense to do that... If they are have the same Router ID, the issue should go away with this PR.

ffilippopoulos commented 6 years ago

@yicwang yes that was made for HA purposes. I believe that if they all share the same id and after your PR, then they will reject all routes exchanged between them. So there is no point peering them in the first place.

yicwang commented 6 years ago

@ffilippopoulos Please correct me when wrong. Say you have 3 GoBGP instances running as RR, and 5 bird instances. By creating a mesh, all your 5 bird instances are connecting to 3 GoBGP instances, so total of 15 connections are there. Note that you don't really need to connect 3 GoBGP instances in a mesh, as they are all going to have the same information from 5 BIRD instances, and they are in Acitve-Active-Active HA already, right?

ffilippopoulos commented 6 years ago

yes, so my setup is exactly how you described it, 3 GoBGP instances (lets say RR1, RR2, RR3) and 5 bird instances (lets say B1, B2, B3, B4, B5). Every RR is connected with all bird instances so 15 connections in total. The reason for trying to peer RRs between each other is for example if the connections RR1-B1, RR2-B5 and RR3-B5 are all down there is no way for B1 to find B5 routes (the only way will be B5->RR2/RR3->RR1->B1). If all RRs have the same id and reject incoming routes based on that then there is no way to create a resilient setup like the above. BTW, thanks for all the help @yicwang