projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

`gateway recursive` option in bird.cfg breaks as-per-rack topology #1515

Closed tomas-mazak closed 6 years ago

tomas-mazak commented 6 years ago

Expected Behavior

[Image: topology_example]

This is my test cluster topology (AS-per-rack, no separate route reflectors). When peering is set up between each node and its gateway (iBGP, same local and neighbour AS), routing is expected to work across the cluster.

Current Behavior

Currently, on the nodes, routes whose next hop is within the local subnet are fine, while routes propagated from the spine are in the unreachable state.

Possible Solution

I believe this is due to the `gateway recursive` setting in bird.cfg. In my setup, the problem was resolved by changing `multihop` to `direct` and `gateway recursive` to `gateway direct`.
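A minimal sketch of the change described above, assuming BIRD 1.x syntax. The protocol name, peer address and AS number are placeholders chosen for illustration; they are not values from this cluster or from the Calico template:

```
# Hypothetical node-side peering with the ToR; all names and numbers are placeholders.
protocol bgp tor_leaf {
  local as 65001;
  neighbor 10.0.1.1 as 65001;   # same AS as the ToR, so this is an iBGP session
  direct;                       # peer is on the local subnet (replaces "multihop")
  gateway direct;               # replaces "gateway recursive"
  import all;
  export all;
}
```

The import and export policies here are deliberately permissive to keep the sketch short; a real Calico node would keep its own export filter.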

I have noticed that these settings are hard-coded in the template. Would it be possible to make it configurable per peer?

Context

I am trying to set up an AS-per-rack topology, where the nodes peer with the ToR switch directly, without dedicated route reflectors.

Your Environment

fasaxc commented 6 years ago

Hmm, I wonder if you've got a slightly different router config to the one we're expecting for that topology. Is your ToR sending a default route down to the host?

tomas-mazak commented 6 years ago

It is, and I use the ToR as a route reflector. Since this is a virtual setup, all the routers are BIRD instances, so if you are interested in configs or `sh proto`/`sh route` listings, I can provide them.

neiljerram commented 6 years ago

@tomas-mazak You say you use the ToRs (leaf-a and leaf-b, I assume) as route reflectors. But don't they need to be in the data path, and so full (non-RR) BGP speakers?

tomas-mazak commented 6 years ago

@neiljerram that's very true. Actually, they do both. In order to avoid node mesh peering within the rack, I only peer the nodes with the ToR (leaf).

I would like the leaf to both reflect routes within the rack (the RR function) and propagate routes from the spine. With the `rr client` setting in the leaf config, and because the sessions are iBGP (all the same AS), the leaf advertises all routes to all nodes without changing the next hop.
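A minimal sketch of the leaf-side peering described here, again assuming BIRD 1.x syntax. The protocol name, node address and AS number are hypothetical, and one such block would be needed per node in the rack:

```
# Hypothetical leaf (ToR) peering with one node; all names and numbers are placeholders.
protocol bgp node1 {
  local as 65001;
  neighbor 10.0.1.11 as 65001;  # same AS as the leaf, so this is an iBGP session
  direct;                       # node is on the leaf's local subnet
  rr client;                    # treat this node as a route reflector client
  import all;
  export all;
}
```

Because no `next hop self` option is set in this sketch, reflected routes keep their original next hop, which matches the behaviour described above.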

Then, with `gateway direct` in the node config, the node installs a route as-is if its next hop is directly reachable; otherwise it replaces the next hop with the advertising router (the ToR), which is exactly what I want to achieve.

Perhaps I could write a filter on the leaf that changes the next hop to self only for routes from the spine, but that seems more complex and error-prone to me (and I have not yet been able to figure out exactly how to do it).

Sorry for my poor terminology, I am new to routing protocols.

caseydavenport commented 6 years ago

> Actually, they do both. In order to avoid node mesh peering within the rack, I only peer the nodes with the ToR (leaf).

Just passing through, but perhaps part of the solution here is to not put your Calico nodes in the same AS as their ToR/leaf, so that the ToR doesn't need to function as an RR? As @neiljerram suggested, I think setting `rr client` on the ToR might be an issue here.