jech / babeld

The Babel routing daemon
http://www.irif.fr/~jch/software/babel/
MIT License

Added id to input filter #113

Closed: Dando-Real-ITA closed this pull request 7 months ago

Dando-Real-ITA commented 9 months ago

This allows pref-src to be set by router id, for example to get source routing for IPv4:

# ISP1
install ip 0.0.0.0/0 eq 0 id 0e:6a:a3:ff:fe:a7:00:00 pref-src 66.199.5.162
# ISP2
install ip 0.0.0.0/0 eq 0 id 0e:08:1c:ff:fe:dd:00:00 pref-src 12.144.66.186
Dando-Real-ITA commented 8 months ago

Added more code that lets the configuration socket change the metric of a redistribute filter, which makes it possible to retract or reannounce a route.

Example: Route defined in babeld.conf as redistribute ip 0.0.0.0/0 eq 0 metric 1000

Retract:

echo "redistribute ip 0.0.0.0/0 eq 0 deny" | socat - UNIX-CONNECT:/var/run/babeld.sock 1>&2 > /dev/null
echo "check_xroutes" | socat - UNIX-CONNECT:/var/run/babeld.sock 1>&2 > /dev/null

Reannounce:

echo "redistribute ip 0.0.0.0/0 eq 0 metric 1000" | socat - UNIX-CONNECT:/var/run/babeld.sock
echo "check_xroutes" | socat - UNIX-CONNECT:/var/run/babeld.sock 1>&2 > /dev/null
jech commented 8 months ago

There appear to be multiple (related?) functionalities in this pull request, and I'm not quite sure what problem each of them is solving. I'd be grateful if you could squash related functionality into a single patch. For example, the two newpref_src patches should be a single commit.

Here's a first review (just having a quick look):

I haven't reviewed the remaining patches, since I don't understand what problem you're trying to solve.

Dando-Real-ITA commented 8 months ago

Ok I will on Monday

The context is: 2 edge routers connected to 2 ISPs, using the providers' public IPs and redistributing the default route. Clients have IPs from both providers (2 IPv4 and 2 IPv6).

The goal is to keep connectivity even if one ISP connection fails.

Problem 1: The default route for each ISP needs to be used with the correct source IP. This is the problem described in the source specific prefix document, with 2 caveats:

Problem 2: Edge routers should be able to retract default routes if they detect that the upstream connection is failing. Simply deleting the default route did not work, and it is also not ideal for the edge router, which needs to check when connectivity is back.

I am not sure it is the best way, but this is how I made the use case I was modeling work:

Solution 1a: Distribute the default route normally from the edge routers, then let the clients install it with the correct source IP. To differentiate the received default routes, id was added to the install filter.

Solution 1b: When a default route was updated from isp1 to isp2, the src field of the route was not updated. I found that the original ‘pref_src’ field was always used, so I added ‘newpref_src’ to support switching pref_src.

Solution 2a: Enable metric changes on live redistribution filters. This way, a route can be retracted or allowed without touching the kernel route.

Solution 2b: Enable a special trigger of kernel_dump. Normally, if a redistribute filter has metric deny, the route is just ignored and no further action is performed. With a manual retraction, however, it was necessary not to skip normal processing, so that babeld notifies the route change. Thus a new flag has been created for check_xroutes, which is set only when the command is issued manually, and it allows an infinite redistribute metric to be processed instead of being silently ignored.

jech commented 8 months ago

Edge routers should be able to retract default routes if they detect the upstream connection is failing. Simply deleting the default route did not work, and also is not ideal for the edge router that needs to check when connectivity is back

This one is actually easy, and requires no new mechanism. On each of your edge routers, install a fake default route with low priority:

ip route add 0.0.0.0/0 dev lo metric 65534 proto 43

Then redistribute this route, but don't redistribute the real default route:

redistribute ip 0.0.0.0/0 le 0 proto 43 allow
redistribute ip 0.0.0.0/0 le 0 deny

Now when connectivity fails, remove the redistributed route:

ip route del 0.0.0.0/0 dev lo metric 65534 proto 43

and add it back when connectivity resumes. You may use babel-pinger in order to handle the fake route automatically.
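The add/remove cycle described above can be sketched as a small watchdog. This is only an illustration and not how babel-pinger is implemented: the probe address, the ping options, the DRY_RUN switch and the function names are all assumptions. It uses `ip route replace` rather than `ip route add` so that repeating the "up" step is harmless.

```shell
#!/bin/sh
# Sketch: keep the fake default route in sync with upstream reachability.
# Assumptions (not from the thread): PROBE_ADDR, the ping options, DRY_RUN
# and all function names are hypothetical.

PROBE_ADDR="${PROBE_ADDR:-192.0.2.1}"   # hypothetical upstream address to probe
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the ip commands

# The fake route from the comment above: low priority, tagged with proto 43.
FAKE_ROUTE="0.0.0.0/0 dev lo metric 65534 proto 43"

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sync_route up|down: install the fake route when the upstream is up,
# delete it when the upstream is down.
sync_route() {
    case "$1" in
        up)   run ip route replace $FAKE_ROUTE ;;   # unquoted on purpose: split into words
        down) run ip route del $FAKE_ROUTE ;;
        *)    echo "usage: sync_route up|down" >&2; return 2 ;;
    esac
}

# probe: succeeds if the upstream answers one ping within 2 seconds.
probe() {
    ping -c 1 -W 2 "$PROBE_ADDR" >/dev/null 2>&1
}

# check_once: one iteration of the watchdog.
check_once() {
    if probe; then sync_route up; else sync_route down; fi
}
```

To use it for real, one would run something like `while :; do check_once; sleep 10; done` as root with DRY_RUN=0. Deleting a route that is already gone makes `ip` report an error, which a production script would want to silence or check for.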

Dando-Real-ITA commented 8 months ago

Interesting, I'll test the default route + babel-pinger and clean up the commits for the id and newpref_src. Do you also have a solution for propagating the source-specific return routes? As of now I am defining them manually in the client like this:

      routes:
        # Default routes are learned from babel
        # Tables for return traffic with correct source address set to correct ISP
        # ISP1, table 1
        - from: "2001:4870:24a:500:2:0:0:1d03"
          to: default
          via: "2001:4870:24a:500:2:0:0:1d01"
          table: 1
        # ISP2, table 2
        - from: "2001:1890:1f76:4400:2:0:0:1d03"
          to: default
          via: "2001:1890:1f76:4400:2:0:0:1d00"
          table: 2
      routing-policy:
        # Select table for return traffic based on the source IP of the response
        # ISP 1
        - from: "2001:4870:24a:500:2:0:0:1d03"
          table: 1
        # ISP 2
        - from: "2001:1890:1f76:4400:2:0:0:1d03"
          table: 2
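For readers who do not use netplan, the configuration above corresponds roughly to the iproute2 commands printed by the sketch below. It assumes that netplan's route-level `from` key maps to the route's preferred source (`src`) and that each `routing-policy` entry maps to one `ip -6 rule`; the function name is hypothetical. The function only prints the commands, so nothing is changed unless its output is piped to a root shell.

```shell
#!/bin/sh
# Sketch: print iproute2 commands roughly equivalent to the netplan snippet.
# Assumption: netplan's route "from" corresponds to "src" (preferred source).

# emit_return_path SRC GATEWAY TABLE (hypothetical helper)
emit_return_path() {
    src=$1; gw=$2; table=$3
    # Default route for this ISP in its own routing table
    echo "ip -6 route replace default via $gw src $src table $table"
    # Select that table for traffic sourced from this ISP's address
    echo "ip -6 rule add from $src table $table"
}

# ISP1, table 1
emit_return_path 2001:4870:24a:500:2:0:0:1d03 2001:4870:24a:500:2:0:0:1d01 1
# ISP2, table 2
emit_return_path 2001:1890:1f76:4400:2:0:0:1d03 2001:1890:1f76:4400:2:0:0:1d00 2
```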
jech commented 8 months ago

Do you have also a solution for propagating the source specific return routes?

No, I don't think I have. Either we reinstate source-specific routing for IPv4 (which I removed because it was complex code and I thought nobody was using it), or we implement your solution. I'll wait until you've had the opportunity to clean up your patches, and think about it some more.

Let me know if you want me to set up a git repository for babel-pinger, it could do with some tweaking (in particular, it's almost completely undocumented).

As to check_xroute, I really think the xroute check should be triggered automatically; I'd rather not expose this implementation detail in the UI.

Dando-Real-ITA commented 8 months ago

For now I have reduced the patches to only the install filter id and the newpref_src fix (3 commits). I'll think about the correct way to trigger an update: with the lo route trick, SIGUSR2 on the pid should be enough, or I could add a simplified command that updates the timers, as you said.

I had a quick look at babel-pinger and noticed that it uses only IPv4 addresses and routes. It would make sense for it to also trigger the babel update after a route change.

I'll think more about the patch that changes the redistribute metric live. I recognize it was more pervasive, and I don't know whether it has a use (what comes to mind is some form of dynamic multipath when using multiple upstream routers with prefix aggregation, which is the next thing I want to simulate).

For the source-specific install, I have a simple but hacky idea to test for default routes.

jech commented 8 months ago

At first sight, this looks very good. It's late now, so I may be saying something stupid, but shouldn't kernel_route take a prefsrc in all cases, not just in the change case?

Dando-Real-ITA commented 8 months ago

For add and flush there is only one prefsrc anyway, and only one set of route parameters.

For change_route_metric, the route is recreated but the prefsrc does not change, so by default newprefsrc = prefsrc.

In kernel_route the field is then passed separately to ensure that the recursive calls on modify work correctly in all cases, removing the old route with the old prefsrc and creating the new one with the newprefsrc.
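In iproute2 terms, the modify case described above amounts to the two steps printed by the sketch below. It illustrates the delete-then-add idea only and is not the patch's C code; the function name and the gateway addresses (documentation ranges) are hypothetical, while the two source addresses are the pref-src values from the first comment. As with change_route_metric, the new source defaults to the old one when it is omitted.

```shell
#!/bin/sh
# Sketch: a route "modify" expressed as flush-old / add-new, each step with
# its own preferred source. Prints the commands instead of running them.

# emit_modify PREFIX OLD_VIA OLD_SRC NEW_VIA [NEW_SRC] (hypothetical helper)
emit_modify() {
    prefix=$1; old_via=$2; old_src=$3; new_via=$4
    new_src=${5:-$old_src}   # default: newprefsrc = prefsrc
    # Remove the old route, identified by the old preferred source
    echo "ip route del $prefix via $old_via src $old_src"
    # Install the new route with the new preferred source
    echo "ip route add $prefix via $new_via src $new_src"
}

# Switch of the default route from ISP1 to ISP2 (gateways are made up)
emit_modify 0.0.0.0/0 192.0.2.1 66.199.5.162 198.51.100.1 12.144.66.186
```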

Dando-Real-ITA commented 7 months ago

This pull request is part of https://github.com/jech/babeld/pull/114. Master branch has a more pervasive change to support dual default route distribution when generated by different routers.