FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

Feature Request: learn default route and on-link route through ND in IPv6 network #8466

Open lihongguang opened 3 years ago

lihongguang commented 3 years ago

I'm looking to deploy frr and IPv6 SLAAC on an IPv6-only network, but I discovered that frr doesn't support this.

There are two related questions about this on the Cisco Community:

  1. https://learningnetwork.cisco.com/s/question/0D53i00000Kt7Fi/ipv6-address-autoconfig-and-its-nd-0-route
  2. https://learningnetwork.cisco.com/s/question/0D53i00000Kt1piCAB/solved-ipv6-router-does-not-get-its-default-route-through-slaac

Some information about IPv6 routes learned through SLAAC:

router1#show ipv6 route
IPv6 Routing Table - default - 3 entries
Codes: C - Connected, L - Local, S - Static, U - Per-user Static route
       B - BGP, HA - Home Agent, MR - Mobile Router, R - RIP
       H - NHRP, I1 - ISIS L1, I2 - ISIS L2, IA - ISIS interarea
       IS - ISIS summary, D - EIGRP, EX - EIGRP external, NM - NEMO
       ND - ND Default, NDp - ND Prefix, DCE - Destination, NDr - Redirect
       O - OSPF Intra, OI - OSPF Inter, OE1 - OSPF ext 1, OE2 - OSPF ext 2
       ON1 - OSPF NSSA ext 1, ON2 - OSPF NSSA ext 2, l - LISP
ND ::/0 [2/0] via FE80::1, GigabitEthernet1/0
NDp 2001::/64 [2/0] via GigabitEthernet1/0, directly connected
L FF00::/8 [0/0] via Null0, receive

taspelund commented 3 years ago

The Linux kernel already supports this natively and FRR handles routes from non-FRR sources (marked with proto 'K' in zebra), so this can already be done without modification to FRR.

For standard linux this is controlled by a few sysctl parameters, specifically accept_ra, accept_ra_defrtr and forwarding (for IPv6):

accept_ra - INTEGER
    Accept Router Advertisements; autoconfigure using them.

    It also determines whether or not to transmit Router
    Solicitations. If and only if the functional setting is to
    accept Router Advertisements, Router Solicitations will be
    transmitted.

    Possible values are:
        0 Do not accept Router Advertisements.
        1 Accept Router Advertisements if forwarding is disabled.
        2 Overrule forwarding behaviour. Accept Router Advertisements
          even if forwarding is enabled.

    Functional default: enabled if local forwarding is disabled.
                disabled if local forwarding is enabled.

accept_ra_defrtr - BOOLEAN
    Learn default router in Router Advertisement.

    Functional default: enabled if accept_ra is enabled.
                disabled if accept_ra is disabled.

forwarding - INTEGER
    Configure interface-specific Host/Router behaviour.

    Note: It is recommended to have the same setting on all
    interfaces; mixed router/host scenarios are rather uncommon.

    Possible values are:
        0 Forwarding disabled
        1 Forwarding enabled

    FALSE (0):

    By default, Host behaviour is assumed.  This means:

    1. IsRouter flag is not set in Neighbour Advertisements.
    2. If accept_ra is TRUE (default), transmit Router
       Solicitations.
    3. If accept_ra is TRUE (default), accept Router
       Advertisements (and do autoconfiguration).
    4. If accept_redirects is TRUE (default), accept Redirects.

    TRUE (1):

    If local forwarding is enabled, Router behaviour is assumed.
    This means exactly the reverse from the above:

    1. IsRouter flag is set in Neighbour Advertisements.
    2. Router Solicitations are not sent unless accept_ra is 2.
    3. Router Advertisements are ignored unless accept_ra is 2.
    4. Redirects are ignored.

Source: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

Since FRR runs on routers, forwarding will generally be enabled on all interfaces. accept_ra is set to 1 by default on most kernels, meaning RAs (and any autoconfiguration from them) will be ignored unless forwarding is disabled.
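
The interaction between forwarding and accept_ra described in the kernel documentation quoted above can be written down in a few lines. This is only a sketch of the documented rules, not kernel code, and the function names are invented here for illustration:

```python
def accepts_ra(forwarding: int, accept_ra: int) -> bool:
    """Return True if the kernel would process Router Advertisements.

    Mirrors the ip-sysctl.txt text quoted above:
      accept_ra=0 -> never
      accept_ra=1 -> only while forwarding is disabled
      accept_ra=2 -> always, even with forwarding enabled
    """
    if accept_ra == 2:
        return True
    if accept_ra == 1:
        return forwarding == 0
    return False


def learns_default_router(forwarding: int, accept_ra: int, accept_ra_defrtr: int) -> bool:
    """A default router can only be learned when RAs are accepted at all."""
    return accepts_ra(forwarding, accept_ra) and accept_ra_defrtr == 1


# Typical router running FRR: forwarding on, accept_ra left at 1.
print(learns_default_router(forwarding=1, accept_ra=1, accept_ra_defrtr=1))  # False
# Same router with accept_ra raised to 2 on the RA-facing interface.
print(learns_default_router(forwarding=1, accept_ra=2, accept_ra_defrtr=1))  # True
```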

So to make this work you want to ensure accept_ra is set to '2' and accept_ra_defrtr is set to '1' on interfaces you want to allow RA defaults to be installed through.

With these settings in place the kernel should install an onlink default route in response to receiving eligible RAs, which you'd see in show ipv6 route with proto K.
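
One way to double-check those settings on a given interface is to read them back from /proc/sys. The following is a minimal sketch: the helper names are made up, enp1s0 is just a placeholder interface name, and the wanted values are simply the ones suggested above for a forwarding router.

```python
from pathlib import Path

# Values suggested above for an interface that should learn RA defaults
# while forwarding stays enabled.
WANTED = {"forwarding": "1", "accept_ra": "2", "accept_ra_defrtr": "1"}


def read_sysctls(iface: str, root: str = "/proc/sys/net/ipv6/conf") -> dict:
    """Read the current per-interface values; missing files map to None."""
    current = {}
    for name in WANTED:
        path = Path(root) / iface / name
        current[name] = path.read_text().strip() if path.exists() else None
    return current


def mismatches(current: dict) -> dict:
    """Return {name: (current, wanted)} for every setting that differs."""
    return {
        name: (current.get(name), wanted)
        for name, wanted in WANTED.items()
        if current.get(name) != wanted
    }


if __name__ == "__main__":
    # Placeholder interface name; replace with the RA-facing interface.
    print(mismatches(read_sysctls("enp1s0")) or "all settings match")
```

An empty result means the interface is configured as described; anything else lists the current value next to the wanted one.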

lihongguang commented 3 years ago

I know the Linux kernel has this capability, but how should the following two scenarios be handled? See https://blog.ipspace.net/2012/11/ipv6-router-advertisements-deep-dive.html

RA messages may include prefix information, and each prefix has L and A flags:

  1. L set and A unset: only generate prefix routes
  2. L unset and A set: only generate an IPv6 address

As far as we know, unless these two cases are handled, FRR can't install the right routes.

taspelund commented 3 years ago

The creation of the on-link routes, SLAAC addresses, and default routes in response to a received RA are all handled by the kernel not FRR. After the addresses/routes have been installed by the kernel, they are visible/usable within FRR as routes with either proto Kernel or proto Connected -- in other words, if the kernel has done its job and setup the NDP address/routes correctly, then FRR will handle them like any other address or connected/default route.

For example:

Here's an Ubuntu 18.04 VM I have up and running that is connected to another VM running FRR via enp1s0. I have enp1s0 configured with accept_ra=2 and autoconf=1 to ensure that the kernel will use SLAAC to generate any addresses/routes needed based on any RAs it receives.

[5:18:02] root@ub18:~                                                                                                                                                    
 # sysctl -a |& grep enp1s0 | grep 'autoconf\|accept_ra '                                                                                                                
net.ipv6.conf.enp1s0.accept_ra = 2                                                                                                                                       
net.ipv6.conf.enp1s0.autoconf = 1                                                                                                                                        

As you can see, there's only a link-local v6 address to begin with:

[5:18:05] root@ub18:~                                                               
 # ip a s enp1s0                                                                    
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000                                                                    
    link/ether 52:54:00:11:e2:e2 brd ff:ff:ff:ff:ff:ff                                                                                                                   
    inet 192.168.122.51/24 brd 192.168.122.255 scope global enp1s0                                                                                                       
       valid_lft forever preferred_lft forever                                                                                                                           
    inet6 fe80::5054:ff:fe11:e2e2/64 scope link                                                                                                                          
       valid_lft forever preferred_lft forever                   

Upon receiving an RA from the other VM, we can see that the kernel has autogenerated a local address, a subnet route matching the prefix in the RA, and a default route via the RA's link-local address:

[5:18:09] root@ub18:~                                                              
 # tcpdump -eni enp1s0 icmp6 -vvv
05:18:19.238499 52:54:00:9a:e2:ca > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 110: (flowlabel 0x21e7c, hlim 255, next-header ICMPv6 (58) payload length: 56) fe80::5054:ff:fe9a:e2ca > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 56
        hop limit 64, Flags [none], pref medium, router lifetime 1800s, reachable time 0ms, retrans timer 0ms                                                            
          prefix info option (3), length 32 (4): 2001:dead:beef:cafe::/64, Flags [onlink, auto], valid time 2592000s, pref. time 604800s
            0x0000:  40c0 0027 8d00 0009 3a80 0000 0000 2001
            0x0010:  dead beef cafe 0000 0000 0000 0000
          source link-address option (1), length 8 (1): 52:54:00:9a:e2:ca           
            0x0000:  5254 009a e2ca

[5:22:50] root@ub18:~
 # ip a s enp1s0
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:e2:e2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.51/24 brd 192.168.122.255 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet6 2001:dead:beef:cafe:5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 2591776sec preferred_lft 604576sec
    inet6 fe80::5054:ff:fe11:e2e2/64 scope link 
       valid_lft forever preferred_lft forever

[5:22:52] root@ub18:~
 # ip -6 route show dev enp1s0
2001:dead:beef:cafe::/64 proto kernel metric 256 expires 2591774sec pref medium
2001:dead:beef:cafe::/64 proto ra metric 1024 pref medium
fe80::/64 proto kernel metric 256 pref medium
default via fe80::5054:ff:fe9a:e2ca proto ra metric 1024 hoplimit 64 pref medium

and in FRR we can see the SLAAC address/routes:

[5:24:02] root@ub18:~
 # vtysh

Hello, this is FRRouting (version 7.7-dev).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

ub18# show int vrf default br
Interface       Status  VRF             Addresses
---------       ------  ---             ---------
dummy20         up      default         
dummy30         up      default         
enp1s0          up      default         192.168.122.51/24
                                        + 2001:dead:beef:cafe:5054:ff:fe11:e2e2/64
enp6s0          up      default         192.168.123.51/24
lo              up      default         1.1.1.1/32
                                        100.64.0.1/32
                                        100.64.0.11/32
                                        100.64.0.111/32
vni10           up      default         
vni20           up      default         
vni30           up      default         

ub18# show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, T - Table, A - Babel,
       D - SHARP, F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* ::/0 [0/1024] via fe80::5054:ff:fe9a:e2ca, enp1s0, 00:05:55
C>* 2001:dead:beef:cafe::/64 is directly connected, enp1s0, 00:05:53
K * 2001:dead:beef:cafe::/64 [0/1024] is directly connected, enp1s0, 00:05:55
C * fe80::/64 is directly connected, enp1s0, 00:06:47
C>* fe80::/64 is directly connected, enp6s0, 23:34:47

and the default route from NDP can be used in FRR like any other route:

ub18# conf t
ub18(config)# router bgp
ub18(config-router)# address-family ipv6 unicast 
ub18(config-router-af)# redistribute kernel 
ub18(config-router-af)# redistribute connected 
ub18(config-router-af)# do show bgp ipv6 unicast
BGP table version is 2, local router ID is 100.64.0.111, vrf id 0
Default local pref 100, local AS 65541
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> ::/0             fe80::5054:ff:fe9a:e2ca
                                          1024         32768 ?
*> 2001:dead:beef:cafe::/64
                    ::                       0         32768 ?

Displayed  2 routes and 2 total paths

So like I mentioned before, the situation you described (learn default/on-link routes through ND) already works using native kernel logic, and the routes can be used/redistributed by routing protocol daemons.

If you're seeing issues with how the kernel is handling certain RA messages/flags, then it would probably make more sense to figure out why those issues are happening than to move the handling of received RAs away from the kernel and into FRR.

If there's a specific situation/issue you're having troubles with, feel free to reach out on the FRRouting slack channel and we can try to give help/ideas.

taspelund commented 3 years ago

For what it's worth, I did also try configuring the other FRR VM to send 2 other prefixes in the RA (2001::/64 with L=1/A=0, 3001::/64 with L=0/A=1) and it seems that the kernel running on 18.04 does what you'd expect based on those flags:

[5:42:13] root@ub18:~                                                               
 # tcpdump -eni enp1s0 icmp6 -vvv                                                                                                                                        
tcpdump: listening on enp1s0, link-type EN10MB (Ethernet), capture size 262144 bytes
05:43:42.752386 52:54:00:9a:e2:ca > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 174: (flowlabel 0x21e7c, hlim 255, next-header ICMPv6 (58) payload length: 120) fe80::5054:ff:fe9a:e2ca > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 120
        hop limit 64, Flags [none], pref medium, router lifetime 1800s, reachable time 0ms, retrans timer 0ms
          prefix info option (3), length 32 (4): 2001:dead:beef:cafe::/64, Flags [onlink, auto], valid time 2592000s, pref. time 604800s
            0x0000:  40c0 0027 8d00 0009 3a80 0000 0000 2001
            0x0010:  dead beef cafe 0000 0000 0000 0000
          prefix info option (3), length 32 (4): 2001::/64, Flags [onlink], valid time 2592000s, pref. time 604800s
            0x0000:  4080 0027 8d00 0009 3a80 0000 0000 2001
            0x0010:  0000 0000 0000 0000 0000 0000 0000
          prefix info option (3), length 32 (4): 3001::/64, Flags [auto], valid time 2592000s, pref. time 604800s
            0x0000:  4040 0027 8d00 0009 3a80 0000 0000 3001
            0x0010:  0000 0000 0000 0000 0000 0000 0000
          source link-address option (1), length 8 (1): 52:54:00:9a:e2:ca
            0x0000:  5254 009a e2ca

[5:46:20] root@ub18:~
 # ip a s enp1s0              
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:e2:e2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.51/24 brd 192.168.122.255 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet6 3001::5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 2591842sec preferred_lft 604642sec
    inet6 2001:dead:beef:cafe:5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 2591842sec preferred_lft 604642sec
    inet6 fe80::5054:ff:fe11:e2e2/64 scope link 
       valid_lft forever preferred_lft forever

[5:46:23] root@ub18:~
 # ip -6 route show dev enp1s0
2001::/64 proto kernel metric 256 expires 2591839sec pref medium
2001::/64 proto ra metric 1024 pref medium
2001:dead:beef:cafe::/64 proto kernel metric 256 expires 2591839sec pref medium
2001:dead:beef:cafe::/64 proto ra metric 1024 pref medium
fe80::/64 proto kernel metric 256 pref medium
default via fe80::5054:ff:fe9a:e2ca proto ra metric 1024 hoplimit 64 pref medium

[5:46:25] root@ub18:~
 # uname -r
4.15.0-140-generic

SLAAC created an address in 3001::/64 but no onlink route, and an onlink route for 2001::/64 with no address.
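
The L and A flags can be read straight out of the hex dumps in the capture above. The sketch below decodes the body of a Prefix Information option (layout per RFC 4861, section 4.6.2) exactly as tcpdump prints it, that is, without the leading type and length bytes. The function names are invented for illustration, and the "actions" are only a summary of the behaviour observed in this thread, not kernel code.

```python
import ipaddress


def decode_prefix_info(hexdump: str) -> dict:
    """Decode a Prefix Information option body (type/length bytes omitted)."""
    data = bytes.fromhex("".join(hexdump.split()))
    flags = data[1]
    return {
        "prefix": f"{ipaddress.IPv6Address(data[14:30])}/{data[0]}",
        "on_link": bool(flags & 0x80),     # L flag
        "autonomous": bool(flags & 0x40),  # A flag
        "valid_lifetime": int.from_bytes(data[2:6], "big"),
        "preferred_lifetime": int.from_bytes(data[6:10], "big"),
    }


def expected_actions(pio: dict) -> list:
    """Summarise what a host is expected to do with the prefix."""
    actions = []
    if pio["on_link"]:
        actions.append("install on-link route")
    if pio["autonomous"]:
        actions.append("generate SLAAC address")
    return actions


# The three option bodies from the tcpdump output above.
CAPTURED = [
    "40c0 0027 8d00 0009 3a80 0000 0000 2001 dead beef cafe 0000 0000 0000 0000",
    "4080 0027 8d00 0009 3a80 0000 0000 2001 0000 0000 0000 0000 0000 0000 0000",
    "4040 0027 8d00 0009 3a80 0000 0000 3001 0000 0000 0000 0000 0000 0000 0000",
]

for dump in CAPTURED:
    pio = decode_prefix_info(dump)
    print(pio["prefix"], expected_actions(pio))
# 2001:dead:beef:cafe::/64 ['install on-link route', 'generate SLAAC address']
# 2001::/64 ['install on-link route']
# 3001::/64 ['generate SLAAC address']
```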

lihongguang commented 3 years ago

Could you attach the route information from the zebra RIB through vtysh, especially for the case where the RA carries the prefix 3001::/64 with L=0/A=1? I remember there was something wrong in that situation: a connected route (3001::/64) was installed in the zebra RIB.

taspelund commented 3 years ago

[14:22:52] root@ub18:~
 # ip a s enp1s0              
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:e2:e2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.51/24 brd 192.168.122.255 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet6 3001::5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 2591978sec preferred_lft 604778sec
    inet6 2001:dead:beef:cafe:5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 2591978sec preferred_lft 604778sec
    inet6 fe80::5054:ff:fe11:e2e2/64 scope link 
       valid_lft forever preferred_lft forever

[14:22:55] root@ub18:~
 # ip -6 route show dev enp1s0
2001::/64 proto kernel metric 256 expires 2591974sec pref medium
2001::/64 proto ra metric 1024 pref medium
2001:dead:beef:cafe::/64 proto kernel metric 256 expires 2591974sec pref medium
2001:dead:beef:cafe::/64 proto ra metric 1024 pref medium
fe80::/64 proto kernel metric 256 pref medium
default via fe80::5054:ff:fe9a:e2ca proto ra metric 1024 hoplimit 64 pref medium

[14:22:58] root@ub18:~
 # vtysh -c 'show ipv6 route'
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, T - Table, A - Babel,
       D - SHARP, F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* ::/0 [0/1024] via fe80::5054:ff:fe9a:e2ca, enp1s0, 00:00:16
K>* 2001::/64 [0/1024] is directly connected, enp1s0, 00:00:16
K * 2001:dead:beef:cafe::/64 [0/1024] is directly connected, enp1s0, 00:00:16
C>* 2001:dead:beef:cafe::/64 is directly connected, enp1s0, 00:00:16
C>* 3001::/64 is directly connected, enp1s0, 00:00:16
C * fe80::/64 is directly connected, enp6s0, 00:00:16
C>* fe80::/64 is directly connected, enp1s0, 00:00:16

lihongguang commented 3 years ago

C>* 3001::/64 is directly connected, enp1s0, 00:00:16

This connected route shouldn't be in the zebra RIB, and it doesn't actually exist in the kernel.

taspelund commented 3 years ago

Sure, but that's a separate discussion from what you filed this issue for.

Zebra derives the "connected" routes using the netmask/prefix length of the addresses on the interface, i.e. inet6 3001::5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute -- the connected /64 route comes from the fact that the prefix length for the local address is 64. Based on the current logic, it doesn't come from the presence of a route in the kernel.
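
To illustrate that derivation: the connected prefix follows from the address and its prefix length alone, with no kernel route involved. This is a sketch using Python's standard ipaddress module, not zebra's actual code, and connected_prefix is a name made up for the example.

```python
import ipaddress


def connected_prefix(address_with_length: str) -> str:
    """Return the network implied by an interface address and its prefix length."""
    return str(ipaddress.ip_interface(address_with_length).network)


# The A-flag-only SLAAC address from the output above still implies a /64,
# even though the kernel installed no on-link route for it.
print(connected_prefix("3001::5054:ff:fe11:e2e2/64"))                # 3001::/64
print(connected_prefix("2001:dead:beef:cafe:5054:ff:fe11:e2e2/64"))  # 2001:dead:beef:cafe::/64
```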

If zebra were to rely solely on routes with an egress interface and no next-hop (effectively onlink) then any routes without a next-hop would be considered Connected, which is inaccurate. It may be possible to update the way zebra detects "connected" routes, but I doubt that would be a trivial change given it has the potential to impact all IPv6 connected/onlink routes.

As it sits now, the topic you filed this ticket for is a non-issue since default + on-link routes are learned from NDP and are present in the RIB.

If you want to open up a separate discussion around updating the way Connected routes are detected that's fine, but I think that should be its own discussion and warrants its own github issue.

donaldsharp commented 3 years ago

no what we are doing is correct imo. There is no need for a discussion. @taspelund is correctly describing FRR behavior

lihongguang commented 3 years ago

Thank you for your answer. Sure, but I do hope that FRR can deal with this problem just like Cisco/Juniper/Huawei routers do.

There are two related questions about this on the Cisco Community:

  1. https://learningnetwork.cisco.com/s/question/0D53i00000Kt7Fi/ipv6-address-autoconfig-and-its-nd-0-route
  2. https://learningnetwork.cisco.com/s/question/0D53i00000Kt1piCAB/solved-ipv6-router-does-not-get-its-default-route-through-slaac

lihongguang commented 3 years ago

no what we are doing is correct imo. There is no need for a discussion. @taspelund is correctly describing FRR behavior

I understand what @taspelund describes, but I don't agree with you that "what we are doing is correct". I think whether FRR's behavior is right or wrong depends directly on the scenario in which the network equipment is running.

taspelund commented 3 years ago

Regarding the topic in the subject line of this issue, the Cisco links don't really explain what you're requesting. In the examples from both community discussions you linked, the "default-route" keyword is effectively equivalent to ensuring you've configured the sysctl parameters correctly on the interface receiving RAs (forwarding=1, accept_ra=2, accept_ra_defrtr=1, autoconf=1), which we've seen above works just fine. If that doesn't address your question, then it sounds like what you want is for zebra to change the way it displays these routes: something to indicate that a kernel route came from NDP (maybe a new "proto"?), as opposed to the routes appearing with proto connected/kernel like they do today. Is that correct?

As for the Connected route for RA prefixes with A-flag only, I agree that today's behavior doesn't line up 1:1 with what you'd intuitively expect, but I'm not understanding what you're functionally unable to accomplish with FRR the way it operates today.

Can you give a real world example of when you'd use SLAAC to autogenerate an address but not an onlink route? Or a real world example where the connected route in the RIB taken from the prefix length of the SLAAC address creates an issue that cannot be worked around using a route-map or similar config in FRR?

It's hard to say that FRR is doing the wrong thing without understanding specifically what problems are being caused by the existing behavior.