Open lihongguang opened 3 years ago
The Linux kernel already supports this natively and FRR handles routes from non-FRR sources (marked with proto 'K' in zebra), so this can already be done without modification to FRR.
For standard Linux this is controlled by a few sysctl parameters, specifically accept_ra, accept_ra_defrtr and forwarding (for IPv6):
accept_ra - INTEGER
Accept Router Advertisements; autoconfigure using them.
It also determines whether or not to transmit Router
Solicitations. If and only if the functional setting is to
accept Router Advertisements, Router Solicitations will be
transmitted.
Possible values are:
0 Do not accept Router Advertisements.
1 Accept Router Advertisements if forwarding is disabled.
2 Overrule forwarding behaviour. Accept Router Advertisements
even if forwarding is enabled.
Functional default: enabled if local forwarding is disabled.
disabled if local forwarding is enabled.
accept_ra_defrtr - BOOLEAN
Learn default router in Router Advertisement.
Functional default: enabled if accept_ra is enabled.
disabled if accept_ra is disabled.
forwarding - INTEGER
Configure interface-specific Host/Router behaviour.
Note: It is recommended to have the same setting on all
interfaces; mixed router/host scenarios are rather uncommon.
Possible values are:
0 Forwarding disabled
1 Forwarding enabled
FALSE (0):
By default, Host behaviour is assumed. This means:
1. IsRouter flag is not set in Neighbour Advertisements.
2. If accept_ra is TRUE (default), transmit Router
Solicitations.
3. If accept_ra is TRUE (default), accept Router
Advertisements (and do autoconfiguration).
4. If accept_redirects is TRUE (default), accept Redirects.
TRUE (1):
If local forwarding is enabled, Router behaviour is assumed.
This means exactly the reverse from the above:
1. IsRouter flag is set in Neighbour Advertisements.
2. Router Solicitations are not sent unless accept_ra is 2.
3. Router Advertisements are ignored unless accept_ra is 2.
4. Redirects are ignored.
Source: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
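As a rough sketch of how the accept_ra and forwarding settings quoted above interact (this only illustrates the documented rules, it is not kernel code, and the function name is made up for this example):

```python
def accepts_router_advertisements(accept_ra: int, forwarding: int) -> bool:
    """Return True if RAs would be processed on an interface.

    Follows the accept_ra description quoted from ip-sysctl.txt:
    0 = never, 1 = only while forwarding is disabled, 2 = always.
    """
    if accept_ra == 0:
        return False
    if accept_ra == 1:
        return forwarding == 0
    if accept_ra == 2:
        return True
    raise ValueError(f"unexpected accept_ra value: {accept_ra}")


# Print the full truth table for the documented values.
for accept_ra in (0, 1, 2):
    for forwarding in (0, 1):
        result = accepts_router_advertisements(accept_ra, forwarding)
        print(f"accept_ra={accept_ra} forwarding={forwarding} -> {result}")
```

The row that matters for a router is accept_ra=1 with forwarding=1, which comes out False; only accept_ra=2 gives True when forwarding is enabled.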
Since FRR runs on routers, forwarding will generally be enabled on all interfaces. accept_ra is generally set to 1 by default for most kernels, meaning RAs (and any autoconfiguration based on them) will be ignored unless forwarding is disabled.
So to make this work you want to ensure accept_ra is set to '2' and accept_ra_defrtr is set to '1' on interfaces you want to allow RA defaults to be installed through.
With these settings in place the kernel should install an onlink default route in response to receiving eligible RAs, which you'd see in show ipv6 route with proto K.
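A minimal sketch of the settings described above, rendered as sysctl.conf-style lines. The helper name is hypothetical; the sysctl keys and values are the ones named in this thread, and enp1s0 is just the interface used in the examples here.

```python
def render_ra_sysctls(iface: str) -> str:
    """Render sysctl.conf-style lines that let a forwarding box accept RA defaults."""
    settings = {
        "accept_ra": 2,         # accept RAs even though forwarding is enabled
        "accept_ra_defrtr": 1,  # learn the default router from the RA
    }
    return "\n".join(
        f"net.ipv6.conf.{iface}.{key} = {value}" for key, value in settings.items()
    )


print(render_ra_sysctls("enp1s0"))
```

The printed lines could be saved to a file and loaded with sysctl -p <file>. Where that file lives and how it is applied at boot depends on the distribution.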
I know the Linux kernel has this capability, but how should the following two scenarios be handled? https://blog.ipspace.net/2012/11/ipv6-router-advertisements-deep-dive.html
RA messages may include prefix information. Each prefix has L and A flags:
1. L set and A unset: only generate prefix routes
2. L unset and A set: only generate an IPv6 address
As we know, if these two cases are not handled, FRR can't install the right routes.
The creation of the on-link routes, SLAAC addresses, and default routes in response to a received RA is handled entirely by the kernel, not FRR. After the addresses/routes have been installed by the kernel, they are visible/usable within FRR as routes with either proto Kernel or proto Connected. In other words, if the kernel has done its job and set up the NDP addresses/routes correctly, then FRR will handle them like any other address or connected/default route.
For example:
Here's an ubuntu 18.04 VM I have up and running that is connected to another VM running FRR via enp1s0.
I have enp1s0 configured with accept_ra=2 and autoconf=1 to ensure that the kernel will use SLAAC to generate any addresses/routes needed based on any RAs it receives.
[5:18:02] root@ub18:~
# sysctl -a |& grep enp1s0 | grep 'autoconf\|accept_ra '
net.ipv6.conf.enp1s0.accept_ra = 2
net.ipv6.conf.enp1s0.autoconf = 1
As you can see, there's only a link-local v6 address to begin with:
[5:18:05] root@ub18:~
# ip a s enp1s0
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:11:e2:e2 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.51/24 brd 192.168.122.255 scope global enp1s0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe11:e2e2/64 scope link
valid_lft forever preferred_lft forever
Upon receiving an RA from the other VM, we can see that the kernel has autogenerated a local address, a subnet route matching the prefix in the RA, and a default route via the RA's link-local address:
[5:18:09] root@ub18:~
# tcpdump -eni enp1s0 icmp6 -vvv
05:18:19.238499 52:54:00:9a:e2:ca > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 110: (flowlabel 0x21e7c, hlim 255, next-header ICMPv6 (58) payload length: 56) fe80::5054:ff:fe9a:e2ca > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 56
hop limit 64, Flags [none], pref medium, router lifetime 1800s, reachable time 0ms, retrans timer 0ms
prefix info option (3), length 32 (4): 2001:dead:beef:cafe::/64, Flags [onlink, auto], valid time 2592000s, pref. time 604800s
0x0000: 40c0 0027 8d00 0009 3a80 0000 0000 2001
0x0010: dead beef cafe 0000 0000 0000 0000
source link-address option (1), length 8 (1): 52:54:00:9a:e2:ca
0x0000: 5254 009a e2ca
[5:22:50] root@ub18:~
# ip a s enp1s0
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:11:e2:e2 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.51/24 brd 192.168.122.255 scope global enp1s0
valid_lft forever preferred_lft forever
inet6 2001:dead:beef:cafe:5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 2591776sec preferred_lft 604576sec
inet6 fe80::5054:ff:fe11:e2e2/64 scope link
valid_lft forever preferred_lft forever
[5:22:52] root@ub18:~
# ip -6 route show dev enp1s0
2001:dead:beef:cafe::/64 proto kernel metric 256 expires 2591774sec pref medium
2001:dead:beef:cafe::/64 proto ra metric 1024 pref medium
fe80::/64 proto kernel metric 256 pref medium
default via fe80::5054:ff:fe9a:e2ca proto ra metric 1024 hoplimit 64 pref medium
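For reference, the interface ID of the SLAAC address above matches what a modified EUI-64 derivation from the interface MAC would produce. The sketch below assumes EUI-64 address generation is in use (the thread doesn't say which address generation mode is configured), and the function name is made up for this example.

```python
import ipaddress


def slaac_eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a modified EUI-64 interface ID built from a MAC."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("modified EUI-64 needs a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC to get 64 bits.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return network.network_address + int.from_bytes(interface_id, "big")


# Prefix from the RA and MAC of enp1s0, both taken from the output above.
print(slaac_eui64_address("2001:dead:beef:cafe::/64", "52:54:00:11:e2:e2"))
# 2001:dead:beef:cafe:5054:ff:fe11:e2e2
```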
and in FRR we can see the SLAAC address/routes:
[5:24:02] root@ub18:~
# vtysh
Hello, this is FRRouting (version 7.7-dev).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
ub18# show int vrf default br
Interface Status VRF Addresses
--------- ------ --- ---------
dummy20 up default
dummy30 up default
enp1s0 up default 192.168.122.51/24
+ 2001:dead:beef:cafe:5054:ff:fe11:e2e2/64
enp6s0 up default 192.168.123.51/24
lo up default 1.1.1.1/32
100.64.0.1/32
100.64.0.11/32
100.64.0.111/32
vni10 up default
vni20 up default
vni30 up default
ub18# show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, T - Table, A - Babel,
D - SHARP, F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* ::/0 [0/1024] via fe80::5054:ff:fe9a:e2ca, enp1s0, 00:05:55
C>* 2001:dead:beef:cafe::/64 is directly connected, enp1s0, 00:05:53
K * 2001:dead:beef:cafe::/64 [0/1024] is directly connected, enp1s0, 00:05:55
C * fe80::/64 is directly connected, enp1s0, 00:06:47
C>* fe80::/64 is directly connected, enp6s0, 23:34:47
and the default route from NDP can be used in FRR like any other route:
ub18# conf t
ub18(config)# router bgp
ub18(config-router)# address-family ipv6 unicast
ub18(config-router-af)# redistribute kernel
ub18(config-router-af)# redistribute connected
ub18(config-router-af)# do show bgp ipv6 unicast
BGP table version is 2, local router ID is 100.64.0.111, vrf id 0
Default local pref 100, local AS 65541
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> ::/0 fe80::5054:ff:fe9a:e2ca
1024 32768 ?
*> 2001:dead:beef:cafe::/64
:: 0 32768 ?
Displayed 2 routes and 2 total paths
So like I mentioned before, the situation you described (learn default/on-link routes through ND) already works using native kernel logic, and the routes can be used/redistributed by routing protocol daemons.
If you're seeing issues with how the kernel is handling certain RA messages/flags, then it would probably make more sense to figure out why those issues are happening than to move the handling of received RA's away from the kernel and into FRR.
If there's a specific situation/issue you're having troubles with, feel free to reach out on the FRRouting slack channel and we can try to give help/ideas.
For what it's worth, I did also try configuring the other FRR VM to send two additional prefixes in the RA (2001::/64 with L=1/A=0, and 3001::/64 with L=0/A=1) and it seems that the kernel running on 18.04 does what you'd expect based on those flags:
[5:42:13] root@ub18:~
# tcpdump -eni enp1s0 icmp6 -vvv
tcpdump: listening on enp1s0, link-type EN10MB (Ethernet), capture size 262144 bytes
05:43:42.752386 52:54:00:9a:e2:ca > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 174: (flowlabel 0x21e7c, hlim 255, next-header ICMPv6 (58) payload length: 120) fe80::5054:ff:fe9a:e2ca > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 120
hop limit 64, Flags [none], pref medium, router lifetime 1800s, reachable time 0ms, retrans timer 0ms
prefix info option (3), length 32 (4): 2001:dead:beef:cafe::/64, Flags [onlink, auto], valid time 2592000s, pref. time 604800s
0x0000: 40c0 0027 8d00 0009 3a80 0000 0000 2001
0x0010: dead beef cafe 0000 0000 0000 0000
prefix info option (3), length 32 (4): 2001::/64, Flags [onlink], valid time 2592000s, pref. time 604800s
0x0000: 4080 0027 8d00 0009 3a80 0000 0000 2001
0x0010: 0000 0000 0000 0000 0000 0000 0000
prefix info option (3), length 32 (4): 3001::/64, Flags [auto], valid time 2592000s, pref. time 604800s
0x0000: 4040 0027 8d00 0009 3a80 0000 0000 3001
0x0010: 0000 0000 0000 0000 0000 0000 0000
source link-address option (1), length 8 (1): 52:54:00:9a:e2:ca
0x0000: 5254 009a e2ca
[5:46:20] root@ub18:~
# ip a s enp1s0
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:11:e2:e2 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.51/24 brd 192.168.122.255 scope global enp1s0
valid_lft forever preferred_lft forever
inet6 3001::5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 2591842sec preferred_lft 604642sec
inet6 2001:dead:beef:cafe:5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 2591842sec preferred_lft 604642sec
inet6 fe80::5054:ff:fe11:e2e2/64 scope link
valid_lft forever preferred_lft forever
[5:46:23] root@ub18:~
# ip -6 route show dev enp1s0
2001::/64 proto kernel metric 256 expires 2591839sec pref medium
2001::/64 proto ra metric 1024 pref medium
2001:dead:beef:cafe::/64 proto kernel metric 256 expires 2591839sec pref medium
2001:dead:beef:cafe::/64 proto ra metric 1024 pref medium
fe80::/64 proto kernel metric 256 pref medium
default via fe80::5054:ff:fe9a:e2ca proto ra metric 1024 hoplimit 64 pref medium
[5:46:25] root@ub18:~
# uname -r
4.15.0-140-generic
SLAAC created an address in 3001::/64 but no onlink route, and an onlink route for 2001::/64 with no address.
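To tie the flags to the results, here is a small sketch that decodes the flags byte of each prefix information option from the hex dumps above (the second byte of each dump: c0, 80, 40). The bit positions are the L and A bits of the Prefix Information option. The "expected result" fields assume autoconf=1 as configured earlier, and the function name is hypothetical.

```python
def decode_prefix_info_flags(flags: int) -> dict:
    """Decode the L (on-link) and A (autonomous) bits of a prefix information option."""
    on_link = bool(flags & 0x80)     # L flag
    autonomous = bool(flags & 0x40)  # A flag
    return {
        "on_link": on_link,
        "autonomous": autonomous,
        # Expected result on the receiving host, assuming autoconf=1:
        "prefix_route": on_link,
        "slaac_address": autonomous,
    }


# Flag bytes as they appear in the tcpdump output above.
for prefix, flags in (
    ("2001:dead:beef:cafe::/64", 0xC0),
    ("2001::/64", 0x80),
    ("3001::/64", 0x40),
):
    print(prefix, decode_prefix_info_flags(flags))
```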
Could you attach the route information from the zebra RIB through vtysh, especially for the RA prefix 3001::/64 with L=0/A=1? I remember there being something wrong in that situation: a connected route (3001::/64) gets installed in the zebra RIB.
[14:22:52] root@ub18:~
# ip a s enp1s0
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:11:e2:e2 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.51/24 brd 192.168.122.255 scope global enp1s0
valid_lft forever preferred_lft forever
inet6 3001::5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 2591978sec preferred_lft 604778sec
inet6 2001:dead:beef:cafe:5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 2591978sec preferred_lft 604778sec
inet6 fe80::5054:ff:fe11:e2e2/64 scope link
valid_lft forever preferred_lft forever
[14:22:55] root@ub18:~
# ip -6 route show dev enp1s0
2001::/64 proto kernel metric 256 expires 2591974sec pref medium
2001::/64 proto ra metric 1024 pref medium
2001:dead:beef:cafe::/64 proto kernel metric 256 expires 2591974sec pref medium
2001:dead:beef:cafe::/64 proto ra metric 1024 pref medium
fe80::/64 proto kernel metric 256 pref medium
default via fe80::5054:ff:fe9a:e2ca proto ra metric 1024 hoplimit 64 pref medium
[14:22:58] root@ub18:~
# vtysh -c 'show ipv6 route'
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, T - Table, A - Babel,
D - SHARP, F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* ::/0 [0/1024] via fe80::5054:ff:fe9a:e2ca, enp1s0, 00:00:16
K>* 2001::/64 [0/1024] is directly connected, enp1s0, 00:00:16
K * 2001:dead:beef:cafe::/64 [0/1024] is directly connected, enp1s0, 00:00:16
C>* 2001:dead:beef:cafe::/64 is directly connected, enp1s0, 00:00:16
C>* 3001::/64 is directly connected, enp1s0, 00:00:16
C * fe80::/64 is directly connected, enp6s0, 00:00:16
C>* fe80::/64 is directly connected, enp1s0, 00:00:16
C>* 3001::/64 is directly connected, enp1s0, 00:00:16
This connected route shouldn't be in the zebra RIB, and it doesn't actually exist in the kernel.
Sure, but that's a separate discussion from what you filed this issue for.
Zebra derives the "connected" routes using the netmask/prefix length of the addresses on the interface, i.e. inet6 3001::5054:ff:fe11:e2e2/64 scope global dynamic mngtmpaddr noprefixroute. The connected /64 route comes from the fact that the prefix length for the local address is 64. Based on the current logic, it doesn't come from the presence of a route in the kernel.
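To illustrate the idea (this is not zebra's code, just a sketch of deriving a connected prefix from an address and its prefix length, with a made-up function name):

```python
import ipaddress


def connected_prefix(address_with_len: str) -> ipaddress.IPv6Network:
    """Return the network implied by an interface address and its prefix length."""
    return ipaddress.IPv6Interface(address_with_len).network


# The SLAAC address from the A-flag-only prefix in the output above.
print(connected_prefix("3001::5054:ff:fe11:e2e2/64"))
# 3001::/64
```

Nothing here looks at the kernel routing table, which is why the /64 shows up as Connected even though the kernel has no on-link route for it.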
If zebra were to rely solely on routes with an egress interface and no next-hop (effectively onlink) then any routes without a next-hop would be considered Connected, which is inaccurate. It may be possible to update the way zebra detects "connected" routes, but I doubt that would be a trivial change given it has the potential to impact all IPv6 connected/onlink routes.
As it sits now, the topic you filed this ticket for is a non-issue since default + on-link routes are learned from NDP and are present in the RIB.
If you want to open up a separate discussion around updating the way Connected routes are detected that's fine, but I think that should be its own discussion and warrants its own github issue.
no what we are doing is correct imo. There is no need for a discussion. @taspelund is correctly describing FRR behavior
Thank you for your answer. Sure, but I do hope that FRR can deal with this problem just like Cisco/Juniper/Huawei routers.
There are two problems about this on Cisco Community:
https://learningnetwork.cisco.com/s/question/0D53i00000Kt7Fi/ipv6-address-autoconfig-and-its-nd-0-route
https://learningnetwork.cisco.com/s/question/0D53i00000Kt1piCAB/solved-ipv6-router-does-not-get-its-default-route-through-slaac
I understand what @taspelund describes, but I don't agree that "what we are doing is correct". I think whether FRR's behavior is right or wrong depends on the scenario in which the network equipment is running.
Regarding the topic in the subject line of this issue, the Cisco links don't really explain what you're requesting. In the examples from either community discussion you linked, the "default-route" keyword is effectively equivalent to ensuring you've configured the sysctl parameters correctly on the interface receiving RAs (forwarding=1, accept_ra=2, accept_ra_defrtr=1, autoconf=1), which we've seen above works just fine. If that doesn't address your question, then it sounds like what you're wanting is for zebra to change the way it displays these routes. It sounds like you want something to indicate that a kernel route came from NDP (maybe a new "proto"?) as opposed to the routes appearing with proto connected/kernel like they do today. Is that correct?
As for the Connected route for RA prefixes with A-flag only, I agree that today's behavior doesn't line up 1:1 with what you'd intuitively expect, but I'm not understanding what you're functionally unable to accomplish with FRR the way it operates today.
Can you give a real world example of when you'd use SLAAC to autogenerate an address but not an onlink route? Or a real world example where the connected route in the RIB taken from the prefix length of the SLAAC address creates an issue that cannot be worked around using a route-map or similar config in FRR?
It's hard to say that FRR is doing the wrong thing without understanding specifically what problems are being caused by the existing behavior.
I'm looking to deploy FRR and IPv6 SLAAC on an IPv6-only network, but I discovered that FRR doesn't support this.
There are two problems about this on Cisco Community:
https://learningnetwork.cisco.com/s/question/0D53i00000Kt1piCAB/solved-ipv6-router-does-not-get-its-default-route-through-slaac
Some information about IPv6 routes through SLAAC:
router1 #show ipv6 route
IPv6 Routing Table - default - 3 entries
Codes: C - Connected, L - Local, S - Static, U - Per-user Static route
B - BGP, HA - Home Agent, MR - Mobile Router, R - RIP
H - NHRP, I1 - ISIS L1, I2 - ISIS L2, IA - ISIS interarea
IS - ISIS summary, D - EIGRP, EX - EIGRP external, NM - NEMO
ND - ND Default, NDp - ND Prefix, DCE - Destination, NDr - Redirect
O - OSPF Intra, OI - OSPF Inter, OE1 - OSPF ext 1, OE2 - OSPF ext 2
ON1 - OSPF NSSA ext 1, ON2 - OSPF NSSA ext 2, l - LISP
ND ::/0 [2/0] via FE80::1, GigabitEthernet1/0
NDp 2001::/64 [2/0] via GigabitEthernet1/0, directly connected
L FF00::/8 [0/0] via Null0, receive