Open tlsalmin opened 3 years ago
As a workaround, iterate over any CHANGE objects. If a route has more than one nexthop, loop over the nexthops with iter1 and, for each one, loop over the nexthops that follow it (iter1 + 1 onward, iter2). If
rtnl_route_nh_compare(iter1, iter2, 0xffffffff, true)
matches, remove iter1 and start over until no matches remain.
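The loop described above can be sketched roughly as follows. This is only a model of the idea, not libnl code: `struct nexthop`, `nh_equal()` and `dedup_nexthops()` are hypothetical names, and a plain array stands in for the route's nexthop list, with `nh_equal()` taking the place of the `rtnl_route_nh_compare()` check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a libnl nexthop: gateway text plus ifindex. */
struct nexthop {
    char gateway[64];
    int ifindex;
};

/* Stand-in for the rtnl_route_nh_compare() check: true when both match. */
static bool nh_equal(const struct nexthop *a, const struct nexthop *b)
{
    return a->ifindex == b->ifindex && strcmp(a->gateway, b->gateway) == 0;
}

/* Remove every nexthop (iter1) that has an identical entry later in the
 * array (iter2). Returns the new count; surviving entries keep their order. */
size_t dedup_nexthops(struct nexthop *nh, size_t count)
{
    assert(nh != NULL || count == 0);

    size_t i = 0;
    while (i < count) {
        bool dup = false;
        for (size_t j = i + 1; j < count; j++) {
            if (nh_equal(&nh[i], &nh[j])) {
                dup = true;
                break;
            }
        }
        if (dup) {
            /* Drop iter1 and re-check whatever slid into its slot. */
            memmove(&nh[i], &nh[i + 1], (count - i - 1) * sizeof(*nh));
            count--;
        } else {
            i++;
        }
    }
    return count;
}
```

Note that this only collapses identical nexthops; it does nothing about stale nexthops that differ from the new one.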
Hi, I'm not sure if this is the same issue, but I ran into a problem with duplicated nexthops while working on IPv6 ECMP routes. I resolved it with a small patch to libnl, which I've now submitted as PR #290. I hope it helps!
Sorry, that doesn't help with this issue. I applied the patch to 3.5.0 and get the same results:
~/src/random_tests ~./nl_mngr_bug &
[1] 73033
~/src/random_tests ~ip -6 r
::1 dev lo proto kernel metric 256 pref medium
anycast fe80:: dev enp34s0 proto kernel metric 0 pref medium
anycast fe80:: dev tun0 proto kernel metric 0 pref medium
fe80::/64 dev enp34s0 proto kernel metric 256 pref medium
fe80::/64 dev tun0 proto kernel metric 256 pref medium
multicast ff00::/8 dev enp34s0 proto kernel metric 256 pref medium
multicast ff00::/8 dev tun0 proto kernel metric 256 pref medium
~/src/random_tests ~sudo ip -6 r a fd99::100/128 via fe80::58e5:27ff:fea6:f594 dev enp34s0
Received 1 of route [inet6 fd99::100 table main type unicast
scope global priority 0x400 protocol boot
nexthop via fe80::58e5:27ff:fea6:f594 dev enp34s0 <>
]
~/src/random_tests ~sudo ip -6 r chg fd99::100/128 via fe80::4609:b8ff:fe4e:1a1b dev enp34s0
Received 5 of route [inet6 fd99::100 table main type unicast
scope global priority 0x400 protocol boot
nexthop via fe80::58e5:27ff:fea6:f594 dev enp34s0 <>
nexthop via fe80::4609:b8ff:fe4e:1a1b dev enp34s0 <>
]
~/src/random_tests ~sudo ip -6 r chg fd99::100/128 via fe80::885b:7aff:fe5f:653c dev enp34s0
Received 5 of route [inet6 fd99::100 table main type unicast
scope global priority 0x400 protocol boot
nexthop via fe80::58e5:27ff:fea6:f594 dev enp34s0 <>
nexthop via fe80::4609:b8ff:fe4e:1a1b dev enp34s0 <>
nexthop via fe80::885b:7aff:fe5f:653c dev enp34s0 <>
]
~/src/random_tests ~ip -6 r
::1 dev lo proto kernel metric 256 pref medium
fd99::100 via fe80::885b:7aff:fe5f:653c dev enp34s0 metric 1024 pref medium
anycast fe80:: dev enp34s0 proto kernel metric 0 pref medium
anycast fe80:: dev tun0 proto kernel metric 0 pref medium
fe80::/64 dev enp34s0 proto kernel metric 256 pref medium
fe80::/64 dev tun0 proto kernel metric 256 pref medium
multicast ff00::/8 dev enp34s0 proto kernel metric 256 pref medium
multicast ff00::/8 dev tun0 proto kernel metric 256 pref medium
~/src/random_tests ~ip -6 r d fd99::100
RTNETLINK answers: Operation not permitted
~/src/random_tests ~sudo ip -6 r d fd99::100
Received 5 of route [inet6 fd99::100 table main type unicast
scope global priority 0x400 protocol boot
nexthop via fe80::58e5:27ff:fea6:f594 dev enp34s0 <>
nexthop via fe80::4609:b8ff:fe4e:1a1b dev enp34s0 <>
]
~/src/random_tests ~
Also, my workaround doesn't work with change. I think it does the same thing as the patch: it only removes duplicates of the same nexthop, but doesn't fix the behaviour when NLM_F_REPLACE is set in the netlink message.
You're right, I was facing an issue with having a nexthop X for example 3 times in the same route and this is why the patch helped.
I don't see lib/route/route_obj.c taking NLM_F_REPLACE into consideration at all. On the kernel side, at least, fib6_add_rt2node removes all siblings (nexthops) when NLM_F_REPLACE is present, provided they are purely RTF_GATEWAY routes.
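To illustrate the difference, here is a minimal sketch of a cache update that honours the replace flag. It is neither libnl's code nor the fix in any PR: `struct cached_route`, `apply_route_update()` and `SKETCH_NLM_F_REPLACE` are hypothetical names (the constant mirrors the value of NLM_F_REPLACE in <linux/netlink.h>), and the kernel's RTF_GATEWAY-only condition is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NH 8
/* Same value as NLM_F_REPLACE in <linux/netlink.h>; redefined here only
 * so that the sketch builds without kernel headers. */
#define SKETCH_NLM_F_REPLACE 0x100

/* Hypothetical stand-ins for a cached route and its nexthops. */
struct nexthop {
    char gateway[64];
    int ifindex;
};

struct cached_route {
    struct nexthop nh[MAX_NH];
    size_t nh_count;
};

/* Apply the nexthops of a new-route message to the cached route.
 * With the replace flag the old nexthops are discarded first; without it
 * the new ones are appended. Returns 0 on success, -1 if they do not fit. */
int apply_route_update(struct cached_route *cached,
                       const struct nexthop *update, size_t update_count,
                       unsigned int nlmsg_flags)
{
    assert(cached != NULL && update != NULL);

    size_t base = (nlmsg_flags & SKETCH_NLM_F_REPLACE) ? 0 : cached->nh_count;
    if (update_count > MAX_NH - base)
        return -1; /* cached route is left untouched */

    memcpy(&cached->nh[base], update, update_count * sizeof(*update));
    cached->nh_count = base + update_count;
    return 0;
}
```

Replaying the `ip -6 r chg` commands from the log against this model leaves a single nexthop when the flag is passed, and an ever-growing list when it is ignored, which is what the cache shows.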
Added a pull request with a fix: https://github.com/thom311/libnl/pull/293
Given the test program that uses nl cache manager to listen for route updates:
Any IPv6 route changes will cause the cache handled by nl_cache_mngr to go out of sync:
I'll have a look later to patch this. But if someone happens to have more time, go ahead.
Submitted on behalf of Forcepoint Finland.
libnl version was libnl-3.5.0 on Gentoo.