ClusterLabs / resource-agents

Combined repository of OCF agents from the RHCS and Linux-HA projects
GNU General Public License v2.0

IPaddr2/findif: more than 1 matching routes error #1976

Closed RichardVine closed 2 months ago

RichardVine commented 2 months ago

Similar to [#1963], I'm facing a problem using IPv6 addresses with IPaddr2 since the 4.15 release of resource-agents. The agent fails with the error 'More than 1 routes match nn:nn:nn:nn/128. Unable to decide which route to use.'

In my case, the issue is different because there really are multiple routes that match:

ip -o -f inet6 route list match 2222:3333:4444::a00:8/128 | grep -v "^\(unreachable\|prohibit\|blackhole\)" | grep "dev br0 " | sed -e 's,^\([0-9.]\+\) ,\1/32 ,;s,^\([0-9a-f:]\+\) ,\1/128 ,' | sort -t/ -k2,2nr | grep -v "^default"
2222:3333:4444::/64 dev br0 proto kernel metric 256 pref medium
2222:3333:4444::/64 dev br0 proto ra metric 1024 expires 6870sec pref medium
2222:3333:4444::/48 via fe80::b2f2:8ff:fedf:bb27 dev br0 proto ra metric 1024 expires 1470sec pref medium

Two of these routes are learnt via Router Advertisements (RAs) with different prefix lengths (/48 and /64); the third is assigned by the kernel. Based on the routes in the routing table, I can see why findif complains that there are 3 matching routes.

However, from the commit notes, I can't see the relevance of this check, which was not part of the prior 4.14 release. If I comment out the relevant section of code as follows, 'findif' continues without any issue and the IP address is added to the correct interface.

#  if [ $(echo "$routematch" | wc -l) -gt 1 ]; then
#    ocf_exit_reason "More than 1 routes match $match. Unable to decide which route to use."
#    return $OCF_ERR_GENERIC
#  fi
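
To illustrate what this check reacts to, here is a minimal sketch that applies the same kind of line count to the three routes listed above. The helper name count_routes is made up for this example and is not part of findif.sh; the real check uses 'wc -l' on $routematch.

```shell
# Routes as listed earlier in this comment.
routematch='2222:3333:4444::/64 dev br0 proto kernel metric 256 pref medium
2222:3333:4444::/64 dev br0 proto ra metric 1024 expires 6870sec pref medium
2222:3333:4444::/48 via fe80::b2f2:8ff:fedf:bb27 dev br0 proto ra metric 1024 expires 1470sec pref medium'

# count_routes is a hypothetical helper (not part of findif.sh).
# It prints the number of non-empty lines in its first argument.
count_routes() {
    printf '%s\n' "$1" | grep -c .
}

n=$(count_routes "$routematch")
if [ "$n" -gt 1 ]; then
    # Same condition as the check in findif: more than one candidate route.
    echo "More than 1 routes match ($n found)"
fi
```

With the sample above the count is 3, so the condition is true and the agent would give up rather than pick one of the routes.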

It's possible I'm adding IPv6 addresses incorrectly; I've seen other examples where they are added to the loopback interface. This 'works', though the IP address is then inaccessible across the network, so I'm likely misunderstanding how this should work.

Other than hacking out this check, which is presumably there for a reason, any suggestions on how I should use IPaddr2 to add an IPv6 address to a Pacemaker cluster?

Thanks.

oalbrigt commented 2 months ago

The agent is made for use with static IPs, as you don't want your router to give the same IP to another device by accident: https://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/IPaddr2#L191-L196

The check is there to avoid the possibility of the agent using the incorrect route when more than one route matches.

RichardVine commented 2 months ago

Thanks.

The interfaces I'm using do have static IPv4 and IPv6 addresses; it's only the IPv6 routes that are learnt dynamically via RAs from the router that are causing this check to 'fail'.

What I'm confused about is the purpose of findif when I've already specified both the interface to use and the netmask. From what I can see, it returns the following back to IPaddr2:

echo "$nic netmask $netmask broadcast $brdcast metric $metric"

The NIC and netmask are already known (in my case), there is no broadcast address for IPv6 and it's only the metric that is obtained. Based on the routes found, this would either be 256 or 1024 - by disabling the duplicate check, it uses the first match which is 256.

2222:3333:4444::a00:8/128 dev br0 proto kernel metric 256 pref medium   <==
2222:3333:4444::/64 dev br0 proto kernel metric 256 pref medium
2222:3333:4444::/64 dev br0 proto ra metric 1024 expires 6888sec pref medium
2222:3333:4444::/48 via fe80::b2f2:8ff:fedf:bb27 dev br0 proto ra metric 1024 expires 1488sec pref medium

It's the presence of the RA learnt routes that seems to be the issue. It seems that findif nearly has an option that could deal with that.

routematch=$(ip -o -f $family route list match $match $proto $scope | grep -v "^\(unreachable\|prohibit\|blackhole\)" | grep "dev $nic " | sed -e 's,^\([0-9.]\+\) ,\1/32 ,;s,^\([0-9a-f:]\+\) ,\1/128 ,' | sort -t/ -k2,2nr)

If $proto was set to 'proto kernel', it would only pick up the single kernel route and ignore the RA routes. However, although that variable seems to exist within findif, it doesn't seem possible to use it (without changing the code) and it's always null as far as I can tell.

I'll experiment with that and see if it works!
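
As a rough illustration of the idea, the sketch below emulates a 'proto kernel' filter on the three routes posted in my first comment. keep_proto is a hypothetical helper that filters already-captured text with grep; findif itself would pass $proto to the ip command instead.

```shell
# Unfiltered routes from the first comment in this thread.
routes='2222:3333:4444::/64 dev br0 proto kernel metric 256 pref medium
2222:3333:4444::/64 dev br0 proto ra metric 1024 expires 6870sec pref medium
2222:3333:4444::/48 via fe80::b2f2:8ff:fedf:bb27 dev br0 proto ra metric 1024 expires 1470sec pref medium'

# keep_proto PROTO TEXT (hypothetical helper, not part of findif.sh):
# print only the lines of TEXT whose route protocol is PROTO.
keep_proto() {
    printf '%s\n' "$2" | grep " proto $1 "
}

# Only the kernel-installed /64 route remains; both RA routes are dropped.
keep_proto kernel "$routes"
```

Note that when the filter is given to ip directly, as in 'ip -o -f inet6 route list match ... proto kernel', the 'proto kernel' field is not printed in the output.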

oalbrigt commented 2 months ago

We use that in IPsrcaddr, so I guess we can copy that over to IPaddr2.

RichardVine commented 2 months ago

Thank you, but I'd hold fire for now; I think that even with proto hard-coded as 'proto kernel', findif still finds duplicate routes, which I don't understand.

From a command line, the following is working ok and only matches on a single route:

ip -o -f inet6 route list match 2222:3333:4444::a00:8/128 proto kernel

But when I modify findif and start the resource, it seems to create the IP address, and then duplicate routes are found: the 'duplicate' is in fact the IP address that was just created, so I'm not sure what's happening there. I'll get some debug traces and work out what's going on.

RichardVine commented 2 months ago

After some further testing, using 'proto kernel' on the 'ip route list match' command works when starting the IPv6 address. Only a single kernel route is found so the IP address is added normally.

However, the monitor command then fails because it finds two routes. One is the normal kernel route previously found, but the other is the route for the newly added address.

For this trace, 2222:3333:4444:0:5:5:a00:5/128 has previously been added successfully by the start command, then the monitor command is checking:

+++ 07:50:58: findif:222: ip -o -f inet6 route list match 2222:3333:4444:0:5:5:a00:5/128 proto kernel
+++ 07:50:58: findif:222: grep 'dev br0 '
+++ 07:50:58: findif:222: sed -e 's,^\([0-9.]\+\) ,\1/32 ,;s,^\([0-9a-f:]\+\) ,\1/128 ,'
+++ 07:50:58: findif:222: sort -t/ -k2,2nr
++ 07:50:58: findif:222: routematch='2222:3333:4444:0:5:5:a00:5/128 dev br0 metric 256 pref medium
2222:3333:4444::/64 dev br0 metric 256 pref medium'
++ 07:50:58: findif:226: '[' inet6 = inet6 ']'
+++ 07:50:58: findif:227: echo '2222:3333:4444:0:5:5:a00:5/128 dev br0 metric 256 pref medium
2222:3333:4444::/64 dev br0 metric 256 pref medium'
+++ 07:50:58: findif:227: grep -v '^default'
++ 07:50:58: findif:227: routematch='2222:3333:4444:0:5:5:a00:5/128 dev br0 metric 256 pref medium
2222:3333:4444::/64 dev br0 metric 256 pref medium'
+++ 07:50:58: findif:230: echo '2222:3333:4444:0:5:5:a00:5/128 dev br0 metric 256 pref medium
2222:3333:4444::/64 dev br0 metric 256 pref medium'
+++ 07:50:58: findif:230: wc -l
++ 07:50:58: findif:230: '[' 2 -gt 1 ']'
++ 07:50:58: findif:231: ocf_exit_reason 'More than 1 routes match 2222:3333:4444:0:5:5:a00:5/128. Unable to decide which route to use.'

This is slightly difficult to read as the 'routematch' entries show two lines 'merged', but 'ip route' found two routes:

2222:3333:4444:0:5:5:a00:5/128 dev br0 metric 256 pref medium
2222:3333:4444::/64 dev br0 metric 256 pref medium

Because the start/monitor/stop happens very quickly, I can't show the output from 'ip route list match' while this is happening. With the duplicate check temporarily disabled, 'ip -o -f inet6 route list match 2222:3333:4444::5:5:a00:5/128 proto kernel' results in the following routes:

Start, IP address not yet active:

2222:3333:4444::/64 dev br0 metric 256 pref medium

Monitor, after IP address active:

2222:3333:4444:0:5:5:a00:5 dev br0 metric 256 pref medium     <== route added for new VIP
2222:3333:4444::/64 dev br0 metric 256 pref medium

Stop, IP address deactivated:

2222:3333:4444::/64 dev br0 metric 256 pref medium

So despite using 'proto kernel' to hide the 'ra' learnt routes, there are still duplicate routes when the IP address is active, which confuses the monitor check.

Having seen this, I don't understand how duplicate checking can ever work for IPv6 addresses (routes do not get added like this for IPv4 addresses); duplicates will always be found once the IPv6 address is active.

At least, that's the case on my system so perhaps it's related to the Linux kernel and IP route versions, which are:

iproute2-6.10.0 kernel-6.10.8

At least this has confirmed that enabling 'proto kernel' in IPaddr2/findif won't resolve this issue, so unless there are other reasons, there's probably no point exploring that further.

So far, the only workaround I've found is to continue with the duplicate check disabled. The only apparent consequence is that the 'wrong' metric may be retrieved, since the NIC and netmask are already known. Presumably this would have been the case anyway before the duplicate check was implemented.

oalbrigt commented 2 months ago

Great info.

It seems like the issue is due to you using a /128 IP here, which creates another subnet in the routing table. If you use /64 instead it should work without any issues.

I'll do some testing and see if we can fix that issue as well.

RichardVine commented 2 months ago

Thank you. I originally used /128 in an attempt to stop this becoming a source IP address but then found the 'preferred_lft' parameter should be used for that. I'll try /64 as you've suggested, when I get some downtime later!

oalbrigt commented 2 months ago

I also got suggestions that you might want to set the gateway manually on the interface to avoid getting it from RA.

Or manually setting the IP/gateway on the devices (without using dhcp/ra).

RichardVine commented 2 months ago

Good suggestion. It's complicated by the router using a dynamically assigned ULA with no obvious option to override this, but I'll look into it.

oalbrigt commented 2 months ago

> Thank you. I originally used /128 in an attempt to stop this becoming a source IP address but then found the 'preferred_lft' parameter should be used for that. I'll try /64 as you've suggested, when I get some downtime later!

You can avoid it becoming the source IP by setting lvs_ipv6_addrlabel=true. There's a fix for newer or future releases of resource-agents that sets it that way by default, so that IPsrcaddr can be used to manage the source address (this also requires another recent patch, so IPsrcaddr with IPv6 might not yet work on your distro): https://github.com/ClusterLabs/resource-agents/pull/1951/files

RichardVine commented 2 months ago

I've changed the IPv6 addresses so they're now /64 and as suggested, additional routes no longer get defined for these addresses.

Setting 'lvs_ipv6_addrlabel=true' also works to stop these addresses being used as source IPs, even with 'preferred_lft=0' removed.
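
For reference, a sketch of how such a resource might be defined, assuming the cluster is managed with pcs. The resource name vip6 is hypothetical; the address and interface are the ones used in this thread. The helper only prints the command so it can be reviewed before being run on a cluster node.

```shell
# vip_create_cmd NAME ADDRESS NIC (hypothetical helper):
# print a pcs command that would create an IPaddr2 resource with a /64
# prefix and the IPv6 address label enabled.
vip_create_cmd() {
    printf '%s\n' "pcs resource create $1 ocf:heartbeat:IPaddr2 ip=$2 nic=$3 cidr_netmask=64 lvs_ipv6_addrlabel=true"
}

# Resource name is an assumption; address and interface come from the thread.
vip_create_cmd vip6 2222:3333:4444::a00:8 br0
```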

So both of these have had positive results so great suggestions.

I've not yet done anything about stopping the RA-discovered routes, but coding 'proto=proto kernel' within findif.sh for the 'ip route' command hides them and no duplicate routes are found.

Having said earlier that 'proto kernel' probably wasn't worth exploring for IPaddr2, it seems that it does in fact help where RA routes are used, at least in my case.

oalbrigt commented 2 months ago

Great. So does it fail if you remove 'proto=proto kernel' from findif?

RichardVine commented 2 months ago

Yes, with 'proto kernel' removed, it still fails with 'duplicate routes' because it sees the RA-learnt routes, e.g.:

# ip -o -f inet6 route list match 2222:3333:4444::a00:8/64 | grep -v "^\(unreachable\|prohibit\|blackhole\)" | grep "dev br0 " | sed -e 's,^\([0-9.]\+\) ,\1/32 ,;s,^\([0-9a-f:]\+\) ,\1/128 ,' | sort -t/ -k2,2nr | grep -v "^default"
2222:3333:4444::/64 dev br0 proto kernel metric 256 pref medium
2222:3333:4444::/64 dev br0 proto ra metric 1024 pref medium
2222:3333:4444::/48 via fe80::b2f2:8ff:fedf:bb27 dev br0 proto ra metric 1024 expires 1745sec pref medium

With 'proto kernel', it returns only the single kernel route that is installed when the first static IPv6 address is created (i.e. at boot, well before any virtual addresses are added):

# ip -o -f inet6 route list match 2222:3333:4444::a00:8/64 proto kernel | grep -v "^\(unreachable\|prohibit\|blackhole\)" | grep "dev br0 " | sed -e 's,^\([0-9.]\+\) ,\1/32 ,;s,^\([0-9a-f:]\+\) ,\1/128 ,' | sort -t/ -k2,2nr | grep -v "^default"
2222:3333:4444::/64 dev br0 metric 256 pref medium

oalbrigt commented 2 months ago

Thanks for confirming.

I made https://github.com/ClusterLabs/resource-agents/pull/1980 which will let you specify 'proto=kernel' in your Pacemaker config.
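
As a sketch only, and assuming the parameter is named 'proto' as described above and that the cluster is managed with pcs, an existing resource could be updated along these lines. The resource name vip6 is hypothetical, and the helper prints the command instead of running it.

```shell
# vip_proto_cmd NAME PROTO (hypothetical helper):
# print a pcs command that would set the 'proto' parameter on an
# existing IPaddr2 resource. Requires a resource-agents build that
# includes the change from the pull request above.
vip_proto_cmd() {
    printf '%s\n' "pcs resource update $1 proto=$2"
}

vip_proto_cmd vip6 kernel
```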

RichardVine commented 2 months ago

Thank you.