troglobit / smcroute

Static multicast routing for UNIX
https://troglobit.com/projects/smcroute/
GNU General Public License v2.0

Removing multicast routes with smcroutectl del does not work #167

Closed bombadil closed 3 years ago

bombadil commented 3 years ago

Looking at the autopkgtest in the Debian packaging for the 2.5.2 release, the removal of multicast routes seems to be unreliable now:

$ sudo perl debian/tests/mr-cache-ipv4
1..9
ok 1 - smcroute is running (pid 3585)
ok 2 - At least one multicast capable interface found: wlan0
ok 3 - Multicast routing cache is empty
ok 4 - adding multicast route 10.0.0.1->wlan0->wlan0->224.0.1.20 doesn't fail (return code: 0)
ok 5 - adding multicast route 10.0.0.1->wlan0->wlan0->224.0.1.20 doesn't generate any console output
ok 6 - Multicast routing cache now contains one entry
# Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
# 140100E0 0100000A 0          0        0        0  0:1  
# Command: smcroute -r wlan0 10.0.0.1 224.0.1.20
ok 7 - removing multicast route 10.0.0.1->wlan0->wlan0->224.0.1.20 doesn't fail (return code: 0)
ok 8 - removing multicast route 10.0.0.1->wlan0->wlan0->224.0.1.20 doesn't generate any console output
not ok 9 - Multicast routing cache is empty again
#   Failed test 'Multicast routing cache is empty again'
#   at debian/tests/mr-cache-ipv4 line 57.
#          got: '1'
#     expected: '0'
# Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
# 140100E0 0100000A 0          0        0        0
# Looks like you failed 1 test of 9.
$ cat /proc/net/ip_mr_cache
Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
140100E0 0100000A 0          0        0        0

Running smcroutectl flush didn't help:

$ sudo smcroutectl flush
$ cat /proc/net/ip_mr_cache
Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
140100E0 0100000A 0          0        0        0
$ 

After waiting a few minutes, the routing cache clears itself without any action.
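
For readers unfamiliar with the /proc/net/ip_mr_cache format shown above, here is a minimal Python sketch that decodes the dumps. It assumes a little-endian host (which is why 224.0.1.20 shows up as 140100E0) and that each Oifs field is a `vif:ttl` pair. The helper names `decode_addr` and `parse_mr_cache` are made up for illustration and are not part of SMCRoute or the Debian test.

```python
import socket
import struct

def decode_addr(hex_word):
    """Decode an 8-digit hex IPv4 address as printed on a little-endian host."""
    # Assumption: the kernel printed the network-order address as a native
    # little-endian integer, so repacking it little-endian restores the bytes.
    return socket.inet_ntoa(struct.pack("<I", int(hex_word, 16)))

def parse_mr_cache(text):
    """Return one dict per cache entry, skipping the header line."""
    entries = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 6:
            continue  # blank or truncated line
        entries.append({
            "group": decode_addr(fields[0]),
            "origin": decode_addr(fields[1]),
            "iif": int(fields[2]),
            "oifs": fields[6:],  # empty list when no outbound interface is left
        })
    return entries

# The two dumps from the test output above.
BEFORE = """Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
140100E0 0100000A 0          0        0        0  0:1
"""
AFTER = """Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
140100E0 0100000A 0          0        0        0
"""

if __name__ == "__main__":
    for name, dump in (("before", BEFORE), ("after", AFTER)):
        for entry in parse_mr_cache(dump):
            print(name, entry)
```

Read this way, the entry left behind after `smcroute -r` is still the (10.0.0.1, 224.0.1.20) route on VIF 0, only with its outbound interface list emptied, so the cache still counts one entry.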

Also the corresponding IPv6 test is failing in the same way.

Expectation: Running smcroutectl remove ... removes the matching route immediately, as was the case with smcroute 2.4.4, where this very same test passes.

troglobit commented 3 years ago

There has been a slight change in semantics, maybe that is what is happening here?

A call to smcroutectl remove removes the route if no outbound interfaces are listed. If outbound interfaces are included in the call, only those interfaces are removed. Hence, if all outbound interfaces are included, all of them are removed, but the stop route remains.
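
The semantics described here can be illustrated with a small toy model in Python. This is only a sketch of the behaviour as worded in this comment, not SMCRoute's implementation; the `RouteTable` class and its method names are hypothetical.

```python
class RouteTable:
    """Toy model: maps (iif, source, group) to a set of outbound interfaces."""

    def __init__(self):
        self.routes = {}

    def add(self, iif, source, group, oifs):
        key = (iif, source, group)
        self.routes.setdefault(key, set()).update(oifs)

    def remove(self, iif, source, group, oifs=()):
        key = (iif, source, group)
        if key not in self.routes:
            return
        if not oifs:
            # No outbound interfaces listed: drop the whole route.
            del self.routes[key]
        else:
            # Outbound interfaces listed: drop only those. The route itself
            # stays, possibly with no outbound interfaces (a "stop route").
            self.routes[key] -= set(oifs)


if __name__ == "__main__":
    table = RouteTable()
    table.add("wlan0", "10.0.0.1", "224.0.1.20", ["wlan0"])
    table.remove("wlan0", "10.0.0.1", "224.0.1.20")  # no oifs listed
    print(table.routes)  # the route is gone entirely
```

Under this model, the sequence used by the Debian test (a remove without outbound interfaces) should leave no entry behind, which is the expectation stated in the report.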

bombadil commented 3 years ago

Here are some debug logs:

Sep 21 10:51:22 hostname smcroute[4514]: SMCRoute v2.5.2
Sep 21 10:51:22 hostname smcroute[4514]: Found new interface lo, adding ...
Sep 21 10:51:22 hostname smcroute[4514]: Found new interface wlan0, adding ...
Sep 21 10:51:22 hostname smcroute[4514]: Found new interface virbr1, adding ...
Sep 21 10:51:22 hostname smcroute[4514]: Found lo, updating ...
Sep 21 10:51:22 hostname smcroute[4514]: Removing multicast VIFs for lo
Sep 21 10:51:22 hostname smcroute[4514]: Failed deleting VIF for iface lo: Resource temporarily unavailable
Sep 21 10:51:22 hostname smcroute[4514]: Failed deleting MIF for iface lo: Resource temporarily unavailable
Sep 21 10:51:22 hostname smcroute[4514]: Found wlan0, updating ...
Sep 21 10:51:22 hostname smcroute[4514]: Found virbr1, updating ...
Sep 21 10:51:22 hostname smcroute[4514]: Found virbr1, updating ...
Sep 21 10:51:22 hostname smcroute[4514]: Found lo, updating ...
Sep 21 10:51:22 hostname smcroute[4514]: Removing multicast VIFs for lo
Sep 21 10:51:22 hostname smcroute[4514]: Failed deleting VIF for iface lo: Resource temporarily unavailable
Sep 21 10:51:22 hostname smcroute[4514]: Failed deleting MIF for iface lo: Resource temporarily unavailable
Sep 21 10:51:22 hostname smcroute[4514]: Found wlan0, updating ...
Sep 21 10:51:22 hostname smcroute[4514]: Interface lo is not multicast capable, skipping VIF.
Sep 21 10:51:22 hostname smcroute[4514]: Map iface wlan0            => VIF 0  ifindex  2 flags 0x0008 TTL threshold 1
Sep 21 10:51:22 hostname smcroute[4514]: Map iface virbr1           => VIF 1  ifindex  3 flags 0x0008 TTL threshold 1
Sep 21 10:51:22 hostname smcroute[4514]: Interface lo is not multicast capable, skipping MIF.
Sep 21 10:51:22 hostname smcroute[4514]: Map iface wlan0            => MIF 0  ifindex  2 flags 0x0000 TTL threshold 1
Sep 21 10:51:22 hostname smcroute[4514]: Map iface virbr1           => MIF 1  ifindex  3 flags 0x0000 TTL threshold 1
Sep 21 10:51:22 hostname smcroute[4514]: NOFILE: current 1024 max 1048576
Sep 21 10:51:22 hostname smcroute[4514]: NOFILE: set new current 1048576 max 1048576
Sep 21 10:51:22 hostname smcroute[4514]: Binding IPC socket to /run/smcroute.sock
Sep 21 10:51:22 hostname smcroute[4514]: Creating PID file /run/smcroute.pid
Sep 21 10:51:22 hostname smcroute[4514]: Ready, waiting for client request or kernel event.
Sep 21 10:51:33 hostname smcroute[4514]: ipc: mroute: checking for input iface wlan0 ...
Sep 21 10:51:33 hostname smcroute[4514]: ipc: mroute: input iface wlan0 has vif 0
Sep 21 10:51:33 hostname smcroute[4514]: ipc: mroute: checking for wlan0 ...
Sep 21 10:51:33 hostname smcroute[4514]: ipc: mroute: Same outbound interface (wlan0) as inbound (wlan0) may cause routing loops.
Sep 21 10:51:33 hostname smcroute[4514]: mroute: adding route from wlan0 (10.0.0.1/32,224.0.1.20/32)
Sep 21 10:51:33 hostname smcroute[4514]: Add 10.0.0.1 -> 224.0.1.20 from VIF 0
Sep 21 10:51:33 hostname smcroute[4514]: ipc: mroute: checking for input iface wlan0 ...
Sep 21 10:51:33 hostname smcroute[4514]: ipc: mroute: input iface wlan0 has vif 0
Sep 21 10:51:33 hostname smcroute[4514]: mroute: deleting route from wlan0 (10.0.0.1/32,224.0.1.20/32)
Sep 21 10:51:33 hostname smcroute[4514]: Add 10.0.0.1 -> 224.0.1.20 from VIF 0
Sep 21 10:52:22 hostname smcroute[4514]: Cache timeout, flushing unused (*,G) routes!
Sep 21 10:53:22 hostname smcroute[4514]: Cache timeout, flushing unused (*,G) routes!
Sep 21 10:53:22 hostname smcroute[4514]: Checking (10.0.0.1,224.0.1.20) on wlan0, time to expire: last 2390 max 60 now: 2450
Sep 21 10:53:22 hostname smcroute[4514]:   -> Yup, stale route.
Sep 21 10:53:22 hostname smcroute[4514]: Del 10.0.0.1 -> 224.0.1.20 from VIF 0
Sep 21 10:53:38 hostname smcroute[4514]: Exiting.

What surprises me is that I see contradictory log messages. It talks about "deleting route from wlan0" but then adds a route to VIF 0 (could be a typo in the log message though).

Sep 21 10:51:33 hostname smcroute[4514]: mroute: deleting route from wlan0 (10.0.0.1/32,224.0.1.20/32)
Sep 21 10:51:33 hostname smcroute[4514]: Add 10.0.0.1 -> 224.0.1.20 from VIF 0

bombadil commented 3 years ago

I don't think I understand your comment about the remaining stop route.

The sequence of executed commands during test execution is:

smcroute -a wlan0 10.0.0.1 224.0.1.20 wlan0
smcroute -r wlan0 10.0.0.1 224.0.1.20

troglobit commented 3 years ago

Yeah that doesn't look right at all :-/

Thanks, I've reproduced locally and will add another test case for this based on https://salsa.debian.org/debian/smcroute/-/blob/master/debian/tests/mr-cache-ipv4 -- I'll see if I can find some time later tonight to investigate.

troglobit commented 3 years ago

Seems to be isolated to removing routes at runtime (IPC add/del). Reloading .conf files with a single route removed or interfaces changed is not affected.

troglobit commented 3 years ago

There, should be fixed in 2b81485. With this I think it's better to do another patch release, rather than your having to backport yet another fix. I need it anyway to push downstream to Buildroot.

Thanks for taking the time to report and explain it to me! I was heavily inspired by the Debian tests when I added the test suite during the summer, yet I somehow never added this set of basic tests. This has now been remedied in test/mrcache.sh and test/mrcache6.sh.