troglobit / pimd

PIM-SM/SSM multicast routing for UNIX and Linux
http://troglobit.com/projects/pimd/
BSD 3-Clause "New" or "Revised" License

Multicast routing table not updated on FreeBSD #57

Closed ocochard closed 8 years ago

ocochard commented 8 years ago

Here is my simple setup with two routers (R2 and R3), where I ask to use IGMPv2:

mcast server (em0) --- (em0) R2 (em1) ---- (em1) R3 (em2) ---- (em2) subscriber

[root@R2]~# pimd -v
pimd version 2.3.0

[root@R2]~# cat /usr/local/etc/pimd.conf
cand_rp 10.0.23.2 time 10 priority 1

[root@R3]~# cat /usr/local/etc/pimd.conf
cand_bootstrap_router 10.0.23.3 priority 1

[root@R3]~# pimd -dd
debug level 0x7 (dvmrp_prunes,dvmrp_routes,dvmrp_neighbors)
22:48:45.635 pimd version 2.3.0 starting ...
22:48:45.635 Getting vifs from kernel
22:48:45.636 Installing em1 (10.0.23.3 on subnet 10.0.23/24) as vif #0 - rate 0
22:48:45.636 Installing em2 (10.0.34.3 on subnet 10.0.34/24) as vif #1 - rate 0
22:48:45.636 Getting vifs from /usr/local/etcpimd.conf
22:48:45.639 Interface em1 comes up; vif #0 now in service
22:48:45.643 Interface em2 comes up; vif #1 now in service
22:48:45.647 Interface register_vif0 comes up; vif #2 now in service

[root@R2]~# pimd -dd
debug level 0x7 (dvmrp_prunes,dvmrp_routes,dvmrp_neighbors)
22:48:47.392 pimd version 2.3.0 starting ...
22:48:47.392 Getting vifs from kernel
22:48:47.392 Installing em0 (10.0.12.2 on subnet 10.0.12/24) as vif #0 - rate 0
22:48:47.393 Installing em1 (10.0.23.2 on subnet 10.0.23/24) as vif #1 - rate 0
22:48:47.393 Getting vifs from /usr/local/etcpimd.conf
22:48:47.394 Interface em0 comes up; vif #0 now in service
22:48:47.398 Interface em1 comes up; vif #1 now in service
22:48:47.402 Interface register_vif0 comes up; vif #2 now in service

Then I ask the subscriber to subscribe to an mcast group:

[root@R4]~# mtest
multicast membership test program; enter ? for list of commands
j 239.1.1.1 em2 10.0.12.1
ok

R3 correctly notices this new membership:

22:50:14.242 Switch shortest path (SPT): src 10.0.12.1, group 239.1.1.1
22:50:15.643 Switch shortest path (SPT): src 10.0.12.1, group 239.1.1.1
22:50:20.843 Switch shortest path (SPT): src 10.0.12.1, group 239.1.1.1

But the multicast routing table of PIM router R3 is still empty:

[root@R3]~# pimd -r
Virtual Interface Table ======================================================
Vif  Local Address    Subnet              Thresh  Flags      Neighbors
---  ---------------  ------------------  ------  ---------  -----------------
  0  10.0.23.3        10.0.23/24               1  DR PIM     10.0.23.2
  1  10.0.34.3        10.0.34/24               1  DR NO-NBR
  2  10.0.23.3        register_vif0            1

 Vif  SSM Group        Sources

Multicast Routing Table ======================================================
--------------------------------- (*,*,G) ------------------------------------
Number of Groups: 0
Number of Cache MIRRORs: 0
------------------------------------------------------------------------------

=> Shouldn't it display this new mcast group?

troglobit commented 8 years ago

The group is only displayed in the PIM routing table when it's been "established" properly. Unlike DVMRP (mrouted), in PIM the multicast traffic isn't flooded. A few questions:

  1. Can the sender and the receiver ping each other, i.e. has the unicast routing table been set up properly? DVMRP (mrouted) has a sort of RIP built in, but PIM needs the unicast routing table to be established first.
  2. What does the mtest tool do? Does it set the TTL to 3 or higher?

One of my basic tests when verifying pimd is to use ping -t 10 225.1.2.3 out the proper interface from the sender, then use https://github.com/troglobit/toolbox/blob/master/mcjoin/mcjoin.c on the receiver to join the stream (often with a repeated join every 10 sec) in the background, and then tcpdump.
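
Spelled out, the test looks roughly like this (a sketch assuming Linux iputils ping, where -t sets the TTL and -I the egress interface; the mcjoin invocation is from memory, so check its usage text):

# sender: multicast ping out the proper interface, TTL high enough
ping -t 10 -I eth0 225.1.2.3

# receiver: join the group in the background, then watch for the stream
./mcjoin 225.1.2.3 &
tcpdump -n -i eth0 host 225.1.2.3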

troglobit commented 8 years ago

Hmm, it's late here ... with question (2.) I of course mean to say that I hope the sender sets a high enough TTL.

You can also listen in on the PIM traffic between R2 and R3 to check whether R3 sends a PIM Join for your group or not. It's not until R3 gets an "accept" that it sets the routing rule in the kernel. At least that's what my fuzzy head tells me right now. (Off to sleep)
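
Concretely, something like this on the R2-R3 link should reveal any Join/Prune traffic (em1 is the link interface in this setup; "pim" is a standard pcap filter keyword, so only PIM protocol packets are shown):

tcpdump -n -i em1 pim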

ocochard commented 8 years ago

Sender is 10.0.12.1 and receiver is 10.0.34.4, and there is IP connectivity between them:

[root@sender]~# ping 10.0.34.4
PING 10.0.34.4 (10.0.34.4): 56 data bytes
64 bytes from 10.0.34.4: icmp_seq=0 ttl=62 time=2.001 ms
64 bytes from 10.0.34.4: icmp_seq=1 ttl=62 time=1.409 ms
64 bytes from 10.0.34.4: icmp_seq=2 ttl=62 time=1.244 ms
^C
--- 10.0.34.4 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.244/1.551/2.001/0.325 ms

[root@R4]~# mtest
multicast membership test program; enter ? for list of commands
?
j mcast-addr ifname [src-addr] - join IP multicast group
l mcast-addr ifname [src-addr] - leave IP multicast group
i mcast-addr ifname n          - set n include mode src filter
e mcast-addr ifname n          - set n exclude mode src filter
t mcast-addr ifname src-addr   - allow traffic from src
b mcast-addr ifname src-addr   - block traffic from src
g mcast-addr ifname n          - get and show n src filters
a ifname mac-addr              - add link multicast filter
d ifname mac-addr              - delete link multicast filter
m ifname 1/0                   - set/clear ether allmulti flag
p ifname 1/0                   - set/clear ether promisc flag
f filename                     - read command(s) from file
s seconds                      - sleep for some time
q                              - quit

Then my command "j 239.1.1.1 em2 10.0.12.1" asks to join mcast group 239.1.1.1 on interface em2 (the interface toward R3) with a source of 10.0.12.1. This is confirmed by a tcpdump on R4:

00:21:50.142406 IP (tos 0xc0, ttl 1, id 82, offset 0, flags [DF], proto IGMP (2), length 44, options (RA))
    10.0.34.4 > 224.0.0.22: igmp v3 report, 1 group record(s) [gaddr 239.1.1.1 allow { 10.0.12.1 }]

But a tcpdump between R2 and R3 shows only PIM "Hello" messages and nothing else:

00:21:54.522475 IP 10.0.23.2 > 224.0.0.13: PIMv2, Hello, length 26
00:21:55.524647 IP 10.0.23.3 > 224.0.0.13: PIMv2, Hello, length 26
00:22:23.815724 IP 10.0.23.2 > 224.0.0.13: PIMv2, Hello, length 26
00:22:24.815778 IP 10.0.23.3 > 224.0.0.13: PIMv2, Hello, length 26
00:22:53.638322 IP 10.0.23.2 > 224.0.0.13: PIMv2, Hello, length 26
00:22:54.861833 IP 10.0.23.3 > 224.0.0.13: PIMv2, Hello, length 26

Now I'm generating a ping with a high enough TTL from my sender:

[root@R1]~# ping -T 64 239.1.1.1
PING 239.1.1.1 (239.1.1.1): 56 data bytes

The R2 PIM router didn't update its routing table:

[root@R2]~# pimd -r
Virtual Interface Table ======================================================
Vif  Local Address    Subnet              Thresh  Flags      Neighbors
---  ---------------  ------------------  ------  ---------  -----------------
  0  10.0.12.2        10.0.12/24               1  DR NO-NBR
  1  10.0.23.2        10.0.23/24               1  PIM        10.0.23.3
  2  10.0.12.2        register_vif0            1

 Vif  SSM Group        Sources

Multicast Routing Table ======================================================
--------------------------------- (*,*,G) ------------------------------------
Number of Groups: 0
Number of Cache MIRRORs: 0
------------------------------------------------------------------------------

troglobit commented 8 years ago

@ocochard Yeah, that really does sound like a bug ... curious, you don't even see PIM Assert messages between the routers?

What if you remove the cand_rp and cand_bootstrap_router settings and run more of a vanilla setup? Would be neat if we could isolate the problem.

ocochard commented 8 years ago

Same problem with an "empty" vanilla setup: I don't see any PIM assert messages between routers.

troglobit commented 8 years ago

Ouch! :unamused:

I'll see what I can do to set up a couple of virtual FreeBSD routers and start looking myself. Hopefully sometime this weekend ... terribly sorry @ocochard I should have tested this better before the release!

troglobit commented 8 years ago

OK, I think I've now reproduced the same problem that you have reported. Nothing obvious to report atm. though.

troglobit commented 8 years ago

Managed to get it working, not exactly sure why. I set up a network with two FreeBSD 10.2 (64-bit) routers and two Ubuntu clients (sender/receiver), all running RIP (routed on FreeBSD, with ripv2 enabled). Then:

  1. inspect routing tables (netstat -rn to see RIP works)
  2. start the clients first (sender with ping and my receiver with mcjoin and tcpdump)
  3. start R2, wait a minute
  4. start R3
  5. Success ...

                            pimd                 pimd
    +----------+          +--------+           +--------+           +------------+
    |          |          |        |           |        |           |            |
    |  Sender  +----------+   R2   +-----------+   R3   +-----------+  Receiver  |
    |          |          |        |           |        |           |            |
    +----------+          +--------+           +--------+           +------------+
    Quagga RIP             routed               routed               Quagga RIP
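
For completeness, the unicast routing pieces were roughly these (a sketch; the routed setup and the client interface name are assumptions, check routed(8) and your Quagga configuration):

# FreeBSD routers, /etc/rc.conf:
routed_enable="YES"

# Ubuntu clients, minimal Quagga ripd.conf:
router rip
 version 2
 network eth0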

However, if I start R3 first then it never seems to resolve the situation. R3 just sits there, receiving multicast but sending STOP messages to R2, despite correctly registering the IGMP join from the receiver :-/

troglobit commented 8 years ago

OK, now I can't even reproduce the problem anymore? I can start R2 or R3 first, with or without waiting in between ...

Possibly my initial problems were due to the virtual environment I'm running on. Despite running on a fairly modern Linux 4.2 (Ubuntu 15.10) host with the latest Qemu, I still have to disable IGMP snooping on the hosts' bridges. For details, see http://troglobit.com/multicast-howto.html under the heading "Roll your own cloud".
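
For reference, the knob in question is the per-bridge sysfs setting on the Linux host (br0 is a placeholder bridge name):

echo 0 > /sys/class/net/br0/bridge/multicast_snooping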

troglobit commented 8 years ago

I did an extremely simplistic writeup of what I did here http://troglobit.com/howto-run-pimd-on-freebsd.html. It's likely too high-level for you @ocochard, but I post it here anyway, maybe some of it can help ...

troglobit commented 8 years ago

Hmm, there do seem to be some remaining issues with ip_len and ntohs()/htons() conversion still lingering on FreeBSD … and some odd PIM_REGISTER formatting from the kernel?

I get a ton of "warning - Not a multicast addr...." in the logs, and when I go deeper and expand the logs a bit, I see the source IP of the sender instead of the group it sends to ... in the encapsulated IP dst field from the FreeBSD kernel? In fact, the 48-byte packet header I receive contains no multicast group at all?

troglobit commented 8 years ago

Unfortunately I haven't had time to look into this any further, but @idismmxiv just posted a fix for issue #63 in pull request #64, which may also very well be related to your problems @ocochard ...

ocochard commented 8 years ago

Just compiled pimd 2.3.1 with the pull request #64 patches applied, but it didn't fix the regression.

troglobit commented 8 years ago

Thanks for checking @ocochard, I'm starting up my old testbed now. Need to get to the bottom of this!

troglobit commented 8 years ago

@ocochard OK, this took the better part of the whole day, but I may have found something now. If I take pimd v2.3.1 and change https://github.com/troglobit/pimd/blob/master/pim_proto.c#L952

#ifdef HAVE_IP_HDRINCL_BSD_ORDER

to

#if 0 //#ifdef HAVE_IP_HDRINCL_BSD_ORDER

I can get a three-in-a-row FreeBSD 10.2 (64-bit) setup to actually route multicast. It's definitely something with how the FreeBSD kernel does ntohs() for some sockets, but not all. I don't know yet whether this is only FreeBSD or it applies to all BSDs. OK, maybe not OpenBSD, they seem to have used pure raw sockets like Linux for quite some time now, but maybe it's a NetBSD issue as well.

Update: Actually, this seems to have been added when you and me were messing about with issue #23 ... most of the code added in 28bbf4e looks correct, but maybe I added one or two ip_len byte-swaps too many? :worried:

troglobit commented 8 years ago

@ocochard If you ever find the time again, could you possibly check master as of commit 4e706d4 w.r.t. this issue? Thank you so much!

ocochard commented 8 years ago

I've tried, but I meet the same problem, which is why I'm wondering if the problem comes from my setup.

Here is what I did:

Then, once compiled and my disk image built, I ran the routers in a VirtualBox lab (emulating an Intel NIC, because the FreeBSD VirtIO drivers didn't support multicast routing).

Same release as yours:

[root@R3]~# uname -mr
10.2-RELEASE-p8 amd64

(I've tried with 11-CURRENT too: same problem.)

I didn't forget to enable MROUTING in my kernel config file:

[root@R3]~# sysctl kern.conftxt | grep MRO
options MROUTING

Good version of pimd:

[root@R3]~# pimd --version
pimd version 2.3.2-beta1

The receiver (directly behind R3) can reach the sender (10.0.12.1, behind R2):

[root@receiver]~# ping 10.0.12.1
PING 10.0.12.1 (10.0.12.1): 56 data bytes
64 bytes from 10.0.12.1: icmp_seq=0 ttl=62 time=2.672 ms
64 bytes from 10.0.12.1: icmp_seq=1 ttl=62 time=1.637 ms
^C
--- 10.0.12.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.637/2.155/2.672/0.518 ms

I'm using iperf as the receiver mcast software:

[root@receiver]~# iperf -s -u -B 239.1.1.1 -i 1

Server listening on UDP port 5001
Binding to local address 239.1.1.1
Joining multicast group  239.1.1.1
Receiving 1470 byte datagrams
UDP buffer size: 41.1 KByte (default)
------------------------------------------------------------
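
For the record, a matching sender would be something like this (a sketch using iperf 2 options, where -T sets the multicast TTL; the duration and report interval are arbitrary picks):

[root@sender]~# iperf -c 239.1.1.1 -u -T 32 -t 60 -i 1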

But now, on R3 we correctly see the membership report, but it didn't update its mcast routing table:

15:44:42.210 accept_group_report(): igmp_src 10.0.34.4 ssm_src 0.0.0.0 group 239.1.1.1 report_type 34
15:44:42.210 Set delete timer for group: 239.1.1.1
15:44:42.210 create group entry, group 239.1.1.1
15:44:42.210 Received IGMP v3 Membership Report from 10.0.34.4 to 224.0.0.22
15:44:42.210 accept_membership_report(): IGMP v3 report, 16 bytes, from 10.0.34.4 to 224.0.0.22 with 1 group records.
15:44:42.210 accept_group_report(): igmp_src 10.0.34.4 ssm_src 0.0.0.0 group 239.1.1.1 report_type 34
15:44:42.210 Set delete timer for group: 239.1.1.1
15:44:42.210 create group entry, group 239.1.1.1

Virtual Interface Table ======================================================
Vif  Local Address    Subnet              Thresh  Flags      Neighbors
---  ---------------  ------------------  ------  ---------  -----------------
  0  10.0.23.3        10.0.23/24               1  DR PIM     10.0.23.2
  1  10.0.34.3        10.0.34/24               1  DR NO-NBR
  2  10.0.23.3        register_vif0            1

 Vif  SSM Group        Sources

Multicast Routing Table ======================================================
--------------------------------- (*,*,G) ------------------------------------
Number of Groups: 0
Number of Cache MIRRORs: 0
------------------------------------------------------------------------------

Candidate Rendezvous-Point Set ===============================================
RP address       Incoming  Group Prefix        Priority  Holdtime
---------------  --------  ------------------  --------  ---------------------
------------------------------------------------------------------------------
Current BSR address: 0.0.0.0

troglobit commented 8 years ago

@ocochard unfortunately I've never tried virtual networking on a FreeBSD box, and never used VirtualBox. But multicast and virtual environments have quite a shaky history ... I still have to do some tricks to get it to work on a Linux host.

I really don't know what more I can suggest you look at. Here are a few pointers that came to mind while reading your last report:

  1. Use git submodule update to get the libite GIT submodule updated, after the initial git submodule update --init
  2. When I built my own FreeBSD kernel I also had to add options PIM in addition to options MROUTING. Today I simply use the GENERIC kernel and load the ip_mroute module instead, much easier for my simple tests -- shouldn't matter, pimd would not start up properly without PIM extensions in the kernel (see the sketch after this list)
  3. R3 will not add a mcast routing entry upon receiving an IGMP join; it takes a bit more magic first. For example, a PIM Join will first be sent from R3 to the Rendez-vous Point.
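
To spell out point 2, the two alternatives look roughly like this (a sketch; keywords as I remember them for FreeBSD 10.x, double-check against your kernel sources):

# custom kernel config:
options MROUTING
options PIM

# or, with the GENERIC kernel, in /boot/loader.conf:
ip_mroute_load="yes"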

Have you tried running just a single pimd router?

ocochard commented 8 years ago

I've found the problem: now I manage to use pimd on FreeBSD 10.2 again.

My problem was this typo in my port Makefile:

MAKE_ARGS+=   prefix="${PREFIX}" sysconfdir="${PREFIX}/etc"

The correct one is:

MAKE_ARGS+=   prefix="${PREFIX}" sysconfdir="${PREFIX}/etc/"

(notice the / at the end of sysconfdir).
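
The trailing slash matters because the build apparently pastes the directory and the file name together without adding a separator, i.e. hypothetically:

# config path = $(sysconfdir)pimd.conf
#   sysconfdir=/usr/local/etc   ->  /usr/local/etcpimd.conf   (wrong path)
#   sysconfdir=/usr/local/etc/  ->  /usr/local/etc/pimd.conf  (correct)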

I've seen the problem by noticing these lines in the debug output of pimd:

16:30:31.106 Getting vifs from /usr/local/etcpimd.conf
...
Current BSR address: 0.0.0.0

With an "empty" on unexisting pimd.conf, the BSR address was always 0.0.0.0 and it never add a rendez-vous point, then never update its mrouting table.

After fixing this typo, it correctly reads the configuration file now. My receiver is happy (VirtualBox VM, promiscuous mode enabled on the virtual Intel interface). I still have a small problem: receivers behind the second PIM router can take a long time before receiving mcast traffic, while receivers directly behind the first PIM router receive mcast traffic as soon as they subscribe.

troglobit commented 8 years ago

@ocochard Oh great news!! :smiley: :+1: :tada:

Nasty little thing with the missing slashes there, shouldn't happen in a project like this. I'll look into a check and fix. Also, the resulting issue of no BSR address is no good ... thank you for reporting this! I'll file separate issues for them later tonight.

I think the remaining problem you mention is issue #58, which I haven't had the time to dig into yet myself, but others have. You could attempt to verify the last comment by @am88b, who managed to get multicast forwarding within 0-4 sec!

ocochard commented 8 years ago

About the last problem: I get the delay only with my candidate BSR / candidate RP setup. I don't see the problem with a static RP configured like in issue #58.
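
For anyone landing here, a static RP in pimd.conf is a one-liner of this form (a sketch; 10.0.23.2 is R2's address from this thread, check the bundled pimd.conf example for the exact syntax):

rp_address 10.0.23.2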

troglobit commented 8 years ago

@ocochard Aha, good to know when I eventually get to dig into it. Thank you!