troglobit / pimd

PIM-SM/SSM multicast routing for UNIX and Linux
http://troglobit.com/projects/pimd/
BSD 3-Clause "New" or "Revised" License

IGMP "Other querier present timer" not implemented #35

Closed pchri03 closed 9 years ago

pchri03 commented 9 years ago

It seems that the "other querier present timer" is not implemented in the IGMP code (see RFC 2236, sections 7 and 8.5). As a consequence, pimd does not resume as a querier if another querier with a lower IP address has been present in the past. With IGMP snooping switches that have no built-in querier, this may break multicast altogether on the network.

Essentially, a non-querier IGMP router should listen for general queries from other queriers, and restart the other querier present timer when it hears one. If the timer times out (other-querier-present-interval = robustness * query-interval + query-response-interval / 2 = 2 * 125 + 10 / 2 = 255 seconds), the IGMP router should switch to querier state and immediately transmit a general query. If it fails to do so, the multicast memberships expire 5 seconds later (group-membership-interval = robustness * query-interval + query-response-interval = 2 * 125 + 10 = 260 seconds).
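The arithmetic above can be sketched in C as follows. This is only an illustration using the RFC 2236 default values quoted in this comment; the macro and function names are made up for the example and are not taken from the pimd source.

```c
/* Default values from RFC 2236, all in seconds (robustness is a count).
 * The names below are hypothetical, chosen for this sketch only. */
#define ROBUSTNESS_VARIABLE      2
#define QUERY_INTERVAL           125
#define QUERY_RESPONSE_INTERVAL  10

/* How long a non-querier waits without hearing a query before it
 * decides that the other querier is gone. */
static int other_querier_present_interval(int robustness,
                                          int query_interval,
                                          int query_response_interval)
{
    return robustness * query_interval + query_response_interval / 2;
}

/* How long a group membership survives without being refreshed by a report. */
static int group_membership_interval(int robustness,
                                     int query_interval,
                                     int query_response_interval)
{
    return robustness * query_interval + query_response_interval;
}

/* Margin between taking over as querier and memberships expiring. */
static int takeover_margin(void)
{
    return group_membership_interval(ROBUSTNESS_VARIABLE, QUERY_INTERVAL,
                                     QUERY_RESPONSE_INTERVAL)
         - other_querier_present_interval(ROBUSTNESS_VARIABLE, QUERY_INTERVAL,
                                          QUERY_RESPONSE_INTERVAL);
}
```

With the defaults, the margin between the two intervals is the 5 seconds mentioned above, which is why the takeover query has to go out immediately.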

troglobit commented 9 years ago

True, that's one of several features missing. Do you have a patch? :)

pchri03 commented 9 years ago

Not at the moment, no. But now that I've finally found the root cause of what caused multicast to fail in our data centers, I can probably prioritize making a patch soon.

troglobit commented 9 years ago

I just noticed that my other project, mrouted, which is the base used by the original pimd authors, probably has what we want:

if (v->uv_querier &&
    (v->uv_querier->al_timer += TIMER_INTERVAL) >
    IGMP_OTHER_QUERIER_PRESENT_INTERVAL) {
    /*
     * The current querier has timed out.  We must become the
     * querier.
     */
    IF_DEBUG(DEBUG_IGMP) {
        logit(LOG_DEBUG, 0, "Querier %s timed out",
              inet_fmt(v->uv_querier->al_addr, s1, sizeof(s1)));
    }
    free(v->uv_querier);
    v->uv_querier = NULL;
    v->uv_flags |= VIFF_QUERIER;
    send_query(v);
}

So it should be fairly easy to port it to pimd. The layout of the code is a bit different: this piece of code was found in mrouted/vif.c, whereas pimd keeps the IGMP stuff in pimd/igmp_proto.c.
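For readers unfamiliar with the mechanism, here is a minimal, self-contained sketch of the two halves of the timer: resetting it when a query from a lower address is heard, and aging it on a periodic tick. It is not the pimd or mrouted implementation. The struct, the function names, and the 5-second tick are assumptions made for illustration, addresses are compared as host-byte-order integers, and the sketch takes over with `>=` so the switch happens at exactly 255 seconds.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIMER_INTERVAL                  5    /* assumed tick length, seconds */
#define OTHER_QUERIER_PRESENT_INTERVAL  255  /* 2 * 125 + 10 / 2, seconds    */

/* Hypothetical per-interface state, not the real pimd/mrouted struct. */
struct iface {
    uint32_t own_addr;      /* our address, host byte order            */
    uint32_t querier_addr;  /* address of the current other querier    */
    bool     is_querier;    /* true while we are the elected querier   */
    int      querier_age;   /* seconds since the other querier's query */
};

/* Call for every general query received on the interface. */
static void query_heard(struct iface *ifp, uint32_t src)
{
    if (src >= ifp->own_addr)
        return;                 /* only a lower address wins the election */

    ifp->is_querier   = false;  /* yield (or keep yielding) to that router */
    ifp->querier_addr = src;
    ifp->querier_age  = 0;      /* restart the other querier present timer */
}

/* Call once per TIMER_INTERVAL. Returns true when the other querier has
 * timed out; the caller should then send a general query right away. */
static bool timer_tick(struct iface *ifp)
{
    if (ifp->is_querier)
        return false;

    ifp->querier_age += TIMER_INTERVAL;
    if (ifp->querier_age >= OTHER_QUERIER_PRESENT_INTERVAL) {
        ifp->is_querier  = true;
        ifp->querier_age = 0;
        return true;
    }
    return false;
}
```

Returning a flag from `timer_tick()` instead of sending the query directly keeps the sketch free of any socket code; a real daemon would send the general query at that point, as the mrouted snippet above does with `send_query(v)`.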

troglobit commented 9 years ago

Huh, that was almost too easy.

Just ported the above code in cc03559 on branch igmp-querier-timeout, and a very basic test seems to indicate that it works! If possible, could you test this, @pchri03, before I merge to master for release?

I'd of course like to have this, and #31, configurable via pimd.conf. If I get another couple of minutes to spare, I might whip up some support for that too before I punch out a release.

troglobit commented 9 years ago

Got a few hours off tonight, so now I've also implemented rudimentary configuration of IGMP query interval and querier timeout (global settings only). See #31 for details, on the same branch as the timeout fix.

pchri03 commented 9 years ago

Nice! I tested the code in a virtual environment and it worked as expected.

I'll probably attempt to deploy a build with the changes to our multicast routers tomorrow night.

troglobit commented 9 years ago

Thanks for verifying the fix, much appreciated! :)

Reopening the issue, however. I like to keep issues open until the fix is in the master (release) branch.

troglobit commented 9 years ago

Merged to master, closing.