nodemcu / nodemcu-firmware

Lua based interactive firmware for ESP8266, ESP8285 and ESP32
https://nodemcu.readthedocs.io
MIT License
7.64k stars 3.12k forks source link

Fix IGMP timer #3476

Closed stromnet closed 2 years ago

stromnet commented 2 years ago

Fixes #\<GitHub-issue-number>.

\<Description of and rationale behind this PR> When sending traffic from a non-local multicast sender, via an multicast router, it was only working sporadically. For some reason the router was dropping the subscriptions every now and then, and thus stopped forwarding the multicast traffic. I suspect this could affect L2-devices doing IGMP-snooping too.

The IGMP Queries had a max-response-timeout of 100, but in packet logs this was clearly not respected by the device. Sometimes minutes passed between reports. After a lot of digging and printfs (oh btw, os_printf does not work. had to hack around to redefine it to dbg_printf), I found root cause: LWIP_RAND() aka r_rand() is defined as int (well, in app/include/lwip/igmp.h it's actually not defined at all, in sdk/esp_iot_sdk_v3.0-e4434aa/third_party/include/lwip/igmp.h it IS defined. Not sure which one is used?).

Anyway, signed means negative. and % 100 of negative value, casted to u16_t is potentially > 100. Experiment here: https://godbolt.org/z/vzWGEqzzx

This fix ensures that we actually cap timer to max_time and thus reply within the routers expected interval.

Another fix is to ensure we define LWIP_RAND() as (u32), like defined here: https://www.nongnu.org/lwip/2_0_x/group__compiler__abstraction.html#ga77370c377781ee7489e30eaf772ea05a but.. again, not sure about all lwip files, what goes where and so on..

Upstream has a lot of other fixes, but not sure lwip and nodemcu lifecycles integrate.. https://git.savannah.nongnu.org/cgit/lwip.git/tree/src/core/ipv4/igmp.c#n692

marcelstoer commented 2 years ago

Thanks for the fix.

Upstream has a lot of other fixes, but not sure lwip and nodemcu lifecycles integrate..

Pops up every now and then, e.g. here #3040.