arduino / ArduinoCore-mbed

UDP Multicast crashes ethernet interface when cable is disconnected and reconnected #904

Closed Channel59 closed 4 months ago

Channel59 commented 4 months ago

Dear Community,

While testing the Opta's resilience to the network interface of a switch, host PC, or router going down and coming back up, something the Opta can encounter in a production environment, an issue that seems to be related to UDP multicast came to light. It showed up in a scenario where an mDNS responder was running on the Opta.

When running a simple program like this:

#include <Arduino.h>
#include "EthernetUdp.h"
#include "EthernetInterface.h"
#include "mbed.h"

auto net = EthernetInterface::get_default_instance();
EthernetUDP udp;
IPAddress multicastIp("239.255.0.1");

void setup(){
    SocketAddress ip("192.168.0.100");
    SocketAddress netmask("255.255.255.0");
    SocketAddress gateway("192.168.0.1");

    net->set_blocking(true);
    net->set_network(ip, netmask, gateway);
    net->set_dhcp(false);
    net->connect();
}

void loop() {
    if (net->get_connection_status() == NSAPI_STATUS_GLOBAL_UP) {
        // Send one multicast datagram
        Serial.println("Sending message");
        char packetBuffer[] = "Hello, Server!";
        udp.beginMulticast(multicastIp, 1567);
        udp.beginPacket(multicastIp, 1567);
        udp.write(packetBuffer);
        udp.endPacket();

        rtos::ThisThread::sleep_for(1000);

        // Check whether anything was received in the meantime
        int packetSize = udp.parsePacket();
        if (packetSize) {
            // Read the packet into the buffer
            char replyBuffer[256];
            int len = udp.read(replyBuffer, 255);
        }
        // udp.stop();
    }
    rtos::ThisThread::sleep_for(1000);
}

And bringing the Ethernet interface of the switch/router/host PC down and up with a bash script like this:

#!/bin/bash

opta=192.168.0.100
eth=eth0

for i in {1..2000}
do
        for j in {1..5}
        do
                t_down=$((500 + $RANDOM % 5000))
                t_up=$((1000 + $RANDOM % 2000))
                echo -ne "_"
                ifconfig $eth down
                ./usleep $(($t_down * 1000))
                echo -ne "\b-"
                ifconfig $eth up
                ./usleep $(($t_up * 1000))
                echo -ne "\b."
        done

        if (( $t_up<3000 )); then
                t_wait=$((3000 - t_up))
                ./usleep $(($t_wait * 1000))
        fi
        sleep 2
        if ping -c 1 $opta &> /dev/null
        then
                echo -n "|"
        else
                echo "X"
                exit 1
        fi
done

(Use e.g. this for usleep: https://github.com/pklaus/usleep-binary)
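
If installing the usleep binary is not convenient, the millisecond delays in the script could also be handled with the regular sleep command. This is only a sketch: ms_to_seconds and msleep are hypothetical helper names that do not appear in the original script, and it assumes a sleep implementation that accepts fractional seconds (GNU coreutils sleep does).

```shell
# Hypothetical helper (not part of the original script): convert a
# millisecond count into a "seconds.milliseconds" string, e.g. 1500 -> 1.500
ms_to_seconds() {
    printf '%d.%03d' $(( $1 / 1000 )) $(( $1 % 1000 ))
}

# Hypothetical replacement for "./usleep $((ms * 1000))".
# Assumption: the installed sleep accepts fractional seconds.
msleep() {
    sleep "$(ms_to_seconds "$1")"
}
```

With these helpers, a line such as ./usleep $(($t_down * 1000)) would become msleep "$t_down".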

The Opta can end up in a state where it reports being connected (NSAPI_STATUS_GLOBAL_UP), but it does not respond to pings and packets do not get sent out.

Once it is in this state, bringing the Ethernet interface of the switch/router/host PC down and up again often resolves the issue.

However, this is not always desirable or possible, leaving the Opta in an unresponsive state.

A different way to reproduce this problem is to apply this patch: https://github.com/arduino/ArduinoCore-mbed/pull/902, enable DHCP and just wait (like the snippet here: https://github.com/arduino/ArduinoCore-mbed/issues/891#issuecomment-2181547003), and run the shell script (or simply unplug the Ethernet cable and plug it back in after 3 seconds). I guess this reproduces the problem because DHCP also uses UDP multicast under the hood.

maidnl commented 4 months ago

This issue is now closed thanks to commit 97304bffff1852de6ebf10ccbb941f431b70b16