Open t8y8 opened 1 year ago
Thanks for the detailed report. I tried to reproduce the issue but could not:
inet_aton(ip) instead of the hard-coded IP address, and the sender fixed to use socket.inet_aton("0.0.0.0") instead of the hard-coded IP address.
mpremote run receiver.py
while true; do python3 sender.py; done
mpremote run receiver.py

Thanks for the prompt reply!
I tweaked the receiver.py (I call it multi.py) to match, but I can't use 0.0.0.0 on Windows or it gets cranky.
I repeated the experiment using mpremote and recorded the results.
They are the same sequence of working > receive 5ish > no longer working.
https://user-images.githubusercontent.com/4370533/233274683-956f2405-c145-4475-a6b4-c3887c77eb3a.mp4
I am also attaching a packet capture, filtered to packets from the pi or from windows to the multicast group. I'm not an expert here but it seems to work fine. (Note the video and the packet capture are different attempts so times won't match) capture.zip
Given that I can receive direct packets to the pi on that port, it feels like the issue is somewhere in the multicast part of lwip or similar. On a power cycle, when everything works, it takes about 4 seconds to connect to wifi and then sends an IGMP group-join packet. On a restart of the script without a power cycle, it connects to wifi instantly and sends no group-join packet, despite in theory being a new socket with the setsockopt calls.
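For reference, the group join discussed above is normally requested through a setsockopt call on the receiving socket, which is what usually prompts the network stack to send the IGMP membership report. The following is a minimal sketch using the CPython-style socket API, not the receiver script from this issue. The function names are hypothetical, and the availability of socket.IPPROTO_IP and socket.IP_ADD_MEMBERSHIP on a given MicroPython build is an assumption.

```python
import socket


def make_mreq(group_ip, iface_ip="0.0.0.0"):
    """Pack the 8-byte membership request: group address, then interface address."""
    return socket.inet_aton(group_ip) + socket.inet_aton(iface_ip)


def open_multicast_receiver(group_ip, port, iface_ip="0.0.0.0"):
    """Open a UDP socket bound to `port` and join `group_ip` (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Allow the port to be rebound when the script is restarted.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("0.0.0.0", port))
    # Ask the stack to join the group on the chosen interface.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        make_mreq(group_ip, iface_ip),
    )
    return sock
```

Passing the board's own address as iface_ip instead of "0.0.0.0" corresponds to the inet_aton(ip) variant mentioned earlier in the thread.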
I also tried a few things to narrow down the environment:
sender.py from the raspi to rule out Windows acting up

Please let me know what other data I can provide. I can pull any files or logs if given instructions, since I can reproduce this at will.
One thing I didn't mention which may be important is a change to the connect() function in the receiver script. I have separate scripts that connect a board to the local WiFi, and I use one of them once at the start of my session, so the test scripts themselves don't need to connect (or know the ssid/password).
The connect function for the above receiver test looks like this in my case:
import network

def connect():
    # Look up the existing WLAN connection; connect() is deliberately not called here
    wlan = network.WLAN(network.STA_IF)
    ip = wlan.ifconfig()[0]
    print(f'Connected on {ip}')
    return ip
Note that the WiFi connection will be retained over a soft reset, so you can connect manually at the REPL (or with a separate mpremote run connect.py script, for example) and then run the test after connecting to the WiFi.
Can you please try that and see if the problem persists?
Woohoo! That appears to be the problem!
I created a connect.py file that connects to wifi and moved it out of the script I start and stop.
Running multi.py from Thonny repeatedly with CTRL-C or Stop and Start all worked fine, with all packets received.
I then minimally added bits of wifi initialization back in:
wlan.active(True) -> no problem
wlan.ifconfig() -> no problem
wlan.connect() -> issue reproduces

So something happens when calling connect if there's already an active wifi connection.
As a workaround I can split connect out, but I do think the behavior is unexpected and is either a subtle bug or deserves supporting documentation. I can see lots of folks just connecting to wifi at the beginning of a main.py and relying on that after resets or coming out of sleep or something.
Thanks for testing and confirming where the issue lies.
This may take some time to fix properly, so for now please use the workaround of only connecting once.
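One way to apply the "connect once" workaround inside a script that gets restarted is to guard the connect() call with isconnected(). This is only a sketch: connect_once and its parameters are hypothetical names, and the wlan argument is assumed to be a network.WLAN(network.STA_IF) object.

```python
import time


def connect_once(wlan, ssid, password, timeout_s=10):
    """Return the station IP, calling wlan.connect() only when not already connected."""
    if not wlan.isconnected():
        wlan.active(True)
        wlan.connect(ssid, password)
        deadline = time.time() + timeout_s
        # Poll until the link is up or the timeout expires.
        while not wlan.isconnected():
            if time.time() > deadline:
                raise OSError("WiFi connection timed out")
            time.sleep(0.1)
    return wlan.ifconfig()[0]
```

On the board this would be called as connect_once(network.WLAN(network.STA_IF), ssid, password). After a soft reset with the connection retained, it returns the IP without touching connect().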
@t8y8 This may be relevant. The module mqtt-as is designed to recover from outages to any of WiFi, broker, and internet connectivity. We found that the most reliable way to resume after any outage was explicitly to disconnect from WiFi and then reconnect.
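The explicit disconnect-then-reconnect pattern described in this comment could look roughly like the sketch below. It is not code from mqtt-as, the helper name and parameters are hypothetical, and whether it avoids the multicast problem in this issue is untested here. The wlan argument is assumed to be a network.WLAN(network.STA_IF) object.

```python
import time


def reconnect(wlan, ssid, password, timeout_s=10):
    """Drop any existing association, reconnect, and report whether the link came up."""
    wlan.disconnect()
    wlan.connect(ssid, password)
    deadline = time.time() + timeout_s
    # Wait for the link, giving up once the timeout has passed.
    while not wlan.isconnected() and time.time() <= deadline:
        time.sleep(0.1)
    return wlan.isconnected()
```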
Issue Description
When testing sending multicast packets back and forth between my Pi Pico W and my Windows 10 machine, the Pico will initially work, receiving packets sent to the same multicast group until you CTRL-C, hit stop, or leave it sitting a while. Eventually you will get into a state where it doesn't receive the packets anymore, though it is still listening and presumably still a member of the multicast group.
After this point, the only way to get things working again is to machine.reset() or physically power cycle the device. Once you do, things will work again. I've confirmed the packets are all sending via wireshark even after the Pico stops receiving things.

I found https://github.com/micropython/micropython/issues/10812 which looks similar, but is resolved by using SO_REUSEADDR, which I am doing, but still have the issue.
The most reliable way to reproduce:
sock.close() in the repl and make a new socket

Pico W Code (Receiver)
Windows Code (Sender)
Hardware