peterhinch / micropython-mqtt

A 'resilient' asynchronous MQTT driver. Recovers from WiFi and broker outages.
MIT License

First time power up connection failed due to weak wifi #134

Closed. SimBo55699 closed this issue 4 months ago

SimBo55699 commented 4 months ago

Really great work on this project. Everything works fine except for one case: the first power-up with a weak WiFi signal. When I put my PicoW close to the router, it connected 100% of the time when I powered up the device (plugged in USB). However, when the PicoW was far from the router (weak WiFi signal) and powered up, it connected only about 50% of the time. The problem is that it just quit without retrying and errored out with:

Connection failed.

I am using the sample code "range.py" for the test, and I can reproduce this again and again.

At line 607 of "mqtt_as.py", I added a print of the status:

                elif RP2:  # 1 is STAT_CONNECTING. 2 reported by user (No IP?)
                    print(f"status={s.status()}")
                    if not 1 <= s.status() <= 2:
                        break

status=1
status=1
status=1
status=-1
Connection failed.

status=-2
Connection failed.
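For readers following the numbers, here is a minimal sketch that turns the status values into readable labels. The mapping is an assumption based on the CYW43 link-state constants used by the Pico W firmware; it is not taken from mqtt_as, and the names `STATUS_NAMES`, `describe_status` and `is_still_connecting` are hypothetical helpers.

```python
# Assumed meanings of WLAN.status() on the Pico W (CYW43 link states).
# Check these against your firmware; they are not taken from mqtt_as.
STATUS_NAMES = {
    -3: "wrong password",
    -2: "no access point found",
    -1: "connection failed",
    0: "link down / idle",
    1: "connecting",
    2: "joined, no IP address yet",
    3: "connected, got IP address",
}


def describe_status(code):
    """Return a readable label for a WLAN.status() value."""
    return STATUS_NAMES.get(code, "unknown status {}".format(code))


def is_still_connecting(code):
    """Same test as the mqtt_as check quoted above: 1 and 2 mean keep waiting."""
    return 1 <= code <= 2
```

Under this reading, the first run above gave up when the link reported a failed connection (-1), and the second when no access point was found (-2).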

I kept unplugging and replugging it, and eventually it worked:

Checking WiFi integrity.
Got reliable connection
Connecting to broker.
Connected to broker.

In real life there is a use case for this. For example, after a power outage (say neither the router nor the PicoW has a UPS, an uninterruptible power supply), power is restored after some time. The PicoW starts, but the router is not yet broadcasting the SSID or is not working yet. This causes the PicoW to quit.

I read the comment in the MQTT_base class: "Handles MQTT protocol on the basis of a good connection, Exceptions from connectivity failures are handled by MQTTClient subclass." Unfortunately I may not always have a good connection to start with, so honestly I don't know what to do. I have a crude workaround that waits for the WiFi connection for up to 20 seconds; I added this piece of code before calling the "mqtt_as" sample code. It seems to work, but it doesn't look like a pretty solution. Any help or comment is appreciated!

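import network
import time

wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.config(pm=0xa11140)  # Disable WiFi power-save mode on the PicoW
wlan.connect("ssid", "password")

max_wait = 20  # Poll once per second for up to 20 seconds
while max_wait > 0:
    # Stop waiting on a failure (negative status) or once the link is up (3 or more)
    if wlan.status() < 0 or wlan.status() >= 3:
        break
    max_wait -= 1
    time.sleep(1)

if wlan.isconnected():
    pass  # "mqtt_as" sample code goes in here

peterhinch commented 4 months ago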

This behaviour is by design. Failures on power up typically have a cause which requires manual intervention: wrong broker address, wrong WiFi credentials etc. Consequently an exception is thrown. On reconnection it can be assumed that these values are correct. It can be assumed that repeated attempts will eventually succeed.

I therefore don't plan to change this behaviour.

If you want to experiment for your own use case, study this code, in particular ._has_connected. This bound variable causes the initial connect to behave differently from subsequent connects. It's not as simple as forcing this to True, as there are issues such as DNS lookup and clean session behaviour to consider. You don't want to call getaddrinfo() more than once, as it's a blocking call.

Good luck :)

beetlegigg commented 4 months ago

@SimBo55699 I periodically have a bout of power cuts, often lasting only seconds. They usually have something to do with wet tree branches brushing the overhead power line; the branches get trimmed, but eventually the darn twigs grow again. On power being restored, the rpi picoW of course fires up well before the router is back up, so I found out early on that mqtt_as did not try to connect to WiFi for long enough in this situation.

I simply establish a WiFi connection first with a separate function before calling the mqtt_as routines. The WiFi connection loop tries to connect 10 times and, if it has no success, soft reboots and tries again. If there is a WiFi signal to be found (eventually), this can be relied upon to make a connection. mqtt_as does not mind a connection having been established beforehand.
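A rough sketch of that approach, for illustration only. This is not beetlegigg's actual code: the function name `connect_wifi`, the `wait_s` pause and the injectable `sleep` parameter are assumptions made so the loop can be read and tested on its own.

```python
import time


def connect_wifi(wlan, ssid, password, attempts=10, wait_s=3, sleep=time.sleep):
    """Try to join the network up to `attempts` times.

    Returns True as soon as the interface reports a connection,
    or False once every attempt has been used up.
    """
    wlan.active(True)
    for attempt in range(1, attempts + 1):
        wlan.connect(ssid, password)
        sleep(wait_s)  # Give the link time to come up before checking
        if wlan.isconnected():
            return True
        print("WiFi attempt {} of {} failed".format(attempt, attempts))
    return False


# On the device (MicroPython only), usage might look like:
#   import network, machine
#   wlan = network.WLAN(network.STA_IF)
#   if not connect_wifi(wlan, "ssid", "password"):
#       machine.soft_reset()  # Soft reboot and start over
#   # ...then start the mqtt_as client as usual
```

Keeping the reboot outside the function means the caller decides what happens when all attempts are used up.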

I see you remark that it doesn't seem a pretty solution, but I think 'the eye of the beholder' comes into this, as I've always thought it a rather attractive resolution :-). But if you are going for something more sophisticated by modifying mqtt_as then good luck, you're a braver man than me.

SimBo55699 commented 4 months ago

Thanks @peterhinch for your input. I think I am going to stick with my workaround for now, as @beetlegigg has a similar solution which has proven to work! Feel free to close this issue.

peterhinch commented 4 months ago

At risk of stating the obvious, the application has the option of trapping the exception and running its own recovery method.
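As a minimal sketch of what trapping the exception could look like: the helper `connect_with_retry` and its `retries` and `delay_s` parameters are hypothetical, and the sketch assumes the initial connect failure surfaces as an OSError from `await client.connect()`. Note that each retry repeats the whole initial connect, which may include the blocking DNS lookup mentioned earlier in this thread.

```python
import asyncio  # On older MicroPython firmware this module is named uasyncio


async def connect_with_retry(client, retries=5, delay_s=10):
    """Call client.connect() until it succeeds or the retries run out.

    Returns True on success, False if every attempt raised OSError.
    """
    for attempt in range(1, retries + 1):
        try:
            await client.connect()
            return True
        except OSError as exc:
            print("Connect attempt {} failed: {}".format(attempt, exc))
            if attempt < retries:
                await asyncio.sleep(delay_s)  # Pause before the next attempt
    return False
```

An application would await this in place of a bare `await client.connect()` and choose its own recovery, such as a soft reboot, when it returns False.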

@beetlegigg Establishing WiFi first is an excellent solution.