martin-ger / esp_mqtt

MQTT Broker/Bridge on the ESP8266
MIT License

My MQTT Broker disconnects often. #33

Open jnherm opened 6 years ago

jnherm commented 6 years ago

Has anybody else experienced the broker disconnecting all of its clients? From time to time, all my clients get disconnected at once.

martin-ger commented 6 years ago

Are your clients connected via the AP interface or via the STA interface? The broker has to disconnect all clients when the uplink connection of the STA is (temporarily) lost.

jnherm commented 6 years ago

Hello Martin! My clients and broker are both connected to my home router AP.

martin-ger commented 6 years ago

Then I am pretty sure that you have a temporary disconnect of the ESP from the router AP - you should be able to see that on the serial console.

To verify that, you might also run a script on the broker that does some action "on wifidisconnect", e.g. increments a counter variable. You can then log into the broker and look at "show vars" to see the value of the counter.

How to fix that?

jnherm commented 6 years ago

Ok Martin, Thank you for the advice. I will look into it. Update you when something comes up.

jnherm commented 6 years ago

Off topic, Martin: I am using netcat for Windows to upload scripts to the ESP. Why does the upload take about 5 minutes, regardless of the size of my script?

jnherm commented 6 years ago

Can you help me understand this log:

Waiting for script upload on port 2000
CMD>Fatal exception 0(IllegalInstructionCause):
epc1=0x40217b7b, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x0 þ000000
ets Jan 8 2013,rst cause:1, boot mode:(3,6)

load 0x40100000, len 31244, room 16
tail 12
chksum 0x7d
ho 0 tail 12 room 4
load 0x3ffe8000, len 2124, room 12
tail 0
chksum 0xf5
load 0x3ffe8850, len 11972, room 8
tail 12
chksum 0x6a
csum 0x6a
çI8‚Œò±¾C¡C¡U-T-$ª«W–®©*$(®k«–$•& Ò.W,.]Z·VH¨HhÔËËVë’Ö«—‹R,‹ë+VVH¨” •ªJR- Ò×,®Z\º
Error ('init', 'mqtStarting Console TCP Server on port 7777
Max number of TCP clients: 15
mode : sta(ec:fa:bc:07:50:8e)
add if0
mode: 0 -> 3
state: 2 ->@3 (0)
state: 3 -> 5 (10)
add 0
aid 2
cnt

connected with Balay24, channel 1
dhcp client start...
connect to ssid Balay24, channel 1
ip:192.168.10.101,mask:255.255.255.0,gw:192.168.10.251
ip:192.168.10.101,mask:255.255.255.0,gw:192.168.10.251,dns:192.168.10.251
pm open,type:2 0
Got NTP server: 129.250.35.251
NTP synced

jnherm commented 6 years ago

Martin, what is wrong with this snippet?


% When wifi disconnects
on wifidisconnect
do
    println "Wifi Disconnected"
    setvar $wifiDisc = @3 + 1
    setvar @3 = $wifiDisc
    println "Wifi Disconnected : " | $wifiDisc | " times"

% When wifi connected
on wificonnect
do
    println "Wifi Connected"
    setvar $wifiCon = @4 + 1
    setvar @4 = $wifiCon
    println "Wifi Connected : " | $wifiCon | " times"


I got error:

Script upload completed (1387 Bytes)
Error ('init', 'mqttconnect', 'topic', 'gpio_interrupt', 'serial', 'alarm', 'http_response', or 'timer' expected) at >>do =

martin-ger commented 6 years ago

So you have the latest version. Added wifidisconnect recently.

(Reply sent by email on 1 March 2018 at 6:23:07 PM, quoting jnherm's previous comment in full.)

martin-ger commented 6 years ago

Just tested it: the script is fine with the latest build.

No, it is not normal that it takes 5 min; it should work immediately. However, I have never tested it with Windows. Which netcat do you use? Could there be some sort of connection problem?

Your trace is obviously a reboot. Did this happen while loading a script? If so, are you able to reproduce it with the same script? If so, I would be interested in this script!
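For readers hitting the same slow netcat upload, here is a minimal Python sketch of a netcat replacement. It is not part of the project. Port 2000 and the address 192.168.10.101 are taken from the log above; the function name `upload_script`, the file name `script.txt`, and the idea that the broker finishes the upload once the sender closes its side of the connection are assumptions.

```python
import socket
from pathlib import Path


def upload_script(host: str, script_path: str, port: int = 2000, timeout: float = 10.0) -> int:
    """Send a script file to host:port over plain TCP and return the byte count."""
    data = Path(script_path).read_bytes()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(data)
        # Close the sending direction explicitly so the receiver sees end-of-data.
        # Assumption: the receiver treats EOF as "script complete".
        conn.shutdown(socket.SHUT_WR)
    return len(data)


# Usage (hypothetical file name; the broker must already be waiting for an upload):
#   upload_script("192.168.10.101", "script.txt")
```

If an upload only completes after several minutes, one thing worth checking is whether the sending tool actually closes the connection when it reaches the end of the file; the explicit `shutdown` above removes that uncertainty.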

jnherm commented 6 years ago

Hello Martin, thanks for your help. I just upgraded my firmware and it works. I can now monitor my wifi connection. One thing I observe, though, is that "on wifidisconnect" executes every time the ESP tries to connect to my router. That is, if my router is off, the counter keeps increasing even though the ESP was already disconnected.

Regarding my netcat version for Windows: I just downloaded it from GitHub; it is the one by Diego Casorran.

This is my Final Script that I used:

% Config params, overwrite any previous settings from the commandline
config ap_ssid MQTTBROKER2
config ap_password
config ntp_server 1.de.pool.ntp.org
config broker_user
config broker_password
config speed 80

% Now the initialization, this is done once after booting
on init
do
    % @<num> vars are stored in flash and are persistent even after reboot
    setvar $run = @2 + 1
    setvar @2 = $run
    println "This is reboot no "|$run
    setvar $relay_status = 0
    gpio_out 12 $relay_status
    setvar $command_topicX = "cmnd/sonoff/03/POWER1"
    setvar $command_topicY = "cmnd/sonoff/03/POWER2"

% The local pushbutton
on gpio_interrupt 0 pullup
do
    println "New state GPIO 0: " | $this_gpio
    if $this_gpio = 0 then
        gpio_out 13 not ($relay_status)
        publish local $command_topicX $relay_status retained
        publish local $command_topicY $relay_status retained
        if $relay_status = 0 then
            setvar $relay_status = 1
        else
            setvar $relay_status = 0
        endif
    endif

% When wifi disconnects
on wifidisconnect
do
    println "Wifi Disconnected on " | $timestamp
    setvar $wifiDisc = @3 + 1
    setvar @3 = $wifiDisc
    println "Wifi Disconnected : " | $wifiDisc | " times"

% When wifi connected
on wificonnect
do
    println "Wifi Connected on " | $timestamp
    setvar $wifiCon = @4 + 1
    setvar @4 = $wifiCon
    println "Wifi Connected : " | $wifiCon | " times"

jnherm commented 6 years ago

Can you suggest a netcat version for Windows? Or any method that would make it easier for me to upload scripts?

martin-ger commented 6 years ago

You could try this local web server ( http://fenixwebserver.com/ ) on your windows machine and use the "pull"-mode.
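Any small web server should do for the "pull" mode. As a rough sketch (not something tested against the broker), Python's standard library can serve a folder as follows. The function name `serve_directory`, the port 8080, and the folder path are assumptions chosen for illustration.

```python
import functools
import http.server
import threading


def serve_directory(directory: str, bind: str = "0.0.0.0", port: int = 8080):
    """Serve the files in `directory` over HTTP from a background thread."""
    handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    server = http.server.ThreadingHTTPServer((bind, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server  # call server.shutdown() to stop it


# Usage (hypothetical folder; keep the process alive while the ESP pulls):
#   server = serve_directory("C:/scripts")
#   input("Serving, press Enter to stop...")
#   server.shutdown()
```

Binding to `0.0.0.0` makes the server reachable from other devices on the network, which the ESP needs; a server bound only to the loopback address can be reached from the PC itself but not from the ESP.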

jnherm commented 6 years ago

Hello Martin! It doesn't seem to work on my end. I tried the fenix webserver, but when I "pull" the script from it to my sonoff device (issuing the command via serial), this is what happens:

HTTP request to http://127.0.0.1:81/sonoffScript2.txt started

Then nothing happens (I waited for 5 to 10 minutes), but when I press Enter, this is the message:

HTTP script upload failed (error code -1).

I also tried a "pull" request via the internet. I uploaded my script to my Google Drive and got the link. But again, this is what shows:

client handshake start. client handshake ok!

After that, I waited but nothing happened. Then, when I pressed the Enter key, this was the message:

HTTP script upload failed (error code 302)

jnherm commented 6 years ago

Hello Martin, I downloaded an Android web server app and it works great. Right after the request, my script was downloaded to the ESP device. I also tried this web server that works on Windows (https://sourceforge.net/projects/miniweb/files/), and the result was great too.

martin-ger commented 6 years ago

Good to read!

BTW: I guess 127.0.0.1 was the wrong address for fenix. This is "localhost"; you will need the actual IP of your PC in the local network.
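One way to look up that address is sketched below in Python. It is a best-effort guess, not a guaranteed answer: the function name `lan_ip` is made up for this example, and the result depends on which network interface the operating system picks for outgoing traffic.

```python
import socket


def lan_ip() -> str:
    """Best-effort guess of this machine's address on the local network."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packet; it only selects the
        # outgoing interface. 192.0.2.1 is a reserved documentation address.
        probe.connect(("192.0.2.1", 9))
        return probe.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable route; fall back to loopback
    finally:
        probe.close()


print(lan_ip())
```

The printed address, rather than 127.0.0.1, is what would go into the URL that the ESP pulls from, since 127.0.0.1 on the ESP refers to the ESP itself.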

jnherm commented 6 years ago

127.0.0.1 was the host IP given by fenix... I also tried localhost, but to no avail. When I opened 127.0.0.1 in my browser, the text was displayed.

martin-ger commented 6 years ago

I am currently running a setup where the MQTT connection to the uMQTTBroker is also interrupted and then immediately reestablished by the clients. At least the Wireshark trace tells me that the CLIENTs (using tuanpmt's original lib) actively disconnect, not the broker. Up to now I don't know why...

jnherm commented 6 years ago

I also observe a number of disconnections in my setup, but I noticed that even when the ESP MQTT broker did not disconnect from my router, the clients tried to reconnect, at least according to the clients' logs. By the way, I am using ITEAD Sonoff devices with Tasmota firmware. I also posted an issue on Tasmota, but unfortunately no one has responded helpfully. My issue with the Tasmota firmware is that whenever it reconnects to the MQTT broker, it randomly changes my relay status. Sometimes it toggles the relay; sometimes it turns it OFF or ON.

So, I think your broker is stable. Even if it sometimes disconnects from the router, it always reconnects.

jnherm commented 6 years ago

Martin, sorry to bother you again. I observe that when I reconnect my sonoff to your MQTT broker, right after the topic subscriptions, your MQTT broker publishes (maybe as a test publish) to the topic just subscribed. Is my observation correct? Other MQTT brokers I have used don't do so. Could you add a setting so that users can choose whether they want this test of the subscribed topic or not?

martin-ger commented 6 years ago

This is not an intended test publication. I would guess it is a 'retained' topic (a flag in the publication), i.e. one that has to be stored by the broker. Could you check this by looking at the output of 'show mqtt'? It shows all retained topics.

(Reply sent by email on 6 March 2018 at 6:28:39 AM, quoting jnherm's previous comment in full.)

jnherm commented 6 years ago

Yes, there are subscriptions that are 'retained', but the same subscriptions did not cause the other MQTT broker that I am using to send a published MQTT command to my device. Your broker, on the other hand, sends those subscribed commands to my devices after reconnection.

martin-ger commented 6 years ago

I think, this behavior is exactly conforming to the specs: https://www.hivemq.com/blog/mqtt-essentials-part-8-retained-messages

A retained message is a normal MQTT message with the retained flag set to true. The broker will store the last retained message and the corresponding QoS for that topic. Each client that subscribes to a topic pattern, which matches the topic of the retained message, will receive the message immediately after subscribing. For each topic only one retained message will be stored by the broker.
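The quoted rule can be illustrated with a toy in-memory broker in Python. This is a sketch of the retained-message behavior only, not the uMQTTBroker code: the class `TinyBroker` is invented for the example, and it ignores QoS, wildcards, and networking. The topic name is borrowed from the script earlier in this thread.

```python
from typing import Callable, Dict, List

Callback = Callable[[str, bytes], None]


class TinyBroker:
    """Toy broker: exact-match topics only, no QoS, no wildcards."""

    def __init__(self) -> None:
        self.retained: Dict[str, bytes] = {}
        self.subscribers: Dict[str, List[Callback]] = {}

    def publish(self, topic: str, payload: bytes, retain: bool = False) -> None:
        if retain:
            if payload:
                self.retained[topic] = payload   # only the latest one is kept
            else:
                self.retained.pop(topic, None)   # empty retained payload clears it
        for callback in self.subscribers.get(topic, []):
            callback(topic, payload)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self.subscribers.setdefault(topic, []).append(callback)
        if topic in self.retained:
            # The stored message is delivered immediately after subscribing.
            callback(topic, self.retained[topic])


broker = TinyBroker()
broker.publish("cmnd/sonoff/03/POWER1", b"ON", retain=True)
seen = []
broker.subscribe("cmnd/sonoff/03/POWER1", lambda topic, payload: seen.append(payload))
print(seen)  # [b'ON']
```

The subscriber receives `ON` although nothing was published after it subscribed, which is the "publish right after subscription" effect described above.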

What is your reference broker? Mosquitto?

jnherm commented 6 years ago

Furthermore, if it is due to 'retained' subscriptions, your broker should send the last published payload of the topic, so my device would get a payload that matches its previous relay state. But in my case, my devices always get the same payload. This is my scenario:

Client A lost MQTT connection with Broker ESP (Client A's relay is ON).
Client A will reconnect to Broker ESP every 10 sec.
If Broker ESP is now connected to wifi, Client A will have a successful connection after the next 10 sec.
Client A then receives an MQTT command from Broker ESP with payload OFF.
Client A will turn OFF the relay.

Note that if previous client A's relay state is OFF before re-connection, relay will remain OFF since Payload is OFF.

martin-ger commented 6 years ago

I don't understand why the last state that is sent after resubscription should be OFF. If it lost the connection while switched ON, the retained state should be ON, shouldn't it?

jnherm commented 6 years ago

I am using this Android App as a broker. I don't know if it is Mosquitto. https://play.google.com/store/apps/details?id=server.com.mqtt

martin-ger commented 6 years ago

I have a guess: is it possible that you have a restart of the broker? Then this could make sense: if the retained state is saved in flash, it will be constant after restart. This would also explain the connection loss.

What kind of ESP are you using? What about power supply?

jnherm commented 6 years ago

I really prefer your ESP broker because it is the most economical for a small/lightweight application like controlling a room.

jnherm commented 6 years ago

I am using ITEAD's sonoff basic as ESP broker. Power supply is SMPS inside sonoff basic connected directly to mains.

jnherm commented 6 years ago

"Dont understand why the last state that is send after resubscription should be OFF? When it lost connection when switched ON, the retained state should be ON? It'snot?"

Yes, you are right, it should be ON. I have other scenarios which toggle my relay no matter what my previous relay state is. Maybe the ESP broker saved a "toggle" payload.

jnherm commented 6 years ago

I have a guess: is it possible that you have a restart of the broker?

I simulate lost connection by turning Off then ON the ESP broker.

martin-ger commented 6 years ago

Okay, then the behavior is clear: you have a saved state in flash: OFF. If you reset the broker (on/off), it will restart with this state from flash. Two options:

jnherm commented 6 years ago

Thank you Martin.

jnherm commented 6 years ago

Good news, Martin. With autoretain set to 1, my problem is solved! Thank you again!
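The effect discussed in the last comments can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the broker's implementation: the class `FlashBackedBroker` and the flag `auto_save` are hypothetical, and the flag only stands in for the autoretain setting on the assumption that it makes the broker write retained topics to flash as they arrive.

```python
class FlashBackedBroker:
    """Toy model: retained topics live in RAM, with a separate copy in 'flash'."""

    def __init__(self, flash: dict, auto_save: bool = False) -> None:
        self.flash = flash            # a shared dict stands in for flash memory
        self.auto_save = auto_save    # hypothetical stand-in for autoretain
        self.retained = dict(flash)   # on boot, restore whatever was saved

    def publish_retained(self, topic: str, payload: str) -> None:
        self.retained[topic] = payload
        if self.auto_save:
            self.save()

    def save(self) -> None:
        self.flash.clear()
        self.flash.update(self.retained)

    def on_subscribe(self, topic: str):
        return self.retained.get(topic)


def power_cycle(broker: FlashBackedBroker) -> FlashBackedBroker:
    """Simulate switching the broker off and on: RAM is lost, flash survives."""
    return FlashBackedBroker(broker.flash, broker.auto_save)


topic = "cmnd/sonoff/03/POWER1"
results = {}
for auto in (False, True):
    broker = FlashBackedBroker(flash={topic: "OFF"}, auto_save=auto)
    broker.publish_retained(topic, "ON")   # relay switched ON at runtime
    broker = power_cycle(broker)           # broker turned off, then on
    results[auto] = broker.on_subscribe(topic)
    print(auto, results[auto])
# prints "False OFF" and then "True ON"
```

Without saving, the restarted broker hands out the stale `OFF` from flash no matter what was published before the power cycle, which matches the constant payload observed in the scenario above.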