arendst / Tasmota

Alternative firmware for ESP8266 and ESP32 based devices with easy configuration using webUI, OTA updates, automation using timers or rules, expandability and entirely local control over MQTT, HTTP, Serial or KNX. Full documentation at
https://tasmota.github.io/docs
GNU General Public License v3.0

Device disconnecting from mqtt every 60 seconds, even after reflashing. #6816

Closed cbullo closed 4 years ago

cbullo commented 4 years ago

BUG DESCRIPTION

I accidentally (sic!) flashed a perfectly working Sonoff device with the newest firmware. Before that it worked perfectly and the connection was super stable. Now, after flashing the latest firmware, it disconnects from the MQTT server exactly every 60 seconds. I have followed the steps from the FAQ MQTT troubleshooting without any change.

TO REPRODUCE

Connect to Mosquitto MQTT server on my raspberry pi.

EXPECTED BEHAVIOUR

The device should not disconnect from the MQTT server.

ADDITIONAL CONTEXT

The same is happening with my generic esp 01 module that I flashed with the same firmware.

And these are messages received on my MQTT:

myhome/livingroom/sonoff/LWT Offline
myhome/livingroom/sonoff/LWT Online
myhome/livingroom/sonoff/cmnd/POWER (null)
myhome/livingroom/sonoff/RESULT {"POWER":"OFF"}
myhome/livingroom/sonoff/POWER OFF
myhome/livingroom/sonoff/LWT Offline
myhome/livingroom/sonoff/LWT Online
myhome/livingroom/sonoff/cmnd/POWER (null)
myhome/livingroom/sonoff/RESULT {"POWER":"OFF"}
myhome/livingroom/sonoff/POWER OFF
myhome/livingroom/sonoff/LWT Offline
myhome/livingroom/sonoff/LWT Online
myhome/livingroom/sonoff/cmnd/POWER (null)
myhome/livingroom/sonoff/RESULT {"POWER":"OFF"}
myhome/livingroom/sonoff/POWER OFF

Jason2866 commented 4 years ago

You have to search in your setup. -2: MQTT_CONNECT_FAILED - the network connection failed. Could be this: https://github.com/arendst/Tasmota/wiki/FAQ#Frequent-MQTT-reconnects

cbullo commented 4 years ago

@Jason2866 Which setup? I followed the steps from FAQ, no change.

Jason2866 commented 4 years ago

Every device needs its own topic. Read the MQTT part of the wiki.

cbullo commented 4 years ago

@Jason2866 I have only one device connected right now.

cbullo commented 4 years ago

Also this is repeating in my mosquitto log:

1572703634: Received PUBLISH from DVES_8D972D (d0, q0, r0, m0, 'myhome/livingroom/sonoff/RESULT', ... (15 bytes))
1572703635: Received PUBLISH from DVES_8D972D (d0, q0, r0, m0, 'myhome/livingroom/sonoff/POWER', ... (3 bytes))
1572703635: Sending PUBLISH to paho791735919053177 (d0, q0, r0, m0, 'myhome/livingroom/sonoff/POWER', ... (3 bytes))
1572703642: Received PINGREQ from mqtt_df1e806b.bd9f8
1572703642: Sending PINGRESP to mqtt_df1e806b.bd9f8
1572703664: Received PINGREQ from DVES_8D972D
1572703664: Sending PINGRESP to DVES_8D972D
1572703682: Socket error on client DVES_8D972D, disconnecting.
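
To make the timing in this log easier to read, here is a minimal sketch that parses the relevant lines and reports how many seconds passed between the broker's last PINGRESP to the device and the socket error. The function name `seconds_from_last_pingresp_to_error` is made up for illustration; the log lines are copied from the excerpt above.

```python
import re

# Log lines copied from the mosquitto output above (only the ping and error lines).
LOG = """\
1572703664: Received PINGREQ from DVES_8D972D
1572703664: Sending PINGRESP to DVES_8D972D
1572703682: Socket error on client DVES_8D972D, disconnecting.
"""

# A mosquitto log line here is "<unix timestamp>: <message>".
LINE_RE = re.compile(r"^(\d+): (.*)$")


def seconds_from_last_pingresp_to_error(log_text, client_id):
    """Return seconds between the last PINGRESP sent to client_id and its socket error.

    Returns None if either event is missing from the log.
    """
    last_pingresp = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, message = int(match.group(1)), match.group(2)
        if message == f"Sending PINGRESP to {client_id}":
            last_pingresp = timestamp
        elif message.startswith(f"Socket error on client {client_id}"):
            if last_pingresp is None:
                return None
            return timestamp - last_pingresp
    return None


gap = seconds_from_last_pingresp_to_error(LOG, "DVES_8D972D")
print(gap)
```

For this excerpt the gap is 18 seconds, i.e. the broker had answered a ping from the device 18 seconds before it logged the socket error.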

Jason2866 commented 4 years ago

As I said, you have a config error somewhere: 1572703682: Socket error on client DVES_8D972D, disconnecting.

cbullo commented 4 years ago

@Jason2866 But what kind of error? And as I wrote, it worked correctly before I flashed sonoff with newer firmware.

Jason2866 commented 4 years ago

Sorry, I don't know what error. You have to do research. Start from scratch: first erase the ESP8266 with esptool.py, download Tasmota again, and flash this fresh copy. It is possible that you had a bad flash.

cbullo commented 4 years ago

I did erase ESP and flashed a fresh copy of tasmota, as I described in bug report. That didn't change anything.

I have now captured network traffic between sonoff (192.168.0.36) and MQTT server (192.168.0.10). I'm not an expert, but it's sonoff that's sending TCP RST flag, which means it closed its socket.

image

And it's happening exactly every 60 seconds.

Jason2866 commented 4 years ago

Do you have WMM enabled in your AP / router? Search for "special" settings in your WiFi. Have you tried to restart the Pi? Some time ago I had a weird issue with mosquitto...

cbullo commented 4 years ago

No WMM or any other special settings in router. Restarting doesn't help.

edgarveersel commented 4 years ago

Same issue here. I made no changes other than updating Tasmota from 6.5.0 to 6.7.1.

Jason2866 commented 4 years ago

Which router do you have? Have you rebooted mqtt broker and updated to latest version?

cbullo commented 4 years ago

@edgarveersel Thanks for the tip. Downgrading Tasmota on sonoff to 6.5.0 solved the issue for me :)

lhaperen commented 4 years ago

Same issue here after updating to 6.7.1; before that everything worked perfectly. After downgrading to 6.5.0 it was stable again.

arendst commented 4 years ago

Keepalive was changed in 6.6.0 from 10 to 30 seconds. As mosquitto will drop the connection if a keepalive is missed within 60 seconds, a bad TCP connection will likely miss a keepalive ping, resulting in exceeding mosquitto's window.
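
The reasoning above can be sketched numerically. This is a rough illustration only: it takes the 60-second window and the 10- and 30-second keepalive values from this comment at face value, and the helper `pings_within_window` is a hypothetical name, not Tasmota or mosquitto code.

```python
def pings_within_window(keepalive_s, window_s=60):
    """Number of keepalive pings a client would send inside the broker's window.

    Assumes one ping every keepalive_s seconds and a fixed window_s taken from
    the comment above; a real broker's timeout may be computed differently.
    """
    if keepalive_s <= 0:
        raise ValueError("keepalive must be positive")
    return window_s // keepalive_s


for keepalive in (10, 30):
    chances = pings_within_window(keepalive)
    print(f"keepalive {keepalive}s: {chances} pings per 60s window")
```

Under these assumptions six pings fit in the window with a 10-second keepalive, so several can be lost before the window is exceeded; with 30 seconds only two fit.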

cbullo commented 4 years ago

@arendst there are a few clues suggesting this is not due to poor WiFi signal. In the Wireshark log I posted above you can see that sonoff had received ping response around time 110.9 and then disconnection happened around time 138.0. Also my sonoff stands 10cm from the router and I have no trouble with any other wifi devices on the network.

Jason2866 commented 4 years ago

@cbullo Mhh, your setup has to differ from 100.000 other installations. There are very very very few issues with the latest v.6.7.1.
All of these few issues arrived in Discord, where at the end:

btw. There are Sonoff Basics which are known to have a bad power supply and bad flash chips. If I were you, I would just replace the Sonoff Basics that behave weirdly. A Sonoff Basic is a cheap device in every aspect. Do not expect high reliability from every one you buy.

cbullo commented 4 years ago

@Jason2866 I have reflashed multiple times, my WiFi and MQTT settings are very simple (basically default), and I have provided multiple logs with reasoning. I have tried one Sonoff device and two different ESP-01 modules, all exhibiting the same behavior. I have downgraded the firmware and verified that this solves the issue without any other changes to my setup. There are two other people in this thread with the same behavior.

I'm not arguing, just stating the facts.

And I'm pretty sure it's something in esp core >= 2.4.0 that's causing trouble, as I wrote a custom firmware (so, not using Tasmota) using pubsubclient and esp core 2.4.0 for my other ESP 01 module and saw similar disconnections.

Jason2866 commented 4 years ago

@cbullo If the error is coming from the core, nothing can be done from the Tasmota side. Open an issue on https://github.com/esp8266/Arduino with a sample sketch producing the error. Core 2.4.0 is known to have many bugs/issues. There have been many fixes since core 2.4.2 for WiFi reliability.

arendst commented 4 years ago

You might want to try the below option:

Add #define MQTT_CLEAN_SESSION 0 to file my_user_config.h around line 267 and recompile.

jj-uk commented 4 years ago

Is this a valid hostname?

{"StatusNET":{"Hostname":"myhome/livingroom/sonoff-5933","IPAddress":"192.168.0.36",

cbullo commented 4 years ago

@jj-uk I left it at default %s-%04d. Works correctly in 6.5. Didn't have a chance to try changing it in 6.7.1.

jj-uk commented 4 years ago

> @jj-uk I left it at default %s-%04d. Works correctly in 6.5. Didn't have a chance to try changing it in 6.7.1.

'myhome/livingroom/sonoff-5933' is not the default. Forward slashes are not valid in host names.

cbullo commented 4 years ago

@jj-uk, No, but %s-%04d is default. And that's what I left it at.

localhost61 commented 4 years ago

@cbullo nevertheless it seems to be your issue, fix it in the WebUI WiFi config and you'll be done.

localhost61 commented 4 years ago

I understand your error now. Tested with the latest 7.0.0.3: it's true that in the WebUI WiFi config, if you clear the Hostname field and save, it is replaced with the string %s-%04d, whereas by default it should be the evaluated result of that string, that is sonoff-5933. With %s-%04d in the field, the Topic value is substituted for %s at the front of the hostname. Hence the result.
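
The effect described above can be reproduced outside the firmware. The sketch below expands the template %s-%04d with Python's printf-style formatting and checks the result against the usual hostname label rules (letters, digits and hyphens, 1 to 63 characters, no leading or trailing hyphen). It is an illustration only: `expand_hostname` and `is_valid_hostname_label` are hypothetical helpers, not Tasmota functions, and the number 5933 is simply taken from the StatusNET output quoted earlier.

```python
import re

# One hostname label: letters, digits, hyphens; must start and end alphanumeric.
LABEL_RE = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def expand_hostname(template, topic, number):
    """Expand a printf-style hostname template such as '%s-%04d'."""
    return template % (topic, number)


def is_valid_hostname_label(name):
    """Return True if name is a single valid hostname label."""
    return bool(LABEL_RE.match(name))


# Compare a single-word topic with the multi-level topic used in this thread.
for topic in ("sonoff", "myhome/livingroom/sonoff"):
    hostname = expand_hostname("%s-%04d", topic, 5933)
    print(hostname, is_valid_hostname_label(hostname))
```

A single-word topic yields sonoff-5933, which passes the check, while the multi-level topic yields myhome/livingroom/sonoff-5933, which fails because of the forward slashes.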

Jason2866 commented 4 years ago

So it is a case of https://github.com/arendst/Tasmota/issues/6816#issuecomment-549295528

jj-uk commented 4 years ago

@cbullo Set the hostname to "sonoff-%04x" and try again. As a test. See if that fixes your issue.

localhost61 commented 4 years ago

When I did it myself the MQTT connection remained stable, so it's not the only cause of the disturbance. @Jason2866 Nevertheless it remains a bug in Tasmota. Clearing the field shouldn't produce such an unexpected result.

jj-uk commented 4 years ago

Should the 'invalid hostname' bug be reported as a separate issue? I don't think tasmota has enough memory to do full validation of input.

localhost61 commented 4 years ago

@jj-uk no, IMHO it's clearly a bug: the user clears a field and a new value appears which is wrong. It's a deliberate behavior of Tasmota which is erroneous and should be corrected (but an empty field is wrong too ;-) ).

jj-uk commented 4 years ago

I guess tasmota doesn't expect the topic to be anything other than a single word, like "my_topic". Are you going to open a new issue, since you can reproduce it?

@cbullo - Does this change the behaviour when the hostname is valid, e.g. "my_hostname" ?

cbullo commented 4 years ago

I won't be able to test it until the weekend, likely. Sorry.

Jason2866 commented 4 years ago

Tasmota does not have the resources to check for valid entries. An empty field check could be done, but checking for a valid topic would increase code size... The ESP is a very resource-limited device. Don't expect logic checks.

Jason2866 commented 4 years ago

Your problem is solved. Inputs for MQTT topics need to be valid. Please close. Thx!

localhost61 commented 4 years ago

@Jason2866 it's not necessarily the origin of the disturbance. Locally I got this weird hostname and it has no effect on the MQTT connection. Some hours ago I even issued a restart 1 command and there has been no MQTT disconnection at all since then. So don't consider it solved.

localhost61 commented 4 years ago

Honestly I don't know how I could achieve this behavior and I can't reproduce it, because for the field change to be recorded one needs to save the configuration, and that triggers a restart. In accordance with the (wiki) documentation, when a '%' is found in the field, the field is replaced by the default value, which is properly evaluated. When I did it the device didn't restart but now it does... :-/

cbullo commented 4 years ago

I have now corrected hostname and mqtt topics. It didn't change anything for me, still same reconnections. My router is Sagemcom FAST3890V2.

Jason2866 commented 4 years ago

You can try this version. It uses a different (newer) SDK version, 2.2.2. It has a command to change the WiFi TX power: wifipower <x>, with x from 0 to 20.5; a higher value means higher output. Try value 16. firmware.zip

ascillato2 commented 4 years ago

@cbullo

Hi, any news on this?

cbullo commented 4 years ago

@ascillato2 not yet, I haven't had time to try it.

cbullo commented 4 years ago

No change with the new version and increased wifipower.

Jason2866 commented 4 years ago

So it is not a WiFi power core issue. I still think it is some sort of config issue -> EXACTLY 60 seconds! Can you try with a different router or access point?

nejc-cc commented 4 years ago

I had a similar problem some time ago. I had to ditch the whole Mosquitto broker, then reinstall and reconfigure it. I have no idea what went wrong but I lost a couple of days before trying this. I'm using hass.io so it was quite easy for me to do that at the time :)

cbullo commented 4 years ago

@Jason2866 Unfortunately, I don't have any to try. I'm fine using version 6.5.0 which works great. If I ever need to upgrade for some reason, I'll try to debug it more with some extra logging in the source. But for now I really don't have more time to do it. So feel free to close this ticket, unless you have any other idea what might be going on. And I want to say that I think this is really a great project.

ascillato2 commented 4 years ago

> for now I really don't have more time to do it.

No problem. Ask to reopen if you want to continue this, or address this to the Tasmota Support Chat. Thanks.

bolislav2 commented 3 years ago

Hello there! This is really a problem. Sorry for my English, it's not my native language. I ran into this problem and Google brought me to this page. To fix the problem I tried:

1. Changing the ESP8266 module.
2. Changing the router.
3. Changing the firmware.

The problem persisted. Everything works well on the old version, Tasmota 6.5.0. The problem is the DHCP lease time on the router. It is 60 seconds. After the lease is renewed, the connection breaks, as described at the beginning of this thread. I increased the lease time to 72 hours and the problem disappeared. I hope I have reduced the time needed to solve this problem. Thanks.
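
For a sense of scale, here is a small sketch of how often a client renews its lease under each setting. It assumes the client follows the default DHCP timers from RFC 2131 (renewal at 50% of the lease time, rebinding at 87.5%); whether the ESP8266 network stack uses exactly these defaults is an assumption, and `lease_timers` is a hypothetical helper.

```python
def lease_timers(lease_s):
    """Return (T1, T2) in seconds: default renewal and rebinding times (RFC 2131)."""
    return 0.5 * lease_s, 0.875 * lease_s


# Compare the 60 second lease from the comment above with a 72 hour lease.
for label, lease_s in (("60 s lease", 60), ("72 h lease", 72 * 3600)):
    t1, t2 = lease_timers(lease_s)
    renewals_per_day = 86400 / t1
    print(f"{label}: renew after {t1:.0f} s, rebind after {t2:.1f} s, "
          f"about {renewals_per_day:.2f} renewals per day")
```

Under these assumptions a 60 second lease is renewed roughly every 30 seconds, while a 72 hour lease is first renewed after 36 hours.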

jcrespoc commented 3 years ago

> The problem is the DHCP lease time on the router. It is 60 seconds. After the lease is renewed, the connection breaks, as described at the beginning of this thread. I increased the lease time to 72 hours and the problem disappeared. I hope I have reduced the time needed to solve this problem. Thanks.

Thanks so much. Hard to figure out without your note. My router does not allow configuring the lease time, but setting up a static IP address solves the constant MQTT reconnection issue. Waiting for a future fix in the ESP8266 network stack.

qcrist commented 3 years ago

@ascillato2 I seem to be having this exact issue on my devices. I have had this issue on all firmware versions since 6.6.0. I am currently on 9.2.0. I have some time to debug this issue, as it would be great to pull all my devices off of 6.6.0 to the latest version.