Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0
12.08k stars 1.68k forks

After each restart of HA connection to MQTT/zigbee2mqtt is lost and all devices are unavailable #4912

Closed skumka closed 3 years ago

skumka commented 3 years ago

Problem

After each restart of HA, the connection to MQTT is lost and all devices become unavailable. You need to stop and start zigbee2mqtt to get it working again.

Description

Restart HA from within HA. When the UI starts, you can see the devices and switches as available/active for a few seconds. After that, they all become unavailable. I reported the problem to Home Assistant, but after their analysis the problem seems to be in zigbee2mqtt. To fix it you need to stop and restart the zigbee2mqtt service. With version 1.15 everything worked fine.

Debug info

Zigbee2MQTT version: 1.16 and 1.16.1

marklagendijk commented 3 years ago

I'm having the same issue.

Koenkk commented 3 years ago

Can you provide the debug log when Zigbee2MQTT is started for the first time after a HA restart?

To enable debug logging set in configuration.yaml:

advanced:
  log_level: debug

Do you use the availability feature?

dennyreiter commented 3 years ago

I'm having this problem also, but I just restarted HA and it didn't happen. I thought it was something I had done, because I have/had been using mqtt_statestream between two HA instances and a script to force sensor discovery, but Wednesday my main instance crashed and was filled with sensors like sensor.shower_temperature_2_2_2_2_2_2_2_2_2_2_2_2. So I'm thinking that even though this might be a Z2M issue, something changed in HA MQTT to reveal it and my statestream skulduggery. I have debug logging set to turn on for the next restart if it happens again.

bobbyschuitemaker commented 3 years ago

Same here. Every day I have to restart once or twice. I will see if I can send a log.

fperezm commented 3 years ago

I'm also having the same problem; I thought it was just me. I set the log to debug level but nothing shows up, only that the devices stop updating and nothing is sent through MQTT.

marklagendijk commented 3 years ago

I just downgraded zigbee2mqtt to 1.15.0 and everything seems to work properly. If the issue re-appears even with 1.15.0 I'll report back.

I wouldn't be surprised if #775 Home Assistant: provide multiple availability topics through availability is indeed the issue.

skumka commented 3 years ago

I reverted to 1.15 and the problem does not exist. So it is 100% related to the 1.16 changes. I am 90% sure it is due to the "ameliorations" for MQTT.

Koenkk commented 3 years ago

Please provide the logging as requested in https://github.com/Koenkk/zigbee2mqtt/issues/4912#issuecomment-724112855 , otherwise it is impossible for me to fix anything.

skumka commented 3 years ago

@Koenkk OK, I moved to 1.16.1 again and here is the link to the recorded video: https://streamable.com/r2pqre

and log file: log_2020-11-11.20-37-57.log

IronButterfly commented 3 years ago

The same issue with 1.16.1

jesperldk commented 3 years ago

this is a problem for me as well, I'm on HA 0.117.2

Koenkk commented 3 years ago

@skumka

dennyreiter commented 3 years ago

I'm not sure that this is solely a Z2M problem. I'm now seeing this with most of my Tasmota devices, also.

skumka commented 3 years ago

@Koenkk I had it enabled a few weeks back for 2 days, but I switched it off per your recommendation in another issue I reported. Why do you ask about it? WRT your 2nd question: I do not have availability activated, so I am not sure if this should be checked.

jesperldk commented 3 years ago

Hmm, I see it only for zigbee2mqtt; I do not see it for any of my Tasmotas nor for miflora-mqtt.


dennyreiter commented 3 years ago

I noticed "mqtt: Exception in availability_message_received" in my logs and Google led me to this issue, which seems to still be open?

https://github.com/home-assistant/core/issues/40166

I did recently have config changes (CC2531 died and I replaced it with a CC2652R.) Wish it would happen again so I could grab the logs 😁

Koenkk commented 3 years ago

@skumka can you check: If you use a MQTT client and subscribe to zigbee2mqtt/Entrance Door/availability, do you get anything?

I believe this issue happens after having turned on the availability feature once (while having it off now). There is still a retained offline message in zigbee2mqtt/Entrance Door/availability which HA will receive on startup. But I need your help to confirm :)
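
A minimal sketch of the check described above, in Python. It assumes the broker's retained messages have already been dumped into a dict mapping topic to payload (for example by subscribing to `zigbee2mqtt/#` with any MQTT client and recording what arrives right after connecting). The function name `stale_offline_topics` and the sample data are hypothetical and not part of Zigbee2MQTT.

```python
def stale_offline_topics(retained, base_topic="zigbee2mqtt"):
    """Return availability topics under base_topic whose retained payload is 'offline'.

    `retained` maps topic -> payload, as delivered by the broker on subscribe.
    """
    prefix = base_topic + "/"
    suffix = "/availability"
    return sorted(
        topic
        for topic, payload in retained.items()
        if topic.startswith(prefix)
        and topic.endswith(suffix)
        and payload.strip().lower() == "offline"
    )


# Hypothetical snapshot of retained messages on the broker.
snapshot = {
    "zigbee2mqtt/bridge/state": "online",
    "zigbee2mqtt/Entrance Door/availability": "offline",
    "zigbee2mqtt/Entrance Door": '{"contact": true}',
    "zigbee2mqtt/motion2/availability": "online",
}

print(stale_offline_topics(snapshot))
```

Any topic this reports while the availability feature is switched off is a candidate for the stale retained message suspected here.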

jesperldk commented 3 years ago

I have never had the availability feature turned on.


skumka commented 3 years ago

@Koenkk I have checked: image

Now I understand. I have it for every zigbee sensor....

How do I get rid of this?

jesperldk commented 3 years ago

Well, I don't think I ever had availability enabled. Even after reading the doc I am not quite sure how to enable it if I wanted to ;-)

However, when I look in the homeassistant config topics, they contain "availability_topic":"zigbee2mqtt/bridge/state", but there is no value for the "zigbee2mqtt/bridge/state" topic. Also, the topic name seems strange. The homeassistant config topic for the same device also contains "state_topic":"zigbee2mqtt/motion2", and that is a much more meaningful topic name, and indeed this topic contains the state as expected.

The "availability_topic" seems incorrect? Could HA have changed how it reacts to this broken reference?


edit: well, I was wrong, there is a value for zigbee2mqtt/bridge/state, don't know why I overlooked it. It is the online status of the bridge. And it does have a retained flag. Don't know why it does not work after a HA reboot???

Koenkk commented 3 years ago

@skumka issue should be fixed

Changes will be available in the latest dev branch in a few hours (https://www.zigbee2mqtt.io/how_tos/how-to-switch-to-dev-branch.html)

@jesperldk it looks like the root cause for you is different. To be sure, can you first check with the latest dev and, if that doesn't fix it, provide me the debug log from when Zigbee2MQTT starts?

To enable debug logging set in configuration.yaml:

advanced:
  log_level: debug

skumka commented 3 years ago

@Koenkk ok... but I would prefer to merge the code changes into my current stream rather than switch to dev. I hope this will work for you too.

Koenkk commented 3 years ago

@skumka you mean master? These will be included in the next release on 1 December. If you don't want to switch to dev you can clear all retained MQTT messages (https://community.openhab.org/t/clearing-mqtt-retained-messages/58221)
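
For the manual route, a retained message is cleared by publishing a zero-length payload with the retain flag set to the same topic. The Python sketch below only builds the matching `mosquitto_pub` command lines (`-r` sets retain, `-n` sends an empty message) and does not run them. The helper name `clear_retained_commands`, the `localhost` broker address, and the example topic are assumptions for illustration.

```python
import shlex


def clear_retained_commands(topics, host="localhost"):
    """Build one mosquitto_pub command per topic that clears its retained message.

    -r sets the retain flag and -n sends a zero-length payload; a retained
    publish with an empty payload tells the broker to drop the stored message.
    """
    return [
        shlex.join(["mosquitto_pub", "-h", host, "-t", topic, "-r", "-n"])
        for topic in topics
    ]


# Assumed example topic; replace with the stale topics found on your broker.
for cmd in clear_retained_commands(["zigbee2mqtt/Entrance Door/availability"]):
    print(cmd)
```

Printing the commands first makes it easy to review the topic list before anything is sent to the broker.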

skumka commented 3 years ago

Just for test I can make a merge on my environment (master) and let you know if it works.

skumka commented 3 years ago

I know now that clearing retained messages will solve my issue, but since you spent time fixing it in code, I can contribute by testing :-)

skumka commented 3 years ago

@Koenkk Updated to dev and tested the fix. It works with the code changes. Closing the issue.

jesperldk commented 3 years ago

@Koenkk this fixes things for me as well, although the case seemed different. (well, pulling fixed it, I was on dev 1.15.0 #ed8b4e5 before)

I did notice some differences:

Checking the docs, I can see this behavior comes from cache_state: true and family. I have never had them in my configuration.yaml; did the defaults perhaps change? I can see I ought to have had them on, but everything worked fine until quite recently, when it started failing at every HA reboot.

Anyway, thanks a lot, I'm a fan :-)

Koenkk commented 3 years ago

@jesperldk bridge/state always has the retain flag set to true, but if you use an MQTT client this will only show as retain true when you initially connect to the server. cache_state: true was always the default and this did not change. Do I understand correctly that the problem is fixed for you?
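
The retain-flag behavior can be illustrated with a toy in-memory model in Python. This is a sketch of standard MQTT delivery semantics, not a real broker and not Zigbee2MQTT code; the `ToyBroker` class is hypothetical. A new subscriber receives the stored message with the retain flag set, while clients that are already subscribed receive later publishes with the flag cleared, even if the publisher set it.

```python
class ToyBroker:
    """In-memory model of MQTT retained-message delivery (not a real broker)."""

    def __init__(self):
        self.retained = {}     # topic -> last retained payload
        self.subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)
        # A new subscriber gets the stored message with the retain flag set.
        if topic in self.retained:
            callback(topic, self.retained[topic], True)

    def publish(self, topic, payload, retain=False):
        if retain:
            if payload == "":
                # An empty retained publish clears the stored message.
                self.retained.pop(topic, None)
            else:
                self.retained[topic] = payload
        # Existing subscribers receive live messages with retain cleared.
        for callback in self.subscribers.get(topic, []):
            callback(topic, payload, False)


seen = []
broker = ToyBroker()
broker.publish("zigbee2mqtt/bridge/state", "online", retain=True)
broker.subscribe("zigbee2mqtt/bridge/state", lambda t, p, r: seen.append((p, r)))
broker.publish("zigbee2mqtt/bridge/state", "online", retain=True)
print(seen)
```

The first entry in `seen` carries the retain flag because it was delivered on subscribe; the second does not, which matches what an MQTT client shows after the initial connect.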

jesperldk commented 3 years ago

Yes, it is fixed. Before, starting a few weeks ago, when I restarted HA all z2m devices became unavailable until I also restarted z2m. After the upgrade it works again. Weird.


jesperldk commented 3 years ago

@Koenkk

Damn, I still have the problem. Apparently only now and then. I am fairly certain that it started when I upgraded HA to some 0.117.x version. Probably some problem in HA? However, my tasmotas and miflowers do not have this problem.

I can make some debug logs tomorrow. However, it seems like the problem is how HA interprets what is in MQTT...