mrlt8 / docker-wyze-bridge

WebRTC/RTSP/RTMP/LL-HLS bridge for Wyze cams in a docker container
GNU Affero General Public License v3.0

MQTT discovery should be re-sent on HA birth #920

Open jhansche opened 1 year ago

jhansche commented 1 year ago

When viewing the MQTT device representing a Wyze camera, all the entities sometimes report "Unavailable". It took me a while to pin down the exact scenarios that cause this, but it now makes sense: it happens any time Home Assistant is restarted or the MQTT integration is reloaded.

This happens whether ON_DEMAND is set to true or false. And since I'm using Frigate to consume the video frames anyway, I believe that should mean that even with ON_DEMAND=true, it should generally always have a client, so I don't think that is what's happening.

I believe what's happening is that the MQTT integration loses the originally published MQTT discovery messages that the Docker Wyze Bridge sent upon startup, and therefore all the entities start up as unavailable because the integration doesn't remember them or doesn't have up to date information for them.

The issue is resolved if I restart the bridge add-on after restarting HA or the MQTT integration.

What I believe is missing is handling of the birth/LWT messages from HA: https://www.home-assistant.io/integrations/mqtt/#birth-and-last-will-messages. The general tenet of MQTT discovery is that the bridge should send its discovery details any time it receives the birth message from HA. By default, that is the payload online published to the topic homeassistant/status, but you'll probably want to make it configurable, like the DTOPIC config is currently, because the MQTT integration can be configured to use a different topic or payload. If the bridge also needs to release resources while HA is offline, it can handle the payload offline as well. I'm thinking of new configs like:

MQTT_DSTATUS_TOPIC="homeassistant/status"
MQTT_DSTATUS_ONLINE="online"
# MQTT_DSTATUS_OFFLINE="offline"  ## if needed for something

Then any time it receives MQTT_DSTATUS_ONLINE over MQTT_DSTATUS_TOPIC, it should re-publish the discovery config for all cameras via wyze_discovery(), like it does on startup.
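A minimal sketch of that handler, assuming the config names proposed above. The `publish_discovery` callback is hypothetical and only stands in for whatever the bridge does per camera on startup; this is not the bridge's actual code.

```python
import os
from typing import Callable, Iterable

# Hypothetical settings, named after the configs proposed above.
STATUS_TOPIC = os.getenv("MQTT_DSTATUS_TOPIC", "homeassistant/status")
STATUS_ONLINE = os.getenv("MQTT_DSTATUS_ONLINE", "online")


def on_status_message(
    topic: str,
    payload: bytes,
    cameras: Iterable[str],
    publish_discovery: Callable[[str], None],
) -> int:
    """Re-publish discovery for every camera when HA announces its birth.

    Returns the number of cameras that were re-published.
    """
    if topic != STATUS_TOPIC:
        return 0  # not the HA status topic
    if payload.decode("utf-8", errors="replace").strip() != STATUS_ONLINE:
        return 0  # e.g. "offline": nothing to re-publish
    count = 0
    for cam in cameras:
        publish_discovery(cam)  # hypothetical per-camera discovery publish
        count += 1
    return count
```

With a client library such as paho-mqtt, this would be called from the `on_message` callback with `msg.topic` and `msg.payload`, after subscribing to `STATUS_TOPIC`.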

jhansche commented 1 year ago

I just noticed this appears to be a dupe of #907 which is marked as fixed in v2.3.10. 🤦‍♂️

However, I'm running v2.3.10, and this is not working currently. The above behavior is what I experience on v2.3.10 and HA 2023.7.2.

Now that I've pulled the latest code, I do now see the /status subscription - however the reaction to this topic receiving "online" only seems to re-publish a wyzebridge/status="online" message, which is not enough for the entities to be accessible again. I believe we need to re-send the device's discovery config when this HA status=online message is encountered, instead of just sending its own status=online message.

It may be sufficient to re-send the current values as well, over each camera's state_topic for each entity in this case? But in the case of the screenshot above, you can see that the switches are disabled, which means it is not just a problem of having incomplete state at the time - the entities are literally unusable in this state, until the discovery message is re-sent.

EDIT: when clicking to view one of the entities listed as Unavailable, HA reports:

This entity is no longer being provided by the mqtt integration. If the entity is no longer in use, delete it in settings.

This is additional evidence that the problem is caused by the MQTT integration not receiving the discovery config after coming back online.

And to prove the theory of whether re-sending the current state would work or not, I tried publishing a message to emulate what the bridge would send to reflect state changes:

topic=wyzebridge/pan-1/night_vision
payload=2 # or 3

but the entity remains unavailable and the state is not updated in HA. After I restart the bridge container, which re-sends the discovery message, the entities are available again. I can then send the same payload to emulate state changes from the bridge, and the switch status is reflected correctly in the HA UI (2=off, 3=on). So this shows that simply re-sending the current states will not be enough either. The only thing that appears to work is to re-send the discovery configs.
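To illustrate the difference between a state message and a discovery config, here is a rough sketch of what a discovery message for that switch could look like. It follows HA's documented topic layout (`<discovery_prefix>/<component>/<object_id>/config`), but the `unique_id`, `command_topic`, and entity name are made up for illustration and are not taken from the bridge. Only the state topic and the 2/3 payloads come from the test above.

```python
import json
from typing import Tuple


def discovery_message(
    cam: str,
    prop: str,
    discovery_prefix: str = "homeassistant",
    base_topic: str = "wyzebridge",
) -> Tuple[str, str]:
    """Build (topic, JSON payload) for a switch entity's discovery config."""
    object_id = f"{cam}_{prop}"
    topic = f"{discovery_prefix}/switch/{object_id}/config"
    config = {
        "name": prop.replace("_", " ").title(),
        "unique_id": f"wyzebridge_{object_id}",  # assumed naming scheme
        "state_topic": f"{base_topic}/{cam}/{prop}",
        "command_topic": f"{base_topic}/{cam}/{prop}/set",  # assumed topic
        "payload_off": "2",
        "payload_on": "3",
    }
    return topic, json.dumps(config)


topic, payload = discovery_message("pan-1", "night_vision")
print(topic)
print(payload)
```

Until HA has received a config like this for the entity, a value published to the state topic has nothing to attach to, which matches what I saw.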

jhansche commented 1 year ago

Confirmed in v2.3.11, when reloading the MQTT HA integration, the Wyze entities come back within ~5-10 seconds, without having to restart anything else! 🎉

Thanks!

jhansche commented 1 year ago

One minor note: although the entities become available again after a reload, the current status of each entity remains unknown until either I manually toggle the entity, or restart the bridge add-on. That is not a big deal imo, because the controls still work which is what's most important.

mrlt8 commented 1 year ago

Hmm, would a retain flag help in this situation or would that get cleared out when HA restarts?

jhansche commented 1 year ago

Yeah, retain would resend the message to future subscribers, including HA when the integration reloads. What I'm not sure of is whether that would persist across broker restarts 🤔 The risk of retaining a message even after the broker restarts is ending up with orphaned messages, such as if you delete the camera. Also not clear if the retained messages would continue to persist even after the Bridge client disconnects.

But what I'm reading says it'll otherwise act just like a normal message, which I think means the retain flag is not persistent. So shouldn't have that problem.

The retain flag would also work for discovery messages btw, including when the HA integration reloads, as long as the same orphan issue doesn't happen.
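For reference, a toy model of how retained messages behave per the MQTT spec: the broker stores the last retained payload per topic, hands it to any later subscriber regardless of whether the publisher is still connected, and drops it when an empty retained payload is published to the same topic. `ToyBroker` is purely illustrative (exact-match topics only, no wildcards, no QoS). Whether a real broker keeps retained messages across its own restart depends on its persistence settings, which this does not model.

```python
from typing import Callable, Dict, List

Callback = Callable[[str, bytes], None]


class ToyBroker:
    """In-memory stand-in for a broker; models only retained messages."""

    def __init__(self) -> None:
        self.retained: Dict[str, bytes] = {}
        self.subscribers: Dict[str, List[Callback]] = {}

    def publish(self, topic: str, payload: bytes, retain: bool = False) -> None:
        if retain:
            if payload:
                self.retained[topic] = payload  # replace the stored message
            else:
                self.retained.pop(topic, None)  # empty payload clears it
        for callback in self.subscribers.get(topic, []):
            callback(topic, payload)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self.subscribers.setdefault(topic, []).append(callback)
        if topic in self.retained:
            # A new subscriber immediately gets the retained message.
            callback(topic, self.retained[topic])


broker = ToyBroker()
config_topic = "homeassistant/switch/pan-1_night_vision/config"  # example topic
broker.publish(config_topic, b'{"name": "Night Vision"}', retain=True)

# "HA" subscribes after the discovery message was published.
received: List[bytes] = []
broker.subscribe(config_topic, lambda t, p: received.append(p))
print(received)

# Deleting a camera: clear the retained config to avoid an orphan.
broker.publish(config_topic, b"", retain=True)
```

So the orphan risk would be handled by publishing an empty retained payload to the camera's discovery topic when it is removed.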

giorgi1324 commented 1 year ago

Yeah same here, the issue is still present in 2.3.13

jhansche commented 1 year ago

I think the current issues may be somewhat different. I do see that sometimes the MQTT entities go unavailable periodically. But I haven't tracked down the root cause. E.g. it may be that it loses connectivity to the broker, or it may be some other state that gets mixed up in the bridge. Restarting the bridge container brings everything back, but that doesn't mean the problem is in the bridge necessarily.

51av0sh commented 9 months ago

To confirm the above, this issue also happens with the Govee to MQTT add-on I recently installed. Restarting the add-ons fixes the issue, but I still need to figure out what's causing this.

jhansche commented 9 months ago

From what I've been able to determine without really digging into it, I think the issue happens when restarting HA without restarting the MQTT broker (Mosquitto add-on in my case) or the Wyze bridge add-on. I recently added an automation triggered by HA start that checks whether one of my Wyze entities is Unavailable after some period of time and, if so, automatically restarts the add-on. It seems like that has improved things.

Given that, I think the problem is that when the MQTT integration in HA reloads, it loses the Wyze discovery configs. Therefore all entities become unavailable. Restarting the bridge add-on causes it to reconnect to the broker and resend the discovery messages, and everything comes back up.

@mrlt8 Did you end up adding the "retain" flag on discovery, mentioned here? https://github.com/mrlt8/docker-wyze-bridge/issues/920#issuecomment-1646408350

mrlt8 commented 9 months ago

@jhansche Could you try the latest dev image?

jhansche commented 9 months ago

I tried the dev image, and after reloading the MQTT integration, I see the Wyze entities come back after about 5-10 sec.

But then I switched back to v2.6.0, and I'm still seeing the entities come back up. So it seems like my assumptions are wrong somewhere 🤔 Either that, or the dev branch's retained discovery messages were still retained even after switching back to the release version? I could try the 2.6 image again after a restart of the broker and HA, and see if it still auto-recovers. If it does, then my assumption that it's the MQTT integration losing the discovery message doesn't hold water.

I'm still having trouble pinpointing exactly which component is the culprit (as in, which one triggers entities to become unavailable and not automatically recover):

jhansche commented 9 months ago

That's what it was... I restarted the Mosquitto broker after switching back to v2.6, and now when I reload the MQTT integration, my entities go unavailable and they don't recover. I have to restart the Wyze bridge to get them back.

So it does look like the retain flag in the dev image is what fixed it. I guess it just continued to be retained even after switching back, which is what I was not expecting.

The retain flag allows the MQTT integration to receive the original discovery message when it reloads; and the last-will message is what will tell HA that it's unavailable until it reconnects.

On a related note, I was looking at my Mosquitto logs, and it looks like the wyze user is opening and closing several connections every few seconds:

2024-01-10 02:31:43: New connection from 172.30.33.12:43131 on port 1883.
2024-01-10 02:31:43: New client connected from 172.30.33.12:43131 as auto-35393D12-D47B-8F37-7CA9-A9836F7979FA (p2, c1, k60, u'wyze').
2024-01-10 02:31:43: Client auto-35393D12-D47B-8F37-7CA9-A9836F7979FA disconnected.
2024-01-10 02:31:44: New connection from 172.30.33.12:42991 on port 1883.
2024-01-10 02:31:44: New client connected from 172.30.33.12:42991 as auto-5C42A771-04E9-7F7F-3EBD-589E87214C8F (p2, c1, k60, u'wyze').
2024-01-10 02:31:44: Client auto-5C42A771-04E9-7F7F-3EBD-589E87214C8F disconnected.
2024-01-10 02:31:44: New connection from 172.30.33.12:46393 on port 1883.
2024-01-10 02:31:44: New client connected from 172.30.33.12:46393 as auto-F8A1EB48-B8A7-4148-6A7A-965A541E1700 (p2, c1, k60, u'wyze').
2024-01-10 02:31:44: Client auto-F8A1EB48-B8A7-4148-6A7A-965A541E1700 disconnected.

Is that to be expected? Not related to this issue either way, so I think this can be closed again and I can open a new issue for the connections, if you want
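In case it helps with triage, a small sketch for counting these short-lived connections in a log. It assumes the line format shown in the excerpt above, and the `summarize` helper is just something written for illustration.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
CONNECTED = re.compile(
    rf"^{TS}: New client connected from [\d.]+:\d+ as (?P<client_id>\S+) "
    r"\(.*u'(?P<user>[^']*)'\)\.$"
)
DISCONNECTED = re.compile(rf"^{TS}: Client (?P<client_id>\S+) disconnected\.$")


def summarize(lines: Iterable[str]) -> Tuple[Counter, int]:
    """Return (connections per user, number of connect/disconnect pairs)."""
    connects: Counter = Counter()
    open_clients = set()
    closed = 0
    for raw in lines:
        line = raw.strip()
        match = CONNECTED.match(line)
        if match:
            connects[match["user"]] += 1
            open_clients.add(match["client_id"])
            continue
        match = DISCONNECTED.match(line)
        if match and match["client_id"] in open_clients:
            open_clients.discard(match["client_id"])
            closed += 1
    return connects, closed
```

Feeding it the log file line by line (for example `summarize(open(path))`, with `path` pointing at wherever the broker log lives) gives a per-user connection count and how many of those connections were closed again within the same log.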

mrlt8 commented 9 months ago

Hmm, seems like we might be able to set a birth message instead?

https://www.home-assistant.io/integrations/mqtt/#how-to-use-discovery-messages

jhansche commented 9 months ago

Hmmm... All of this was sounding like deja vu, and then I looked at the issue description☝️😅

Yes, it looks like that should be sufficient.

However, if the bridge doesn't stay connected to the broker, it won't see the HA birth message.