home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

MQTT Discovery does not consistently work in 5.2 #117201

Closed EAGrahamJr closed 5 months ago

EAGrahamJr commented 5 months ago

The problem

Using a "custom" MQTT integration, the devices send discovery messages on startup. All such devices are properly registered in 2024.5.0 but fail to register in 2024.5.2. Unfortunately, the issue was no longer reproducible once a prior version had been used and the devices had registered properly.

What version of Home Assistant Core has the issue?

core-2024.5.2

What was the last working version of Home Assistant Core?

core-2024.5.1

What type of installation are you running?

Home Assistant Container

Integration causing the issue

MQTT

Link to integration documentation on our website

https://www.home-assistant.io/integrations/mqtt/#mqtt-discovery

Diagnostics information

home-assistant_mqtt_2024-05-10T17-15-10.980Z.log

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

Note that the second device ("brainz") did NOT show up initially when running 5.2, but after getting added in the previous versions, it did show, so I do not know if the diagnostics will actually show anything.

I tried several combinations of deleting devices and restarting in versions 5.0 and 5.1, as well, but the issue seems to be erratic and not easily reproducible.

home-assistant[bot] commented 5 months ago

Hey there @emontnemery, @jbouwh, @bdraco, mind taking a look at this issue as it has been labeled with an integration (mqtt) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `mqtt` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Renames the issue.
- `@home-assistant reopen` Reopen the issue.
- `@home-assistant unassign mqtt` Removes the current integration label and assignees on the issue, add the integration domain after the command.
- `@home-assistant add-label needs-more-information` Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
- `@home-assistant remove-label needs-more-information` Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


mqtt documentation mqtt source (message by IssueLinks)

simonepittis commented 5 months ago

Similar problem for me too on 2024.5.3: a custom MQTT lock stopped working and doesn't get its status on the new release, and a Tasmota device reports status = NULL.

jbouwh commented 5 months ago

Is there anything in the logs that can help? If possible, set the logging level to debug.

simonepittis commented 5 months ago

Sorry, but I can't right now. 50% of my IoT network uses MQTT to communicate. I rolled back to 2024.5.2.

I checked the logs before the rollback, but there were no errors and no warnings.

simonepittis commented 5 months ago

...but if you want to simulate it, you can use these two MQTT configs:

water_heater:

lock:

jbouwh commented 5 months ago

Okay, are these set up through discovery? If yes, is that 1) through a retained config, or 2) through a discovery message sent after waiting for the birth message? This info is needed to know what the issue is. Note that if you send a discovery message too soon, before MQTT is subscribed and listening, that discovery message will be ignored.
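For readers unfamiliar with the second option, here is a minimal, transport-agnostic Python sketch of it. It is not code from Home Assistant or from any device in this thread: `BirthAwareDiscovery`, `handle_message`, and the `publish` callback are hypothetical names, and the sensor config and `brainz/temperature` state topic are made up for illustration. It assumes the default discovery prefix `homeassistant` and the default birth message (`online` on `homeassistant/status`), both of which are configurable in the MQTT integration.

```python
import json
from typing import Callable, Dict, Optional

BIRTH_TOPIC = "homeassistant/status"  # default birth topic (configurable in HA)
BIRTH_PAYLOAD = "online"              # default birth payload (configurable in HA)
DISCOVERY_PREFIX = "homeassistant"    # default discovery prefix (configurable in HA)


def discovery_topic(component: str, object_id: str, node_id: Optional[str] = None) -> str:
    """Build <prefix>/<component>/[<node_id>/]<object_id>/config."""
    parts = [DISCOVERY_PREFIX, component]
    if node_id:
        parts.append(node_id)
    parts += [object_id, "config"]
    return "/".join(parts)


class BirthAwareDiscovery:
    """Hypothetical helper: publish discovery configs each time the birth message arrives."""

    def __init__(self, publish: Callable[[str, str, bool], None]) -> None:
        # publish(topic, payload, retain) is supplied by the caller, for example
        # a thin wrapper around whichever MQTT client library the device uses.
        self._publish = publish
        self._configs: Dict[str, dict] = {}

    def register(self, component: str, object_id: str, config: dict,
                 node_id: Optional[str] = None) -> None:
        """Remember a config so it can be (re)sent on every birth message."""
        self._configs[discovery_topic(component, object_id, node_id)] = config

    def handle_message(self, topic: str, payload: str) -> bool:
        """Feed every received message here; returns True if discovery was sent."""
        if topic != BIRTH_TOPIC or payload != BIRTH_PAYLOAD:
            return False
        for cfg_topic, config in self._configs.items():
            self._publish(cfg_topic, json.dumps(config), False)  # not retained
        return True


# Usage with a fake publish function that just records what would be sent.
sent = []
helper = BirthAwareDiscovery(lambda topic, payload, retain: sent.append((topic, payload, retain)))
helper.register(
    "sensor",
    "temperature",
    {"name": "Temperature", "state_topic": "brainz/temperature", "unique_id": "brainz_temperature"},
    node_id="brainz",
)
helper.handle_message("homeassistant/status", "offline")  # ignored, nothing is sent
helper.handle_message("homeassistant/status", "online")   # config is published now
```

The device has to subscribe to the birth topic before it can react to it. Publishing only in response to the birth message avoids sending discovery before Home Assistant has started listening.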

jbouwh commented 5 months ago

Secondly: Are the entities not added on discovery, or are you missing a state update?

EAGrahamJr commented 5 months ago

In my case, several discovery messages were ignored completely, no retention. HA is up and running with other previously discovered devices and I saw the new device messages on the appropriate topics. Rolling back added the device and now I cannot duplicate the initial error.
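To illustrate why retention matters here, the following toy Python sketch models standard MQTT behaviour: a non-retained message published before a subscription exists is never delivered to that subscriber, while a retained one is delivered as soon as the subscription is made. `ToyBroker` is a hypothetical in-memory stand-in (exact-match topics only, no wildcards, no QoS), not a real broker or anything from Home Assistant, and the topic names are invented.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for an MQTT broker (illustration only)."""

    def __init__(self):
        self._retained = {}                    # topic -> last retained payload
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def publish(self, topic, payload, retain=False):
        if retain:
            self._retained[topic] = payload
        # Deliver only to subscribers that exist at publish time.
        for callback in self._subscribers[topic]:
            callback(topic, payload)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)
        # A retained message is replayed to every new subscriber.
        if topic in self._retained:
            callback(topic, self._retained[topic])


broker = ToyBroker()
received = []

early = "homeassistant/sensor/brainz/early/config"
kept = "homeassistant/sensor/brainz/kept/config"

# Both configs are published before anyone has subscribed.
broker.publish(early, "{}")              # not retained: nobody is listening
broker.publish(kept, "{}", retain=True)  # retained: stored by the broker

# The subscriber arrives late and only sees the retained config.
for topic in (early, kept):
    broker.subscribe(topic, lambda t, p: received.append(t))
```

After this runs, `received` contains only the `kept` topic.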

bdraco commented 5 months ago

The only change for 5.2 is https://github.com/home-assistant/core/pull/116904

That means we wait 3s so there is a chance the integration will group more of the subscribe messages together. Previously the cooldown was 1s.

Can you try locally reverting that change and see if it solves the issue?

Also you state that it was working fine in 2024.5.0 but you also state the last working version was 2024.5.1. Can you clarify if the last working version is 2024.5.0 or 2024.5.1?
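As a rough illustration of what a longer cooldown does to grouping, here is a small Python sketch. It is not the integration's actual implementation: `batch_subscriptions` is a hypothetical function, the timestamps and topics are invented, and it assumes the simplest possible debounce rule (a batch is sent `cooldown` seconds after the most recent request in it).

```python
from typing import List, Tuple


def batch_subscriptions(
    requests: List[Tuple[float, str]], cooldown: float
) -> List[Tuple[float, List[str]]]:
    """Group (time, topic) subscribe requests into batches.

    Assumed rule: every new request restarts the timer, and the pending
    batch is sent once `cooldown` seconds pass without another request.
    Returns a list of (send_time, topics) tuples.
    """
    batches = []
    current: List[str] = []
    last_time = 0.0
    for when, topic in sorted(requests):
        # The timer already fired if the gap since the last request
        # is at least one full cooldown.
        if current and when - last_time >= cooldown:
            batches.append((last_time + cooldown, current))
            current = []
        current.append(topic)
        last_time = when
    if current:
        batches.append((last_time + cooldown, current))
    return batches


# Invented request times (seconds since startup) and topics.
requests = [(0.0, "a/config"), (0.5, "b/config"), (2.0, "c/config"), (2.5, "d/config")]

one_second = batch_subscriptions(requests, cooldown=1.0)
three_seconds = batch_subscriptions(requests, cooldown=3.0)
```

With these inputs, the 1 s cooldown yields two batches (sent at 1.5 s and 3.5 s) and the 3 s cooldown yields a single batch of four topics sent at 5.5 s. Fewer, larger subscribe calls come at the cost of each subscription becoming active later.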

bdraco commented 5 months ago

If you want to give 117267 a try, it can be installed as a custom component:

cd /config
curl -o- -sSL https://gist.githubusercontent.com/bdraco/43f8043cb04b9838383fd71353e99b18/raw/core_integration_pr | bash /dev/stdin -d mqtt -p 117267

EAGrahamJr commented 5 months ago

> The only change for 5.2 is #116904
>
> That means we wait 3s so there is a chance the integration will group more of the subscribe messages together. Previously the cooldown was 1s.
>
> Can you try locally reverting that change and see if it solves the issue?
>
> Also you state that it was working fine in 2024.5.0 but you also state the last working version was 2024.5.1. Can you clarify if the last working version is 2024.5.0 or 2024.5.1?

That's hard to quantify. I rolled back to 5.0 to verify, because that was the last working version I had, then I rolled forward to 5.1 and the device was successfully discovered. That's when I found that, once a device had been discovered, 5.2 would properly re-configure it, so I'm not entirely sure whether 5.1 is actually working.

I will attempt to clear things up again and see if I can repro, but as a former software developer, I LOATHE intermittent issues!!!!!!

simonepittis commented 5 months ago

OK... I ran some extra tests. It now works on 2024.5.3, but I need to restart the MQTT broker to fix it permanently. After an HA restart, MQTT can't connect to the MQTT broker, or connects only randomly, but if I restart the broker, HA reconnects and it works.

- Works on 2024.5.0
- Works on 2024.5.1
- Random issues on 2024.5.2
- Random issues on 2024.5.3

jbouwh commented 5 months ago

Let's see how 2024.5.4 does the job, as the buffer size and subscribe debouncer should be optimal now.

simonepittis commented 5 months ago

Looks OK now on 2024.5.4.

EAGrahamJr commented 5 months ago

I cannot reproduce the issue (I set up a lot of dummy devices over the course of this issue :grinning: ), so I am satisfied with the result.