home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Anything using http, including the HA interface and REST, freezes following a LitterRobot POST #101209

Closed rccoleman closed 1 year ago

rccoleman commented 1 year ago

The problem

Ever since updating to the 2023.10.0 beta, my HA frontend has been freezing regularly (maybe 7 times in the last few days). It will be accessible one minute, and then it just becomes inaccessible at some point afterward. The REST interface also freezes, and I just stop getting any info-level logs in the HA log following the point of the freeze. I'm using a container install and can still access the machine that HA is running on and shell into the container to see the normal running processes.

I enabled debug logging and have twice seen a pattern of pylitterbot making a POST request immediately preceding the freeze:

2023-09-30 21:58:53.469 DEBUG (MainThread) [pylitterbot.session] Making POST request to https://securetoken.googleapis.com/v1/token

After that, I still see paho/MQTT responding with reports for ESPresence, the aarlo custom component, wirelesstags, and pychromecast, but nothing when I try to access the HA web interface or anything from the integrations that were logging immediately before that message. Restarting the HA container gets HA up and running properly again.
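The split described above (lines tagged `MainThread` stop, lines tagged `Thread-14` keep coming) is what you would expect if the asyncio event loop thread is stuck while library-owned threads keep running. The sketch below is only an illustration of that general pattern, not code from Home Assistant or paho: `run_demo`, `worker`, and `heartbeat` are hypothetical names, and the `time.sleep` call merely stands in for whatever is holding up the loop.

```python
import asyncio
import threading
import time


def run_demo(block_seconds: float = 0.3) -> tuple[int, int]:
    """Return (loop_ticks, worker_ticks) counted while the loop thread is blocked."""
    ticks = {"loop": 0, "worker": 0}
    stop = threading.Event()

    def worker() -> None:
        # Plays the role of a thread owned by a library (e.g. an MQTT client).
        while not stop.is_set():
            ticks["worker"] += 1
            time.sleep(0.01)

    async def heartbeat() -> None:
        # Plays the role of anything scheduled on the event loop.
        while True:
            ticks["loop"] += 1
            await asyncio.sleep(0.01)

    async def main() -> tuple[int, int]:
        beat = asyncio.create_task(heartbeat())
        await asyncio.sleep(0.05)  # let the heartbeat run a few times first
        loop_before, worker_before = ticks["loop"], ticks["worker"]
        time.sleep(block_seconds)  # stand-in for a stall on the loop thread
        result = (ticks["loop"] - loop_before, ticks["worker"] - worker_before)
        beat.cancel()
        return result

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    try:
        return asyncio.run(main())
    finally:
        stop.set()
        thread.join()


if __name__ == "__main__":
    loop_ticks, worker_ticks = run_demo()
    print(f"loop ticks while blocked: {loop_ticks}, worker ticks: {worker_ticks}")
```

While the loop thread is stalled, the heartbeat counter does not advance at all, but the worker thread keeps counting, which mirrors the log pattern reported here.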

I've disabled the LitterRobot integration now to see if it reproduces without it.

What version of Home Assistant Core has the issue?

core-2023.10.0b3

What was the last working version of Home Assistant Core?

core-2023.09.3

What type of installation are you running?

Home Assistant Container

Integration causing the issue

Litter Robot

Link to integration documentation on our website

https://www.home-assistant.io/integrations/litterrobot/

Diagnostics information

I've temporarily disabled the Litter Robot integration to see if I can reproduce the hang without it, and I can't grab the diagnostics data once the hang has happened.

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2023-09-30 21:58:53.177 DEBUG (MainThread) [elkm1_lib.discovery] discover: ('255.255.255.255', 2362) => b'XEPID'
2023-09-30 21:58:53.246 DEBUG (MainThread) [pyhaversion] Version: 2023.10.0b3
2023-09-30 21:58:53.246 DEBUG (MainThread) [pyhaversion] Version data: {'source': 'container', 'channel': 'beta'}
2023-09-30 21:58:53.246 DEBUG (MainThread) [homeassistant.components.version] Finished fetching version data in 0.347 seconds (success: True)
2023-09-30 21:58:53.318 DEBUG (MainThread) [elkm1_lib.discovery] discover: ('192.168.1.229', 2362) <= b'M1XEP\x00@\x9d\x8c\x86\xc8\xc0\xa8\x01\xe5\n)M1XEP\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x01'
2023-09-30 21:58:53.469 DEBUG (MainThread) [pylitterbot.session] Making POST request to https://securetoken.googleapis.com/v1/token
2023-09-30 21:58:54.591 DEBUG (Thread-14 (_thread_main)) [paho.mqtt.client] Received PUBLISH (d0, q0, r0, m0), 'espresense/devices/iphone/family_room', properties=[], ...  (170 bytes)
2023-09-30 21:58:54.647 DEBUG (Thread-14 (_thread_main)) [paho.mqtt.client] Received PUBLISH (d0, q0, r0, m0), 'espresense/devices/irk:0805ec5fa2aac09522c53f6e7d6d2438/guest_bedroom', properties=[], ...  (180 bytes)
2023-09-30 21:58:54.837 DEBUG (Thread-14 (_thread_main)) [paho.mqtt.client] Received PUBLISH (d0, q0, r0, m0), 'espresense/devices/iphone/master_suite', properties=[], ...  (170 bytes)
2023-09-30 21:58:55.198 DEBUG (Thread-14 (_thread_main)) [paho.mqtt.client] Received PUBLISH (d0, q0, r0, m0), 'espresense/devices/iphone/living_room', properties=[], ...  (170 bytes)

Additional information

No response

home-assistant[bot] commented 1 year ago

Hey there @natekspencer, @tkdrob, mind taking a look at this issue as it has been labeled with an integration (litterrobot) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `litterrobot` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Renames the issue.
- `@home-assistant reopen` Reopen the issue.
- `@home-assistant unassign litterrobot` Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


litterrobot documentation litterrobot source (message by IssueLinks)

joostlek commented 1 year ago

Just checked out the pylitterbot lib, and I suspect this is because a new event emitter is run inside the _lock, which locks up the entire client session it received from HA.

https://github.com/natekspencer/pylitterbot/compare/v2023.4.5...v2023.4.8#diff-9cb69d9798ef9003776ab5c9395cd10e2dd88a265d3d63ee63153154023e24daR86
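To make the suspected hazard concrete, here is a minimal sketch of how emitting an event while holding a non-reentrant `asyncio.Lock` can hang. It is not pylitterbot's actual code: the `Session` class, its methods, and the listener are all hypothetical, and the sketch assumes a listener that calls back into the session and needs the same lock.

```python
import asyncio


class Session:
    """Hypothetical stand-in; not pylitterbot's real Session class."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()  # asyncio.Lock is not reentrant
        self._listeners = []

    def on_refresh(self, callback) -> None:
        self._listeners.append(callback)

    async def get_token(self) -> str:
        async with self._lock:
            return "token"

    async def refresh_inside_lock(self) -> None:
        # Listeners run while the lock is still held.
        async with self._lock:
            for callback in self._listeners:
                await callback()

    async def refresh_outside_lock(self) -> None:
        # The lock is released before listeners are notified.
        async with self._lock:
            pass  # the token refresh itself would happen here
        for callback in self._listeners:
            await callback()


async def demo() -> tuple[str, list[str]]:
    session = Session()
    seen: list[str] = []

    async def listener() -> None:
        # Calls back into the session, which needs the same lock.
        seen.append(await session.get_token())

    session.on_refresh(listener)

    try:
        # Without the timeout this await would never return.
        await asyncio.wait_for(session.refresh_inside_lock(), timeout=0.2)
        inside = "completed"
    except asyncio.TimeoutError:
        inside = "timed out"

    await session.refresh_outside_lock()
    return inside, seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the in-lock variant only gets unstuck because of the timeout, while the variant that emits after releasing the lock completes normally. Note that a deadlock like this stalls only the tasks waiting on the lock, so it may not by itself account for every symptom in the report.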

rccoleman commented 1 year ago

I manually downgraded pylitterbot to 2023.4.5 yesterday and haven't seen a lockup since, so I think this will resolve the issue until the other PR is ready. Thanks!