Closed Silvenga closed 4 years ago
Hey there @kane610, mind taking a look at this issue as its been labeled with a integration (unifi
) you are listed as a codeowner for? Thanks!
This may be related to #28832. It appears the stack just keeps going - does python have a max stack size?
I originally opened #28832 and things have generally been OK since @Kane610 merged #29643. However, just woke up today to a barely responsive HA and my homeassistant.log has exploded to 1.6GB in size, full of this repeating stack trace - so there is something still going on here. Thankfully this time I caught it before it filled my disk, since the disk fill scenario means I get to reconfigure/re-enable the Nabu Cloud as well as wipes out any custom repositories in my Hassio Addons config.
Environment:
arch | x86_64 |
---|---|
dev | false |
docker | true |
hassio | true |
os_name | Linux |
python_version | 3.7.5 |
version | 0.103.5 |
virtualenv | false |
2019-12-31 03:54:14 ERROR (MainThread) [homeassistant.helpers.entity] Update for device_tracker.
I also had this happen this morning. Running Home Assistant 0.103.4 in docker.
2020-01-01 09:24:42 ERROR (MainThread) [homeassistant.helpers.entity] Update for device_tracker.ipad fails
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 281, in async_update_ha_state
await self.async_device_update()
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 459, in async_device_update
await self.async_update()
File "/usr/src/homeassistant/homeassistant/components/unifi/device_tracker.py", line 161, in async_update
await self.controller.request_update()
File "/usr/src/homeassistant/homeassistant/components/unifi/controller.py", line 167, in request_update
await self.progress
File "/usr/src/homeassistant/homeassistant/components/unifi/controller.py", line 177, in async_update
await self.api.clients.update()
File "/usr/local/lib/python3.7/site-packages/aiounifi/api.py", line 20, in update
raw = await self._request('get', self._path)
File "/usr/local/lib/python3.7/site-packages/aiounifi/controller.py", line 69, in request
response = await res.json()
File "/usr/local/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 1040, in json
return loads(stripped.decode(encoding))
File "/usr/local/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 4177 (char 4176)
2020-01-01 09:24:42 ERROR (MainThread) [homeassistant.helpers.entity] Update for device_tracker.office fails
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 281, in async_update_ha_state
await self.async_device_update()
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 459, in async_device_update
await self.async_update()
File "/usr/src/homeassistant/homeassistant/components/unifi/device_tracker.py", line 254, in async_update
await self.controller.request_update()
File "/usr/src/homeassistant/homeassistant/components/unifi/controller.py", line 164, in request_update
return await self.progress
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 281, in async_update_ha_state
await self.async_device_update()
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 459, in async_device_update
await self.async_update()
File "/usr/src/homeassistant/homeassistant/components/unifi/device_tracker.py", line 161, in async_update
await self.controller.request_update()
File "/usr/src/homeassistant/homeassistant/components/unifi/controller.py", line 167, in request_update
await self.progress
File "/usr/src/homeassistant/homeassistant/components/unifi/controller.py", line 177, in async_update
await self.api.clients.update()
File "/usr/local/lib/python3.7/site-packages/aiounifi/api.py", line 20, in update
raw = await self._request('get', self._path)
File "/usr/local/lib/python3.7/site-packages/aiounifi/controller.py", line 69, in request
response = await res.json()
File "/usr/local/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 1040, in json
return loads(stripped.decode(encoding))
File "/usr/local/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 4177 (char 4176)
and then the stack trace just continues to grow with these lines repeating:
File "/usr/src/homeassistant/homeassistant/components/unifi/device_tracker.py", line 161, in async_update
await self.controller.request_update()
File "/usr/src/homeassistant/homeassistant/components/unifi/controller.py", line 164, in request_update
return await self.progress
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 281, in async_update_ha_state
await self.async_device_update()
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 459, in async_device_update
await self.async_update()
I saw the issue occur again on my HA instance, now running 0.103.5
. Restarting the instance mitigates the problem.
When the bug is in progress, it eats about 6GB of disk (for logs) every 30 minutes.
I've had to disable the integration for now since the issue occurred again, twice last night 😢
I can always enable it again if that helps in diagnosis.
Also have this issue. Exact same symptoms and logs and treatment, my log was up to 19GB last time...
Also disabling for now.
Came here looking to see what's up as yesterday I updated to 103.6 from 103.4 and then this morning, I woke up to a 25GB home-assistant.log
file with the aforementioned traces filling the entire log.
Adding to this thread. I've been having this issue for over a week now. Took me awhile to actually grab the logs. Mine operates fine throughout the day, but seems to reliably fail overnight. Restarting the docker container fixes the issue as others have noted, but only for a day.
Just spitballing here, but could this have anything to do with something nightly that the Unifi controller does? I've got my controller set to scan the WiFi networks and auto-optimize the channels selected at ~3AM. Ostensibly this feature is noted that it may result in network connectivity problems during the scan.
EDIT: For reference, this is called the 'Wi-Fi AI' feature on Unifi.
Adding to this thread. I've been having this issue for over a week now. Took me awhile to actually grab the logs. Mine operates fine throughout the day, but seems to reliably fail overnight. Restarting the docker container fixes the issue as others have noted, but only for a day.
Just spitballing here, but could this have anything to do with something nightly that the Unifi controller does? I've got my controller set to scan the WiFi networks and auto-optimize the channels selected at ~3AM. Ostensibly this feature is noted that it may result in network connectivity problems during the scan.
EDIT: For reference, this is called the 'Wi-Fi AI' feature on Unifi.
Could you disable it to see if that affects anything. Also if you have other similar features enabled if you could disable them as well. I've never experienced this bug myself and I don't have anything like that enabled, though that doesn't mean there is any correlation here
I will probably rewrite the whole update mechanism which might be at fault here. I just want to finish rewriting my automations first which have been neglected for far too long
Adding to this thread. I've been having this issue for over a week now. Took me awhile to actually grab the logs. Mine operates fine throughout the day, but seems to reliably fail overnight. Restarting the docker container fixes the issue as others have noted, but only for a day. Just spitballing here, but could this have anything to do with something nightly that the Unifi controller does? I've got my controller set to scan the WiFi networks and auto-optimize the channels selected at ~3AM. Ostensibly this feature is noted that it may result in network connectivity problems during the scan. EDIT: For reference, this is called the 'Wi-Fi AI' feature on Unifi.
Could you disable it to see if that affects anything. Also if you have other similar features enabled if you could disable them as well. I've never experienced this bug myself and I don't have anything like that enabled, though that doesn't mean there is any correlation here
I'll test that tonight. Don't think I have anything else enabled that does any auto-scanning/channel hopping.
Better anyways, because if that solves the issue I'd prefer it to... since I use the Unifi system for my device tracking... with it disabled I have to use some other presence detection system.
At least for me the scan did not trigger the issue, it happened mostly in the evenings. Overnight and mornings would be fine but not late afternoon/evening.
Seeing the same issue. I've just now disabled Wifi-AI to see if it helps, but I've been seeing the issue happen anywhere from 6 to 9 AM, where Wifi-AI was scheduled to run at 3 am.
I found some details on a websocket push api for unifi that might resolve this issue.
Just to add, I don't have WiFi AI enabled, I've experienced the issue during the day as well.
So far I haven't had this issue again after disabling Wi-Fi AI... that being said, I've also upgraded to the latest 0.104 release, and made a bunch of other automation changes. But, so far Unifi integration seems to be working without munching all my CPU.
If I was to guess, some of my other factor fixed this problem. Or I just haven't waited long enough yet. I'll check back if I see this happen again.
Just updated to HassOs 3.8 and HA 0.104.2. Tried to enable the integration again, but it immediately started filling the logs, so no luck with that.
Im reworking the UniFi integrations way of retrieving data from the controller. I can't promise if it will be ready this or next release. The changes are substantial and I need to verify behaviour.
I hope someone might be willing to try it out well in advance
Hi!
I've integrated the improved data handling in the UniFi integration to HASS dev branch. Would you mind trying it out to see if it improves your situation
Is there any documentation on how to do that ”safely” when running HassOS?
@cadwal I have no idea
I've updated my UDMP to 1.6.4, with the unified SSO system and API - so I don't believe I can test the changes anymore. 😿
I've updated my UDMP to 1.6.4, with the unified SSO system and API - so I don't believe I can test the changes anymore. 😿
It just needs some extra reverse engineering
I've updated my UDMP to 1.6.4, with the unified SSO system and API - so I don't believe I can test the changes anymore. crying_cat_face
It just needs some extra reverse engineering
Of course, I'm currently reversing the protect api. It's all cookie based now.
I hate this unified api soooo much. Send an expired cookie, "let's wait a minute before responding with a 500".
I've just tried using a new version of HASSio 105.4, and turned on Unify Integration again.. after about 10 hours it halted HA again..
Changes on data management is part of the dev track and will be released with 0.106. You guys who experience this issue should really try it out to see if we get any improvements
Looks like it works. At least 2 days without a halt.
Great! Let's hope this carries over to the major release for all affected
I noticed that sometimes the tracker shows that the device is at home, although it is actually not on the network. (It is not present on the Unify Devices tab). Is it possible to debug the plugin, find out what happened, why the device is at home?
@amaximchuk you can enable debug for UniFi integration per integration instructions to see what happens. A new issue was just created with a similar issue though reported on 0.105 https://github.com/home-assistant/home-assistant/issues/32170
What do you guys think, is this issue solved?
Don't post about unrelated issues here. Create a separate issue in that case.
I've been running for a bit over 24 hours now and have seen no sign of the original issue. Thanks for your work on this Robert.
No problems of that kind here either, but also no presence detection. Devices are always present, will have to look at more details to understand what is going on.
Right, so yes, no more crashes and logfiles filling up, but HA does no longer detect when devices goes away. A restart makes it detect that a device is away. Devices that "arrive" is detected immediately though.
The logs logs indicates that detection runs once a second?
That is a separate issue. Workaround merged and targeted for 0.106.2. https://github.com/home-assistant/home-assistant/issues/32170
Ah ok, well I'd consider this one done then.
I'll keep it open a couple of days more. But great if this is indeed solved 🎉
Home Assistant release with the issue:
0.103.0
Last working Home Assistant release (if known):
0.102.3
Operating environment (Hass.io/Docker/Windows/etc.):
Hass.io
Integration:
https://www.home-assistant.io/integrations/unifi/
Description of problem:
Occuring twice after upgrading to
0.103.0
(upgraded to0.103.4
now), HA seems to freeze (not 100% correct, HA just takes minutes to respond to API requests). The logs are spammed with the below stack trace. Most of the CPU is eaten by HA (just a poor i3). Restarting thehomeassistant
container mitigates the problem.Problem-relevant
configuration.yaml
entries and (fill out even if it seems unimportant):Traceback (if applicable):
Additional information: