Closed talormanda closed 3 weeks ago
Ouch, that sucks! I can't see anything in the diagnostics to explain why the whole system would bog down. It does look like you're not getting regular updates from the proxies (eg the Garage - main - left is showing 7s or worse intervals for the pawscout tag you have configured) - but this doesn't explain the UI etc being slow.
The times I've seen the UI go slow have usually been when lots of extra sensors are enabled and changing often (eg all the unfiltered distance to... sensors that are disabled by default). Do you have many of those turned on?
What sort of hardware are you running on? It should run fine on most things, but a Pi will get bogged down with many sensors - but it looks like you are (probably?) running in a VM on a pc/server/nuc, is that right?
If you are able to get some debug logging, that might help me track it down further - providing your UI is stable enough for you to grab it. In Bermuda, enable debug logging, wait for 30 seconds or so, then disable it - the browser should then present you with the log file to save. It will have personal info in it like MAC and possibly IP addresses etc, so you might prefer to email it to me at ash@ajg.net.au or upload it to my nextcloud https://cloud.ajg.net.au/index.php/s/JpeXDnZQGeXqqHB
I'm mostly interested in seeing how long the update cycles are taking, and if Bermuda is doing anything weird on each update, like re-creating sensors etc - but the whole log file would be useful if you are OK with sharing it.
I am about to head to bed though, so it will be a bit before I can take a look at your logs. I'll also try replicating the steps of setting up, renaming, perhaps reinstalling and adding/renaming again after that as well, in case there's something going on in there. This might be a bit tricky to track down though.
I basically went to rename the only device I added, the pawscout ibeacon, and upon hitting save, my entire system went offline. I could no longer refresh the page or visit the URL to home assistant again. I could however, log into proxmox and reboot the VM from the terminal.
😮 I'll try that out on my dev box and see if I can replicate the issue.
I can attempt to mess with it to try and get it to break again and record it. I have had HA for almost 2 years, and it never did this until I started using Bermuda. So it has to be related.
Yeah, it could be a race condition somewhere. There have been a couple that were exposed with the async changes made in July, but all the ones I could find have been addressed. But they can be super hard to find! Especially since I learned as I went with this project, and while the dev docs are ok they don't do much to teach best practice, so sometimes you just don't know what it is that will come back to bite you later!
Before you hunker down behind the blast shield, something that might be helpful is to have logging running in an ssh session, so that you can still get it out if the rest of HA locks up. I assume you probably already know how to do that. I usually enable debug logging via the integration first, then use
docker logs -f --tail 80 homeassistant
I am not 100% versed in everything, but the command is familiar. I will make sure to do that, but it may get tricky as the IP just stops responding when it begins to act up. Is it possible to have a log running from the main terminal of HA prior to starting?
As long as HAOS is running, you can run that docker command to get the logs - the difficulty is that if/when homeassistant restarts the docker container exits, so the logs stop until you restart the command. (I am assuming that you are running the HAOS image since you mentioned a proxmox VM).
As long as your ssh session is running from another machine, you'll get to see the logs up to the instant that the HA vm stops responding, since it will be logging to your ssh session in real time.
By the way, have you ever installed this integration? https://github.com/kvj/hass_Bluetooth_Proxy
It definitely looks like it might be the root cause of a number of performance issues in Bermuda, as it seems to leave behind 10s of thousands of stale bt advertisement records even after being removed. On some systems it causes Bermuda to lock up the whole machine on start-up, but below a certain limit it would cause slow-downs, or lock-ups at various points.
If so, the quick fix is to delete or rename the file at
./config/.storage/bluetooth.remote_scanners
You have to do it while HA is stopped though, otherwise HA will just re-create it when it exits. That file should only be a few KB in size; if it's a MB or more, it's a likely culprit.
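For anyone who would rather script the size check than eyeball it, here is a minimal sketch in Python. The 1 MB threshold is an assumption taken from the rough "a few KB is normal, a MB or more is suspicious" guidance above, and `check_scanner_storage` is a made-up helper name, not part of Bermuda or Home Assistant. Adjust the path to wherever your config directory actually lives.

```python
from pathlib import Path

# Assumed threshold: the thread only says "a few KB" is normal
# and "a MB or more" is a likely culprit.
SUSPICIOUS_BYTES = 1_000_000


def check_scanner_storage(path, threshold=SUSPICIOUS_BYTES):
    """Return (size_in_bytes, is_suspicious), or (None, False) if the file is absent."""
    p = Path(path)
    if not p.is_file():
        return None, False
    size = p.stat().st_size
    return size, size >= threshold


if __name__ == "__main__":
    # Path as quoted in the thread; it is relative, so run this from the
    # directory that contains "config", or substitute an absolute path.
    target = "./config/.storage/bluetooth.remote_scanners"
    size, suspicious = check_scanner_storage(target)
    if size is None:
        print(f"{target}: not found")
    else:
        verdict = "likely culprit" if suspicious else "looks normal"
        print(f"{target}: {size} bytes ({verdict})")
```

This only reads the file's size, so it is safe to run while HA is up. The delete or rename itself still needs HA stopped, as noted above.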
Nope, I do not have that installed. I just use ESPHome for my proxies. I haven't gotten time to sit down and test all of this yet, but I did not forget. I want to clone my VM so I can really mess around more without having to worry.
No worries, all good. Just wanted to raise it just in case.
I also experienced this issue: after adding a new device and trying to rename it, the whole of HA stopped working. After a restart everything was back to normal and renaming worked again. This has repeated for the last 4 devices.
Wow, well that sucks! When you rename it, which bit are you renaming? The ones I can think of are:
Renaming the whole "device":
Renaming an entity (like the distance or area sensor):
I'm guessing it's the "device" but wanted to check. There is a messy clump of code around monitoring for changes to the device registry, so it's possible that the rename is somehow triggering a race condition that in some instances can lead to an async-powered infinite loop. I'll focus my testing there for now, but let me know if I'm on the right track with it being a device rename.
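To make the suspected failure mode concrete, here is a toy sketch of how a registry-change listener could feed itself. It is an illustration only: `ToyRegistry`, `naive_listener` and `guarded_listener` are hypothetical names, none of this is Bermuda's or Home Assistant's actual code, and the real cause may well be something different. The idea is that a listener which writes back to the registry on every update event fires a fresh event each time, while one that only writes when the value actually changes settles down.

```python
import asyncio


class ToyRegistry:
    """Toy stand-in for a device registry (hypothetical, not the HA API)."""

    def __init__(self, max_events=50):
        self.names = {}
        self.listeners = []
        self.tasks = []
        self.events_fired = 0
        self.max_events = max_events  # safety cap so the demo always terminates

    def update(self, device_id, name):
        """Store the name and schedule every listener, as an update event would."""
        self.names[device_id] = name
        if self.events_fired >= self.max_events:
            return  # a real system has no such cap, so it would keep spinning
        self.events_fired += 1
        for listener in self.listeners:
            self.tasks.append(asyncio.create_task(listener(self, device_id)))


async def naive_listener(reg, device_id):
    # Always writes back, even when nothing changes, so each write fires a new event.
    reg.update(device_id, reg.names[device_id].strip())


async def guarded_listener(reg, device_id):
    # Only writes when the cleaned value differs, which breaks the feedback loop.
    cleaned = reg.names[device_id].strip()
    if cleaned != reg.names[device_id]:
        reg.update(device_id, cleaned)


async def simulate_rename(listener, new_name):
    """Rename one device and return how many update events were fired in total."""
    reg = ToyRegistry()
    reg.listeners.append(listener)
    reg.update("tag-1", new_name)
    while reg.tasks:  # drain listener tasks, including ones they schedule
        pending, reg.tasks = reg.tasks, []
        await asyncio.gather(*pending)
    return reg.events_fired


if __name__ == "__main__":
    print("naive:", asyncio.run(simulate_rename(naive_listener, "Pawscout")))
    print("guarded:", asyncio.run(simulate_rename(guarded_listener, "Pawscout")))
```

In this toy, the naive listener runs until it hits the 50-event safety cap, while the guarded one stops after the single event caused by the rename. Without the cap, the naive version would keep the event loop busy indefinitely, which would look a lot like the whole system going unresponsive after hitting save.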
Still didn't get to test due to Halloween, but I can comment to say I only changed it from here so far.
I am attempting to get it to crash now. I find I can log the output to my local machine over ssh using this:
ssh root@192.168.0.7 'tail -f /config/home-assistant.log' | tee local_log_file.log
Will report back when I get somewhere.
There were a number of fixes in the now released v0.7.0 that are specifically around the things that Bermuda did when it noticed a device registry change. Hopefully this has resolved the issue you experienced.
I'm going to close this as I suspect the problem is addressed, but please feel free to re-open if you experience it again with v0.7.0 or later releases. Fresh diagnostics and logs would probably be warranted in that case.
I'll monitor and report back if things change.
Configuration
Describe the bug
I deleted my devices and started over, added 1 device, then attempted to rename it. My whole machine came to a crawl. Nothing was connected anymore and I couldn't load Home Assistant. I was able to go to my VM and manually reboot it from there. Has happened 2-3 times now.
Diagnostics
config_entry-bermuda-01JBA2JT53JA9P8XZ7G4SRX6X2 (1).json