fsaris / home-assistant-awox

AwoX mesh light integration for Home Assistant
MIT License

No response error. #24

Closed sfb247 closed 2 years ago

sfb247 commented 3 years ago

My devices keep going offline after a command, then come back online a few minutes later.

No response received after command! - start: 2021-02-15 19:37:37.004011, now: 2021-02-15 19:37:37.083717, last response: 2021-02-15 19:37:36.961094

sfb247 commented 3 years ago

Logger: custom_components.awox.awox_mesh
Source: custom_components/awox/awox_mesh.py:244
Integration: AwoX Mesh lights (documentation, issues)
First occurred: 10:51:57 AM (14 occurrences)
Last logged: 2:24:52 PM

Timeout executing command, probably Bluetooth connection is lost/frozen, re-connecting

sfb247 commented 3 years ago

Logger: custom_components.awox.awox_mesh
Source: custom_components/awox/awox_mesh.py:253
Integration: AwoX Mesh lights (documentation, issues)
First occurred: 10:43:42 AM (44 occurrences)
Last logged: 2:25:43 PM

No response received after command! - start: 2021-02-16 14:24:20.495533, now: 2021-02-16 14:24:20.699103, last response: 2021-02-16 14:24:19.627525
No response received after command! - start: 2021-02-16 14:25:42.927083, now: 2021-02-16 14:25:43.005477, last response: 2021-02-16 14:25:42.922918
No response received after command! - start: 2021-02-16 14:25:43.011121, now: 2021-02-16 14:25:43.086836, last response: 2021-02-16 14:25:42.922918
No response received after command! - start: 2021-02-16 14:25:43.090015, now: 2021-02-16 14:25:43.418508, last response: 2021-02-16 14:25:42.922918
No response received after command! - start: 2021-02-16 14:25:43.504853, now: 2021-02-16 14:25:43.581813, last response: 2021-02-16 14:25:43.501907

fsaris commented 3 years ago

I need to drop these log messages, as most of the time this is a false negative.

I also see my Bluetooth connection dropping every x minutes (sometimes hours). I can't fully work out what's happening, but it looks like the connection freezes.

For now, the only solution seems to be automatically retrying the connection. That's probably also what you see in the history of your devices, because during this time the device is set to unavailable. I did that to give a little feedback in the UI. But we could drop that and only set it to unavailable when we couldn't reconnect after 60 seconds or so.
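The idea of only reporting a device as unavailable after a failed reconnect window could look roughly like this. It is a minimal sketch and not the integration's code: the class name `AvailabilityTracker`, its methods, and the injectable `clock` parameter are all hypothetical, and the 60-second grace period is simply the figure mentioned above.

```python
import time
from typing import Callable, Optional


class AvailabilityTracker:
    """Hypothetical helper: report 'unavailable' only after a grace period."""

    def __init__(self, grace_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._grace = grace_seconds
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._disconnected_since: Optional[float] = None

    def on_disconnect(self) -> None:
        # Remember only the first disconnect; repeated failures keep the same start.
        if self._disconnected_since is None:
            self._disconnected_since = self._clock()

    def on_reconnect(self) -> None:
        self._disconnected_since = None

    @property
    def available(self) -> bool:
        if self._disconnected_since is None:
            return True
        # Still "available" while we are inside the reconnect grace period.
        return self._clock() - self._disconnected_since < self._grace
```

With this approach a short Bluetooth freeze that recovers within the grace period would never show up in the device history, at the cost of the UI reporting a real outage up to 60 seconds late.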

fsaris commented 3 years ago

Do you have issues controlling your lights or is this only a remark after checking the logs?

JoeyGnarf commented 3 years ago

I get the same messages in the logs:

When the entity awox_mesh.kubj1zal is connected, commands are accepted by all 12 devices in the mesh (most of the time; the mesh has 1 or 2 lamps that seem to be dropped from time to time). The entity's connection status changes a lot. Sometimes it's stable for a few hours, sometimes for mere seconds.

sfb247 commented 3 years ago

Do you have issues controlling your lights or is this only a remark after checking the logs?

Controlling the lights is an issue when this happens: I don't have control of the lights. The workaround I found is to restart the server every time. This gets back control of the lights, but only until it's offline again. Any reasons why the connection is having issues? The light is less than 1m away from the Pi, where HA is installed.

fsaris commented 3 years ago

Any reasons why the connection is having issues? The light is less than 1m away from the Pi, where HA is installed.

It seems to be a Bluetooth device issue. The underlying program/service that's used to control your HA/Pi Bluetooth device is not that stable/reliable. I see something similar with my NUC. But for me, fortunately, the auto connect I built in is enough to set up a new connection. And after a while the old frozen connection is cleaned up or something by the OS.

Maybe the Pi cannot handle this as well, possibly because of a hardware limitation?

Bottom line, Bluetooth isn't really stable. Sometimes I see a lost connection + reconnect every 5 minutes. Sometimes it runs for a few hours without any issues.

We could try to check less often whether the lights are still available (a longer interval between checks). Maybe the Bluetooth device can handle that better. But that will result in slower feedback that a light is offline.

If you would like to test this for yourself, just adjust the interval here and restart the integration: https://github.com/fsaris/home-assistant-awox/blob/f9e7d258f2103fcbd2b4bb60521bbde7c666b520/custom_components/awox/awox_mesh.py#L36

fsaris commented 3 years ago

Could you also check if you see the bluepy_helper process on 100%? https://github.com/IanHarvey/bluepy/issues/332

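To check this:

  • Log in with ssh on your HA machine
  • run top and check if you see the bluepy_helper process
  • or ps -aux | grep "bluepy_helper"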
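One way to script this check, as a rough sketch: the function names `find_busy_helpers` and `current_ps_output` and the 90% threshold are made up for illustration, and the `ps -eo pid,pcpu,comm` call assumes a procps-style `ps` (a minimal BusyBox `ps` may not accept these columns). Both spellings of the helper name used in this thread are matched.

```python
import subprocess
from typing import List, Tuple

# Both spellings appear in this thread; match either one.
HELPER_NAMES = ("bluepy-helper", "bluepy_helper")


def find_busy_helpers(ps_output: str,
                      cpu_threshold: float = 90.0) -> List[Tuple[int, float]]:
    """Return (pid, cpu%) for helper processes at or above the threshold.

    Expects text in the shape produced by `ps -eo pid,pcpu,comm`.
    """
    busy = []
    for line in ps_output.splitlines()[1:]:  # first line is the column header
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid_text, cpu_text, command = parts
        if not any(name in command for name in HELPER_NAMES):
            continue
        try:
            pid, usage = int(pid_text), float(cpu_text)
        except ValueError:
            continue  # skip lines that do not parse as numbers
        if usage >= cpu_threshold:
            busy.append((pid, usage))
    return busy


def current_ps_output() -> str:
    """Run ps on the local machine (assumes a procps-style ps)."""
    result = subprocess.run(["ps", "-eo", "pid,pcpu,comm"],
                            capture_output=True, text=True, check=True)
    return result.stdout

# Usage on the HA host: print(find_busy_helpers(current_ps_output()))
```

Note that `ps` reports CPU usage averaged over the lifetime of the process, so `top` remains the better tool for spotting a spike that only started recently.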

sfb247 commented 3 years ago

Could you also check if you see the bluepy_helper process on 100%? IanHarvey/bluepy#332

To check this:

  • Log in with ssh on your HA machine
  • run top and check if you see the bluepy_helper process
  • or ps -aux | grep "bluepy_helper"

These lights are garbage. I would not have them if they didn't come with the flat lol

I don't see any of these processes running on the HA machine. Could that be the reason why it's disconnecting?

JoeyGnarf commented 3 years ago

Could you also check if you see the bluepy_helper process on 100%? IanHarvey/bluepy#332

To check this:

  • Log in with ssh on your HA machine
  • run top and check if you see the bluepy_helper process
  • or ps -aux | grep "bluepy_helper"

I do see the bluepy-helper process hoarding CPU time after HA has been up for a while (using top after logging in locally, not inside the HA docker). However, this doesn't really seem to cause any problems for the integration: the mesh can still connect/reconnect, and when connected the lights do respond to commands from HA.

I'm currently waiting on a new USB Bluetooth dongle to be delivered, hopefully to eliminate the built-in Bluetooth on the RPi4 as the cause of the many disconnects.

gstefanov commented 3 years ago

Hi, I'll drop an update. I do have a rasp4, and yes, I found out that my rasp4 was running at 100% CPU the other day, and I had 3 bluepy_helper processes started. I stopped HA docker instance and everything came back to normal.

JoeyGnarf commented 3 years ago

A couple of days ago I finally received the Bluetooth dongle. The mesh still disconnects often both when sending commands and when idle, but the amount of time it takes to reconnect to the mesh has improved tremendously. Now it most often only takes a few seconds to reconnect, making it reasonably workable for most automations. I do feel I need to keep a few original remotes handy for the time being, until stability further improves.

I have no clue why the mesh disconnects when idling, but when sending commands I did get the distinct impression that the mesh doesn't disconnect when addressing 1 light only. When I address many lights, however, e.g. setting all my 14 lights in a scene, it does seem to lead to some disconnects (but not as often as with the integrated BT on the RPi4). Any advice on this?

vdebrist commented 3 years ago

Could you also check if you see the bluepy_helper process on 100%? IanHarvey/bluepy#332

To check this:

  • Log in with ssh on your HA machine
  • run top and check if you see the bluepy_helper process
  • or ps -aux | grep "bluepy_helper"

Hi, after a clean installation bluepy_helper loads the CPU to 100%. It works, but it is unstable and always reconnecting. I tried to update bluepy/btle.py and bluepy/bluepy-helper.c with all recent updates from the master build. After updating, the problem with bluepy-helper and CPU utilisation was resolved, but awox_mesh is disconnected and cannot connect to any of the bulbs.

P.S. On the host machine Bluetooth works, scans and connects to any of the devices.

2021-05-23 17:15:26 ERROR (MainThread) [custom_components.awox.scanner] Failed: Command 'PATH=/usr/sbin:$PATH; rfkill unblock bluetooth' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/config/custom_components/awox/scanner.py", line 38, in async_find_devices
    bl = await hass.async_add_executor_job(init)
  File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/config/custom_components/awox/scanner.py", line 34, in init
    return Bluetoothctl()
  File "/config/custom_components/awox/bluetoothctl.py", line 19, in __init__
    subprocess.check_output("PATH=/usr/sbin:$PATH; rfkill unblock bluetooth", shell=True)
  File "/usr/local/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/local/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'PATH=/usr/sbin:$PATH; rfkill unblock bluetooth' returned non-zero exit status 1.
2021-05-23 17:20:57 ERROR (MainThread) [custom_components.awox.awox_mesh] Error fetching awox data: No device connected

fsaris commented 2 years ago

@JoeyGnarf the current main version again has a small improvement regarding reconnecting and retrying queued commands (it is currently in main; I will release a new version in the next few days).

The issue remains that Bluetooth is a little bit unstable. It looks like this is an issue with the mesh itself. I also have this with the official Android app when trying to control multiple lights at once. This is also what you encounter when you control the lights all at once in a scene. But as mentioned, in the next version it should work better.
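To illustrate what retrying queued commands can mean in general, here is a small sketch. It is not the integration's implementation: `drain_queue`, the `send` callback, the `(light id, action)` command shape, and the pause and attempt values are all assumptions made for the example.

```python
import time
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple

Command = Tuple[str, str]  # hypothetical shape: (light id, action)


def drain_queue(commands: Iterable[Command],
                send: Callable[[Command], bool],
                max_attempts: int = 3,
                pause_seconds: float = 0.2,
                sleep: Callable[[float], None] = time.sleep) -> List[Command]:
    """Send commands one at a time and re-queue the ones that fail.

    `send` returns True on success. Returns the commands that still
    failed after `max_attempts` tries.
    """
    queue: Deque[Tuple[Command, int]] = deque((cmd, 1) for cmd in commands)
    failed: List[Command] = []
    while queue:
        command, attempt = queue.popleft()
        ok = send(command)
        sleep(pause_seconds)  # pace commands instead of sending them all at once
        if ok:
            continue
        if attempt < max_attempts:
            queue.append((command, attempt + 1))  # retry after the other commands
        else:
            failed.append(command)
    return failed
```

For a scene with many lights, the pause spreads the commands out over time and a dropped command gets another try after the rest of the queue has been sent, rather than being lost.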

@vdebrist 'rfkill unblock bluetooth' looks to fail. That's a permission issue or a missing executable.
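A quick way to tell those two cases apart could be a check like the following. This is only a sketch: `diagnose_rfkill` is a made-up helper, and it searches `/usr/sbin` first because that is what the command in the traceback does. As a general observation, a POSIX shell normally exits with status 127 when a command is not found, so the exit status 1 in the log may point more toward rfkill running and then failing, but that is a guess from the log alone.

```python
import os
import shutil
import subprocess
from typing import Optional


def diagnose_rfkill(search_path: Optional[str] = None) -> str:
    """Report whether rfkill is missing or present but failing.

    Note: when rfkill is found, this really runs `rfkill unblock bluetooth`,
    the same command the integration runs in the traceback above.
    """
    if search_path is None:
        # Mirror the integration's PATH=/usr/sbin:$PATH prefix.
        search_path = "/usr/sbin" + os.pathsep + os.environ.get("PATH", os.defpath)

    executable = shutil.which("rfkill", path=search_path)
    if executable is None:
        return "missing: rfkill was not found on the search path"

    result = subprocess.run([executable, "unblock", "bluetooth"],
                            capture_output=True, text=True)
    if result.returncode == 0:
        return "ok"
    # A non-zero exit with rfkill present suggests a permission or device problem.
    return f"failed: exit status {result.returncode}: {result.stderr.strip()}"
```

Running this inside the same container or environment as Home Assistant matters, since that is where the integration looks for rfkill.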