hellqvio86 / home_assistant_casambi

Home assistant Integration for Casambi Cloud lights
MIT License
25 stars 9 forks source link

Units not online after some time #3

Closed allardt89 closed 3 years ago

allardt89 commented 3 years ago

Within 24 hours of starting the addon my only two lights are said to be offline with an error as follows: unit is not online: <Casambi light Mito sospeso: unit=<Unit Mito sospeso: unit_id=1 address=xxx value=0.0 state=off online=False network_id=xxx wire_id=9 websocket=<WSClient state=running wire_id=9> > unit is not online: <Casambi light Mito sospeso: unit=<Unit Mito sospeso: unit_id=2 address=xxx value=0.0 state=off online=False network_id=e7cJ8Z6bpHjR0oCzbo66vU6MJ0pl0Ryh wire_id=9 websocket=<WSClient state=running wire_id=9> > unit is not online: <Casambi light Mito sospeso: unit=<Unit Mito sospeso: unit_id=1 address= xxx value=0.25098039215686274 state=on online=False network_id=xxx wire_id=9 websocket=<WSClient state=running wire_id=9> > unit is not online: <Casambi light Mito sospeso: unit=<Unit Mito sospeso: unit_id=2 address=xxx value=0.25098039215686274 state=on online=False network_id=xxx wire_id=9 websocket=<WSClient state=running wire_id=9> >

Restarting HA resolves the issue, only for it to reoccur after some hours. I can send debug logs in pm.

hellqvio86 commented 3 years ago

Hi, Please send some more logs, what I can see is that Casambis api is reporting that your device is offline "unit is not online", that can either be, problem with the device or most likely problem with the Casambi gateway (I usally have to restart the app from time to time, running on Android), the state is send over the websocket.

But why you need to restart HA is a separate problem, that's most likely because of connection problem with the websocket and I don't see the problem in the provided logs. I have experienced this problem myself but never had logs logs for debugging it.

The addon will try to do a reconnection, either that's buggy or it needs to be more persistent (try more).

Best Regards Olof Hellqvist

allardt89 commented 3 years ago

Hi, Sorry for the late reply. After reading your reply I've tried to be patient and not restart HA and wait for the Gateway to connect again. This has been the case for the last two weeks. The units are reported to be offline for several times per day, but looking at the logs it's always just a matter of seconds. The frequency differs from day to day though, sometimes just a couple of times a day, sometimes multiple times per hour.

So, I'm starting to believe the need to restart HA was because of my Apple HomeKit integration, which seems to be working fine now. It has been running OK since 27-11. My Gateway is an old iPad mini I've also setup to be my Homekit-hub.

Sorry for making it so vague. If you still need any logs, I'm happy to provide them. But seeing that the auto reconnect now works fine, I doubt they will tell you anything.

Best regards

hellqvio86 commented 3 years ago

The homeassistant plugin does definitely need to be more robust, but I need logs so I can try to reproduce the scenario.

If it comes and goes its probably the casambi gateway way. But its also possible that the websocket terminates for what ever reason and the home assistant plugin should handle that and try to reconnect gracefully (with increasing delay) and that can be improved on the home assistant side, but I lack logs currently. I have not experience the issue myself for a while (I have restarted ha due to upgrades and so on) 🙂

robsonke commented 3 years ago

A bit offtopic but what do you guys use as a gateway? An old phone or something?

hellqvio86 commented 3 years ago

I have a Nexus 6P currently, I have also used a Samsung galaxy 8 s, but the nexus has better logging since its rooted and both devices are "unstable", crashed from time to time, maybe every month so not a biggy.

allardt89 commented 3 years ago

I'm using an old iPad mini also setup as my Homekit Hub.

I have a couple of days of logs for you, but I'm not comfortable sharing these in public. How can I share them with you only, Olof?

Interesting to see is that after some days the plugin has stopped reconnecting, so the devices did not come online again.

hellqvio86 commented 3 years ago

You can mail them to me (olof.hellqvist@gmail.com) or send me a secure link to a file service like Dropbox.

Have you enabled debugging?

​logger:   default: info   logs:     homeassistant.components.casambi: debug     custom_components.casambi: debug

Please remove your password, api key and its also good to remove user name, I don't need it.

//Olof

hellqvio86 commented 3 years ago

I have a released a new version with better reconnect support, the underlying library has been updated and its now handling reconnect when I drop the connection to door.casambi.com.

hellqvio86 commented 3 years ago

@allardt89 I see that your gateway is very unstable, its dropping connection a lot, it can either be the Internet connection or the device itself (Bluetooth stack etc.).

But I cannot see anything regarding the websocket, please upgrade to the latest version of this integration and see if the problem persists.

robsonke commented 3 years ago

I recently started using the component too, but I seem to have disconnects too. The websocket connection is often down/broken. The below stacktrace occurs when I try to turn a light on but also the current state of the entities isn't synced.

File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 135, in handle_call_service await hass.services.async_call( File "/usr/src/homeassistant/homeassistant/core.py", line 1445, in async_call task.result() File "/usr/src/homeassistant/homeassistant/core.py", line 1480, in _execute_service await handler.job.target(service_call) File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 204, in handle_service await self.hass.helpers.service.entity_service_call( File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 593, in entity_service_call future.result() # pop exception if have File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 664, in async_request_call await coro File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 630, in _handle_entity_call await result File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 230, in async_handle_light_on_service await light.async_turn_on(**params) File "/config/custom_components/casambi/light.py", line 201, in async_turn_on await self.unit.turn_unit_on() File "/usr/local/lib/python3.8/site-packages/aiocasambi/units.py", line 372, in turn_unit_on await self._web_sock.send_messge(message) File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 89, in send_messge await self.web_sock.send_str(json.dumps(message)) File "/usr/local/lib/python3.8/site-packages/aiohttp/client_ws.py", line 150, in send_str await self._writer.send(data, binary=False, compress=compress) File "/usr/local/lib/python3.8/site-packages/aiohttp/http_websocket.py", line 687, in send await self._send_frame(message, WSMsgType.TEXT, compress) File "/usr/local/lib/python3.8/site-packages/aiohttp/http_websocket.py", line 598, in _send_frame raise ConnectionResetError("Cannot write to closing transport") ConnectionResetError: Cannot write to closing transport

hellqvio86 commented 3 years ago

@robsonke Thanks for the stacktrace, something is fishy with the signaling, so I have added a safeguard and the underlying library will try to reconnect when websocket is broken and message is trying to get sent.

I have released a new version.

robsonke commented 3 years ago

That's quick, I installed it and during startup I noticed the following errors:

` Traceback (most recent call last): File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 126, in running self.session_handler_callback(SIGNAL_DATA) File "/usr/local/lib/python3.8/site-packages/aiocasambi/controller.py", line 180, in session_handler self.callback(SIGNAL_DATA, new_items) File "/config/custom_components/casambi/light.py", line 261, in signalling_callback UNITS[key].process_update(value) File "/config/custom_components/casambi/light.py", line 185, in process_update self.async_schedule_update_ha_state(True) File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 438, in async_schedule_update_ha_state assert self.hass is not None AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 144, in running self.state = STATE_DISCONNECTED File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 61, in state self.session_handler_callback(SIGNAL_CONNECTION_STATE) File "/usr/local/lib/python3.8/site-packages/aiocasambi/controller.py", line 185, in session_handler self.callback(SIGNAL_CONNECTION_STATE, self.websocket.state) File "/config/custom_components/casambi/light.py", line 270, in signalling_callback hass = item.hass AttributeError: 'str' object has no attribute 'hass' `

It's probably related to this vague one too: ` Logger: aiocasambi.websocket Source: /usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py:143 First occurred: 7:50:51 (1 occurrences) Last logged: 7:50:51

websocket: Unexpected error `

hellqvio86 commented 3 years ago

Hm, thats the messy connection error handling in Homeassistant, I have not seen the problem, could you provide the log message before the exception? line: 258 _LOGGER.debug(f"signalling_callback signal: {signal} data: {data}")

I have released a new version (v0.06) of the plugin with better error handling, but the reconnection needs to be instantiated from homeassistant, so I need some more information in order to iron this bug out.

robsonke commented 3 years ago

Sure:

2020-12-28 15:25:13 DEBUG (MainThread) [custom_components.casambi.light] signalling_callback signal: state data: starting
2020-12-28 15:25:13 DEBUG (MainThread) [custom_components.casambi.light] signalling_callback signal: state data: running
2020-12-28 15:25:13 DEBUG (MainThread) [custom_components.casambi.light] signalling_callback signal: data data: { all my units }
2020-12-28 15:25:13 ERROR (MainThread) [aiocasambi.websocket] websocket: Unexpected error
2020-12-28 15:25:13 DEBUG (MainThread) [custom_components.casambi.light] signalling_callback signal: state data: disconnected
2020-12-28 15:25:13 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 145, in running
    raise err
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 126, in running
    self.session_handler_callback(SIGNAL_DATA)
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/controller.py", line 180, in session_handler
    self.callback(SIGNAL_DATA, new_items)
  File "/config/custom_components/casambi/light.py", line 261, in signalling_callback
    UNITS[key].process_update(value)
  File "/config/custom_components/casambi/light.py", line 185, in process_update
    self.async_schedule_update_ha_state(True)
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 438, in async_schedule_update_ha_state
    assert self.hass is not None
AssertionError

I hope it's not related to anything at my end, but if it is, it would be good if we improve the error messages.

hellqvio86 commented 3 years ago

I need a better test environment for this, so I can reproduce timeouts in homeassistant :)

Its a generic try, that catches the exception, I have updated the addon with a better error message (release v0.07).

robsonke commented 3 years ago

No problem, we'll help and I'll try to setup a local environment too, but a bit busy with finishing up moving into our new house.

With release 0.07:

2020-12-28 17:20:50 ERROR (MainThread) [aiocasambi.websocket] websocket: Unexpected error: < type="<class 'AssertionError'>">
2020-12-28 17:20:50 ERROR (MainThread) [aiocasambi.websocket] An unknown exception was thrown!
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 126, in running
    self.session_handler_callback(SIGNAL_DATA)
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/controller.py", line 180, in session_handler
    self.callback(SIGNAL_DATA, new_items)
  File "/config/custom_components/casambi/light.py", line 261, in signalling_callback
    UNITS[key].process_update(value)
  File "/config/custom_components/casambi/light.py", line 185, in process_update
    self.async_schedule_update_ha_state(True)
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 438, in async_schedule_update_ha_state
    assert self.hass is not None
AssertionError
2020-12-28 17:20:50 DEBUG (MainThread) [custom_components.casambi.light] signalling_callback signal: state data: disconnected
2020-12-28 17:20:50 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 146, in running
    raise err
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 126, in running
    self.session_handler_callback(SIGNAL_DATA)
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/controller.py", line 180, in session_handler
    self.callback(SIGNAL_DATA, new_items)
  File "/config/custom_components/casambi/light.py", line 261, in signalling_callback
    UNITS[key].process_update(value)
  File "/config/custom_components/casambi/light.py", line 185, in process_update
    self.async_schedule_update_ha_state(True)
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 438, in async_schedule_update_ha_state
    assert self.hass is not None
AssertionError
hellqvio86 commented 3 years ago

The problem is that the addon is not interacting with the home assistant event loop in the right way, its my first asyncio project and it shows ;)

I have removed the "hack" that is doing the retries and its done in the library itself, but I need to setup a local development environment so I can troubleshoot this.

Latest working version is v0.11

robsonke commented 3 years ago

Updated but no luck yet. I'll try to work on it too in next days.

hellqvio86 commented 3 years ago

Is it the same exception? I removed the code that I though caused it 🙂

Working on installing virtualbox on a older - Mac, but experiencing some problem with the latest os version...

robsonke commented 3 years ago

Yes, I was a bit surprised too. But the folder seems updated:

Logger: aiocasambi.websocket
Source: helpers/entity.py:438
First occurred: 8:49:11 (1 occurrences)
Last logged: 8:49:11

An unknown exception was thrown!
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/websocket.py", line 126, in running
    self.session_handler_callback(SIGNAL_DATA)
  File "/usr/local/lib/python3.8/site-packages/aiocasambi/controller.py", line 180, in session_handler
    self.callback(SIGNAL_DATA, new_items)
  File "/config/custom_components/casambi/light.py", line 262, in signalling_callback
    UNITS[key].process_update(value)
  File "/config/custom_components/casambi/light.py", line 186, in process_update
    self.async_schedule_update_ha_state(True)
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 438, in async_schedule_update_ha_state
    assert self.hass is not None
AssertionError
hellqvio86 commented 3 years ago

I have written some ansible roles for setting up a test environment and I have worked out some of the bugs, proper reconnection is still a thing that needs to be worked out.

But all lights will now be disabled if the websocket connection breaks, v 0.13 is the latest version.

hellqvio86 commented 3 years ago

Reconnect works for me now with v0.14. Casambi has also updated their API, they have a new interesting function for getting unit/network state Request network state, so we actually don't have to wait for their end to send the state, it can be requested.

I will probably add that to the API/home assistant integration and clean up/refactoring is needed.

hellqvio86 commented 3 years ago

I have done some refactoring, fixed some bugs and done some linting, the latest version is v0.15.

But I have not run into the AssertionError yet.

robsonke commented 3 years ago

That's gone with me now too, no more errors in the logs but unfortunately no connection either (it seems). Clicking a light does nothing, I'll dig into it. Edit: I somehow think that I have an issue with my gateway. Will try to do some tests directly against the api and see if that works.

hellqvio86 commented 3 years ago

Found a bunch of more errors, working on fixing them.

robsonke commented 3 years ago

Nice. BTW, I have some switches as a unit too, maybe that causes issues? As so far I only noticed light entities.

hellqvio86 commented 3 years ago

I finally have reconnect working now! :) I had to restructure the python api a bit, the bad is Casambi have a long timeout and my strategy is to setup a new websocket and forget about the old websocket on every network error, so I get spammed a lot with state changes :)

I don't own a Casambi switch, I only have dimmers that my house developer installed, I use zigbee, zwave, tasmota and local tuya for my switches. But thats why they doesn't show, the integration only support LightEntity currently.

hellqvio86 commented 3 years ago

The latest "stable" release is 0.16

hellqvio86 commented 3 years ago

But I guess switches are treated by the Casambi api as "units" and I threat all "units" as "lights" currently in the integration ;)

I don't really know how to modulate "scenes" in homeassistant, I have started to work on the support on the Python api side for them, but its not finished since I do all my automations in homeassistant and Node-Red

hellqvio86 commented 3 years ago

And yes that will cause problems, with switches, it needs to be debugged and added properly, they should be Switch Entity in Home assistant: https://developers.home-assistant.io/docs/core/entity/switch/ .

robsonke commented 3 years ago

Nah not completely sure, since it's mostly pulse switches, I read something on binary sensors too. Since switches have state in hass but we can test that.

I guess switches are treated by the Casambi api as "units" and I threat all "units" as "lights" currently in the integration ;)

Yes I read that in the code. I'll look into this later, I dont think it will cause the lights not to function.

hellqvio86 commented 3 years ago

oem, model and type is accessable by the API, I think its easliest to let the aiocasambi api separate units into switches and lights.

hellqvio86 commented 3 years ago

Fixed in version 0.16.