JoDehli / PyLoxone

Python Loxone binding
Apache License 2.0

Error doing job: Task exception was never retrieved #256

Open tegner23 opened 1 month ago

tegner23 commented 1 month ago

Describe the bug

Losing connection after 2-3 hours running the integration. Also refresh of the integration doesn't fix the issue and a full reboot is necessary to get the entities running again.

Firmware of your Miniserver

14.5.12.7

HomeAssistant install method

Pi5, Hassio

Version of HomeAssistant

2024.3.3

Version of Pyloxone

0.6.3

Update pyloxone

yes

Log

Logger: homeassistant
Source: custom_components/loxone/api.py:325
Integration: PyLoxone (Documentation, Issues)
First occurred: 20:44:32 (2 occurrences)
Last logged: 21:24:39

Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/websockets/legacy/protocol.py", line 1302, in close_connection
    await self.transfer_data_task
  File "/usr/local/lib/python3.12/site-packages/websockets/legacy/protocol.py", line 959, in transfer_data
    message = await self.read_message()
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/websockets/legacy/protocol.py", line 1029, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/websockets/legacy/protocol.py", line 1104, in read_data_frame
    frame = await self.read_frame(max_size)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/websockets/legacy/protocol.py", line 1161, in read_frame
    frame = await Frame.read(
            ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/websockets/legacy/framing.py", line 68, in read
    data = await reader(2)
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/asyncio/streams.py", line 752, in readexactly
    await self._wait_for_data('readexactly')
  File "/usr/local/lib/python3.12/asyncio/streams.py", line 545, in _wait_for_data
    await self._waiter
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/config/custom_components/loxone/api.py", line 325, in keep_alive
    await self._ws.send("keepalive")
  File "/usr/local/lib/python3.12/site-packages/websockets/legacy/protocol.py", line 635, in send
    await self.ensure_open()
  File "/usr/local/lib/python3.12/site-packages/websockets/legacy/protocol.py", line 935, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: sent 1011 (unexpected error) keepalive ping timeout; no close frame received

JoDehli commented 1 month ago

@tegner23 a refresh does not work once the connection is lost.

I do not know why this error occurs. Is it a gen2?

JoDehli commented 1 month ago

@tegner23 I really think it is a Home Assistant problem. https://github.com/home-assistant/core/issues/114196

There are more issues with other websocket integrations. Nothing has changed in Loxone or PyLoxone.

williamsjou commented 1 month ago

I had the same issue, and restarting Home Assistant made it work again. I don't know how long it will keep working! My Loxone is generation 1.

tegner23 commented 1 month ago

I use a Gen2 Server.

Is there a possibility to use the keepalive ping in an automation that triggers every hour, or would that have no impact? The workaround I use at the moment is a total reboot of the host system every 6 hours, but this doesn't help either, as it would need to run every 2 to 3 hours...

JoDehli commented 1 month ago

The problem is not the keepalive ping; the problem is that you lose the connection. You should try to make the connection stable. Do you have the Miniserver on a different network? Once the connection is lost, you have to restart Home Assistant. That is how it is at the moment.

MephistoJB commented 1 month ago

Is there any news on this topic? I have exactly the same problem. The core issue linked above states that it should be fixed in 2024.4. I have 2024.4.3 and the error is still there. Is there maybe a chance of getting a switch to reconnect manually without rebooting? This could be used in an automation.

Some details about my system.

Gen 1, same network, Pi 5, Hassio

JoDehli commented 1 month ago

@MephistoJB this error occurs only if your connection is unstable. I think it will not be possible to improve that with the current implementation.

MephistoJB commented 1 month ago

Hi @JoDehli, thx for the reply. I have difficulty understanding what "unstable" means. HA and Loxone are in the same network and both are wired. Do you have an idea what could cause such instability? Also, this problem only started a few weeks ago. I am lost here 😅

JoDehli commented 1 month ago

@MephistoJB I mean the websocket connection is interrupted for a short time. When that happens, the complete connection routine should normally start again. That is not how it is implemented at the moment. I think I will not implement it, because I do not have the problem and, most importantly, I do not have the time. Sorry.

I do not know exactly why these problems have become more frequent since the last Home Assistant update. Maybe because they changed the version of the websocket library.
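The idea of starting the complete connection routine again after an interruption can be sketched as a retry loop with exponential backoff. This is a minimal illustration, not PyLoxone's implementation: `run_with_reconnect`, `connect_and_listen` and the delay parameters are hypothetical names, and the loop assumes that the connection routine raises an exception when the link drops.

```python
import asyncio
import random


async def run_with_reconnect(connect_and_listen, *, base_delay=1.0,
                             max_delay=60.0, max_attempts=None,
                             sleep=asyncio.sleep):
    """Run `connect_and_listen()` again each time it fails.

    `connect_and_listen` is a hypothetical coroutine function that performs
    the full connection routine (open the socket, authenticate, subscribe)
    and then blocks until the connection drops, raising when it does.
    Returns the number of failed attempts once it exits without an error.
    """
    failures = 0
    delay = base_delay
    while True:
        try:
            await connect_and_listen()
            return failures  # the routine ended cleanly, stop retrying
        except asyncio.CancelledError:
            raise  # never swallow task cancellation
        except Exception:  # a real integration would catch narrower errors
            failures += 1
            if max_attempts is not None and failures >= max_attempts:
                raise
            # Exponential backoff with up to 10 % jitter.
            await sleep(delay + random.uniform(0, delay / 10))
            delay = min(delay * 2, max_delay)
```

Passing `sleep` as a parameter is only a convenience for testing without real waiting. The sketch never resets `delay` after a long, healthy session, which a production version would probably want to do.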

Elijen commented 1 month ago

I've been having this issue for months (maybe more than a year). It occurs, without exception, whenever a power (and maybe network) outage causes the Miniserver to restart. It works fine after an HA restart.

JoDehli commented 1 month ago

Yes, this is a problem, and at the moment there is no solution for it. The websocket connection does not report a lost connection, so there is no way to reconnect.
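When a socket stays silent instead of reporting the loss, one common way to notice it is to treat prolonged silence as a failure. The sketch below assumes the peer sends something at least once per `timeout` seconds (for example a reply to each keepalive). `ConnectionStale` and `receive_or_fail` are hypothetical names, not part of PyLoxone or the websockets library.

```python
import asyncio


class ConnectionStale(Exception):
    """Raised when no message arrived within the allowed silence window."""


async def receive_or_fail(recv, timeout):
    """Await `recv()` but treat prolonged silence as a lost connection.

    `recv` is a hypothetical zero-argument coroutine function, such as the
    `recv` method of a websocket object.
    """
    try:
        return await asyncio.wait_for(recv(), timeout)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the pending recv() at this point.
        raise ConnectionStale(f"no data for {timeout} s") from None
```

A caller could catch `ConnectionStale` in its read loop and start the connection routine again.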

Elijen commented 1 month ago

@JoDehli Shouldn't I be able to catch the Exception that is logged by HA?
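One way this could look, as a minimal sketch rather than PyLoxone's actual code: the traceback above shows the error being raised by `send("keepalive")` inside `keep_alive`, so the exception can be caught at that call. To keep the example self-contained, `ConnectionClosed` below is a local stand-in. With the websockets library the class to catch would be `websockets.exceptions.ConnectionClosed`, the parent of the `ConnectionClosedError` seen in the log. The `on_lost` callback and the `interval` parameter are hypothetical.

```python
import asyncio


class ConnectionClosed(Exception):
    """Local stand-in for websockets.exceptions.ConnectionClosed."""


async def keep_alive(ws, interval, on_lost):
    """Send "keepalive" every `interval` seconds until the send fails.

    `ws` needs an async `send(str)` method. `on_lost` is a hypothetical
    async callback that an integration could use to trigger a reconnect.
    """
    while True:
        await asyncio.sleep(interval)
        try:
            await ws.send("keepalive")
        except ConnectionClosed as exc:
            # Handling the error here keeps it from surfacing as
            # "Task exception was never retrieved".
            await on_lost(exc)
            return
```

The callback is where a reconnect routine would be started. The sketch itself only reports the loss and ends the keepalive loop.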

MephistoJB commented 1 month ago

For me it happens independently of power outages. It just happens every few hours.

But I found a possible workaround. I created a dummy switch in Loxone and an automation in HA that toggles the dummy switch every minute. This has been stable for 48 h so far. I will now try to reduce how often the automation triggers and observe what happens.

tegner23 commented 1 month ago

For me, the workaround of a total reboot of the host system every 4 hours works fine. In my case I do not need any further steps within Loxone Config. If the interval is longer, there is a risk that the failure happens again.

@MephistoJB, please let us know as soon as you have new information.

JoDehli commented 4 weeks ago

@tegner23 @MephistoJB @williamsjou

I tried to catch the error in the new 0.6.5 pre-release. For me it is difficult, because on my small installation I really have no connection problems. I tried to fix it so that you do not need special hacks.

In general, I know that the current implementation is not perfect. I started this project for myself and, as I said, I have a very small installation. I tried to implement as much as I can for other users, but my time is limited, especially because it is even more time-consuming when you do not have the devices in your own installation.

I am currently working on a complete rewrite of this project. I still have some work to do, but when it is finished it should be more robust (hopefully). Until then, I hope this release fixes the strange problems that appeared one HA release ago.

MephistoJB commented 4 weeks ago

@JoDehli your work is much appreciated. Thx for that. I can imagine that it is a tough job.

Thx also for the new release. I will try it right now and report.

That said, how can I install the new release? It doesn't show up in HACS.

CodeMartn commented 4 weeks ago

"I do not know exactly why these problems have become more frequent since the last Home Assistant update. Maybe because they changed the version of the websocket library."

I have had this issue ever since I started using PyLoxone (~1.5 years ago), on a Pi4 with a Gen 1 Miniserver, always on the latest versions. My Pi4 is on WLAN, while the Miniserver is on LAN. I will try LAN on the Pi4 as well and report whether this makes the link more stable.

tegner23 commented 3 weeks ago

@JoDehli It seems like you did a pretty good job! :) Since updating to 0.6.4 (20.04.2024) I have not needed any reboots to keep the integration alive. I have now created a few automations that run every hour to check whether the failure shows up again.

Thank you very much for your effort. Let me know how you would like to continue with that bug.

JoDehli commented 3 weeks ago

@tegner23 thanks for your feedback. It would be great if you could provide error logs whenever an error is raised and causes a reconnect. I am also interested in how often this happens. I will leave the issue open for a little while. Maybe the other users can also provide some feedback.

I am trying to finish the new implementation. There I also have to implement a stable reconnect mechanism for such problems. When it is finished, maybe you can test it.

MephistoJB commented 3 weeks ago

Sorry for the late answer. To be honest, I forgot to check HA over the last days/weeks, since there have been zero problems. This is a very good sign, isn't it?

Thanks a lot. Good job.

j-nordt commented 1 week ago

I can report that the error does not occur in my setup on 0.6.6 but does occur frequently when I install 0.6.7. Running on a Mini Server 2 with SW version 14.5.12.7

I want to thank you for all the work in PyLoxone, it is a great integration.

JoDehli commented 1 week ago

@j-nordt the connection code is the same in both versions. Are you sure it is not another error?