Closed Roger-Elmer closed 1 year ago
Hey there @jc2k, @bdraco, mind taking a look at this issue as it has been labeled with an integration (homekit_controller
) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
homekit_controller documentation homekit_controller source (message by IssueLinks)
@roysjosh for the thread know how.
HAP over thread uses UDP for comms between the homekit client and the border router, and that means packets can get out of order, duplicated or even dropped completely. But the encryption has a counter (implying it should be in order). The decryption tries to guess when this happens, but it cant always recover. So this is somewhat expected, but we might want to implement retrying if we can do so safely.
Thanks @Jc2k, this would be greatly appreciated so that I don't have to write automations that check to make sure the lights are actually off and keep trying until successful.
It sounds to me like what you are saying though is that this could happen with any communication to the bulb and so could happen when turning lights on as well or changing brightness, etc.?
Well at this stage we can't rule out that it isn't a bug as well of course, but certainly there is code to try to recover from communication failures at any time yes.
In perfect conditions the packets should all arrive and in order, if they routinely don't then it could be worth checking your network. Making sure there is no packet loss between HA and your border router for example.
The network in general seems okay but I will restart everything just to be sure and will report back if I see any improvements. I did some ping tests from HA to the border router and had no packet loss with fairly low latency. I've also got a HomePod Mini situated very close to this particular bulb.
If there is anything else you need from me at any point that might help, let me know. Thanks!
I've seen this infrequently myself but it does certainly happen. Most times, at least for my network/gear, it seems as if the Border Router or the A19 lost the association and returns a CoAP 404 before triggering the decryption failure code path. You could check your logs for "CoAP POST returned unexpected code". Ideally aiohomekit should immediately reconnect or bubble up the error in such a way that HA itself retries.
Thanks for the detailed report!
@roysjosh Happy to help!
And yes, my log today actually shows quite a lot of those warning messages. Log pasted below shows 105 occurrences over the last 6 hours. Interestingly, I haven’t noticed any issues with the lights today despite those warnings and the decryption errors.
Logger: aiohomekit.controller.coap.connection Source: components/homekit_controller/connection.py:612 First occurred: 1:14:03 PM (105 occurrences) Last logged: 7:45:09 PM
Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1... CoAP POST returned unexpected code <aiocoap.Message at 0x7fb16fce4d60: Type.ACK 4.04 Not Found (MID 18997, token 338c) remote <UDP6EndpointAddress [fd1b:7473:6a6:0:8610:27a:a425:98c7] (locally fdd6:d465:fee7:4a84:a417:5c09:729b:dc2d%enp2s0)>> Decryption failed, desynchronized? Counter=26/27 CoAP POST returned unexpected code <aiocoap.Message at 0x7fb16fce5ff0: Type.ACK 4.04 Not Found (MID 28673, token 95e0) remote <UDP6EndpointAddress [fd1b:7473:6a6:0:df9a:433f:d90f:e0d7] (locally fdd6:d465:fee7:4a84:a417:5c09:729b:dc2d%enp2s0)>> CoAP POST returned unexpected code <aiocoap.Message at 0x7fb16fce5ff0: Type.ACK 4.04 Not Found (MID 9735, token fc9e) remote <UDP6EndpointAddress [fd1b:7473:6a6:0:bb1b:7d01:a748:1187] (locally fdd6:d465:fee7:4a84:a417:5c09:729b:dc2d%enp2s0)>>
I'm seeing this in the last few days. The only way to get my nanoleaf strips back is to restart home assistant. I can't reload the HomeKit integration.
Logger: aiohomekit.controller.coap.connection
Source: components/homekit_controller/connection.py:612
First occurred: 6:46:42 PM (24 occurrences)
Last logged: 8:27:52 PM
Decryption failed, desynchronized? Counter=13/14
Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1...
CoAP POST returned unexpected code <aiocoap.Message at 0x94cfa0e8: Type.ACK 4.04 Not Found (MID 52289, token 8387) remote <UDP6EndpointAddress [fd97:42bc:71e1:0:708:9a9f:558:d680] (locally fdc3:464e:1829:46dc:ea08:f5cd:66b5:a487%eth0)>>
Decryption failed, desynchronized? Counter=46/47
Pair verify timed out
I'll also note that it works for a single command to the strips after a restart and then they become unresponsive. I then see things like what's below. A full hardware restart does seem to rectify the issue (I'm running supervised, so maybe the underlying OS is introducing issues).
Logger: homeassistant.core
Source: components/homekit_controller/connection.py:616
First occurred: 8:23:09 AM (3 occurrences)
Last logged: 8:23:41 AM
Error executing service: <ServiceCall light.turn_on (c:01GFR34GN5X5SDA4KZERTF5SFY): entity_id=['light.nanoleaf_strip_2p8h_nanoleaf_light_strip'], params=color_temp=198>
Error executing service: <ServiceCall light.turn_on (c:01GFR34J6EQFKTH6Y6TT2QRAVH): entity_id=['light.nanoleaf_strip_2p8h_nanoleaf_light_strip'], params=color_temp=153>
Error executing service: <ServiceCall light.turn_off (c:01GFR34K650AVYTVYWN0K58G7A): entity_id=['light.nanoleaf_strip_2p8h_nanoleaf_light_strip'], params=>
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 167, in post_bytes
response = await asyncio.wait_for(
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/core.py", line 1756, in catch_exceptions
await coro_or_task
File "/usr/src/homeassistant/homeassistant/core.py", line 1775, in _execute_service
await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 207, in handle_service
await service.entity_service_call(
File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
future.result() # pop exception if have
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 931, in async_request_call
await coro
File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
await result
File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 546, in async_handle_light_on_service
await light.async_turn_on(**filter_turn_on_params(light, params))
File "/usr/src/homeassistant/homeassistant/components/homekit_controller/light.py", line 157, in async_turn_on
await self.async_put_characteristics(characteristics)
File "/usr/src/homeassistant/homeassistant/components/homekit_controller/entity.py", line 91, in async_put_characteristics
return await self._accessory.put_characteristics(payload)
File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 616, in put_characteristics
results = await self.pairing.put_characteristics(characteristics)
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/pairing.py", line 202, in put_characteristics
return await self.connection.write_characteristics(characteristics)
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 559, in write_characteristics
pdu_results = await self.enc_ctx.post_all(
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 190, in post_all
res_pdu = await self.post_bytes(req_pdu)
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 171, in post_bytes
raise AccessoryDisconnectedError("Request timeout")
aiohomekit.exceptions.AccessoryDisconnectedError: Request timeout
During startup, I also see:
Logger: homeassistant.components.websocket_api.http.connection
Source: components/homekit_controller/connection.py:616
Integration: Home Assistant WebSocket API (documentation, issues)
First occurred: 8:22:53 AM (1 occurrences)
Last logged: 8:22:53 AM
[2510992192] Request timeout
OSError: [Errno 113] received through errqueue
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 167, in post_bytes
response = await asyncio.wait_for(
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/usr/local/lib/python3.10/site-packages/aiocoap/protocol.py", line 597, in _run_outer
await cls._run(app_request, response, weak_observation, protocol, log)
File "/usr/local/lib/python3.10/site-packages/aiocoap/protocol.py", line 656, in _run
blockresponse = await blockrequest.response
aiocoap.error.NetworkError: [Errno 113] received through errqueue
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 202, in handle_call_service
await hass.services.async_call(
File "/usr/src/homeassistant/homeassistant/core.py", line 1738, in async_call
task.result()
File "/usr/src/homeassistant/homeassistant/core.py", line 1775, in _execute_service
await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 207, in handle_service
await service.entity_service_call(
File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
future.result() # pop exception if have
File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 931, in async_request_call
await coro
File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
await result
File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 546, in async_handle_light_on_service
await light.async_turn_on(**filter_turn_on_params(light, params))
File "/usr/src/homeassistant/homeassistant/components/homekit_controller/light.py", line 157, in async_turn_on
await self.async_put_characteristics(characteristics)
File "/usr/src/homeassistant/homeassistant/components/homekit_controller/entity.py", line 91, in async_put_characteristics
return await self._accessory.put_characteristics(payload)
File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 616, in put_characteristics
results = await self.pairing.put_characteristics(characteristics)
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/pairing.py", line 202, in put_characteristics
return await self.connection.write_characteristics(characteristics)
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 559, in write_characteristics
pdu_results = await self.enc_ctx.post_all(
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 190, in post_all
res_pdu = await self.post_bytes(req_pdu)
File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 171, in post_bytes
raise AccessoryDisconnectedError("Request timeout")
aiohomekit.exceptions.AccessoryDisconnectedError: Request timeout
Logger: aiohomekit.controller.coap.pdu
Source: components/homekit_controller/connection.py:198
First occurred: 8:10:18 AM (3 occurrences)
Last logged: 8:12:06 AM
Transaction 0 failed with error 6 (Invalid request
Logger: aiohomekit.controller.coap.connection
Source: components/homekit_controller/connection.py:198
First occurred: 8:11:54 AM (1 occurrences)
Last logged: 8:11:54 AM
Pair verify timed out
While a reboot initially solved the issue, it seemed to return after about 18 hours. Looking at the mDNS entry, I see the device name has been incremented significantly since yesterday (it's at ... 2802, and it was under 1000 yesterday).
@Roger-Elmer, do you happen to see the mDNS name changing on your devices? I'm wondering if there's network instability happening and causing the device to change it's IP frequently.
@daveismith I'm not sure how to look up the mDNS name. Is this something I would look up in HA or in my router? I'm using an Edge Router X.
I am seeing the same issue though where rebooting the host solved the issue initially but I woke up this morning to find most of the thread devices are unavailable again so it sounds like it could be the same thing you are seeing.
Update: I think I figured this out by using a Bonjour browser app on my iPhone. I'll keep an eye on the names over the next couple of days and will report back what I see.
I use an app called discovery on my phone or Mac. You can also look into how to use the avahi command line tools.
Thanks, that’s the app I found and started using. I haven’t seen any name changes yet but next time the bulbs become unavailable I’ll check and report back.
I see the same hang about once a week. I've added some debug locally to try to pin it down but it may take a bit since it occurs so rarely for me.
Thanks @roysjosh, I hope the debugging goes well.
@daveismith, I have yet to see any mDNS name changes since I started monitoring. I haven't really had much trouble though since installing 2022.10.5 and I'm not sure if that's just a coincidence or not. Since then, the only time I've had to restart my router and then subsequently the host is after I reboot the Apple TV when it's acting as the "Connected" hub. Saw this again this morning after updating it and the HomePod Minis to 16.1. I then saw this again after one of the HomePods randomly took over as the "Connected" hub an hour or so later. This was unusual for me because usually the Apple TV will stay "Connected" unless it gets rebooted so hopefully this is not some new issue popping up with 16.1.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
The problem
Since setting up my Nanoleaf A19 bulbs in HA with HomeKit Controller, I have noticed that the automations to turn some of the bulbs off are failing sometimes, but a majority of the time it works perfectly. When I check the automation's Trace Timeline, I see the following error message.
Stopped because an error was encountered at October 10, 2022 at 7:55:28 PM (runtime: 0.14 seconds)
Decryption of PDU POST response failed
I also see a more detailed version of this error in the system logs which I've included below.
What version of Home Assistant Core has the issue?
2022.10.3
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Homekit Controller
Link to integration documentation on our website
https://www.home-assistant.io/integrations/homekit_controller/
Diagnostics information
config_entry-homekit_controller-00673b4bcfff7720c3f32c080b738d47.json.txt
Example YAML snippet
Anything in the logs that might be useful for us?
Additional information
No response