home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
71.03k stars 29.7k forks source link

Decryption of PDU POST response failed #80131

Closed Roger-Elmer closed 1 year ago

Roger-Elmer commented 1 year ago

The problem

Since setting up my Nanoleaf A19 bulbs in HA with HomeKit Controller, I have noticed that the automations to turn some of the bulbs off are failing sometimes, but a majority of the time it works perfectly. When I check the automation's Trace Timeline, I see the following error message.

Stopped because an error was encountered at October 10, 2022 at 7:55:28 PM (runtime: 0.14 seconds)

Decryption of PDU POST response failed

I also see a more detailed version of this error in the system logs which I've included below.

What version of Home Assistant Core has the issue?

2022.10.3

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Homekit Controller

Link to integration documentation on our website

https://www.home-assistant.io/integrations/homekit_controller/

Diagnostics information

config_entry-homekit_controller-00673b4bcfff7720c3f32c080b738d47.json.txt

Example YAML snippet

Below is a simple automation that produces this error when it attempts to turn off the light.

- id: '1665170969547'
  alias: Laundry Light On/Off
  description: ''
  trigger:
  - platform: state
    entity_id:
    - binary_sensor.laundry_motion
    to: 'on'
    id: presence_on
  - platform: state
    entity_id:
    - binary_sensor.laundry_motion
    to: 'off'
    for:
      hours: 0
      minutes: 1
      seconds: 0
    id: presence_off
  condition: []
  action:
  - choose:
    - conditions:
      - condition: trigger
        id: presence_on
      sequence:
      - service: light.turn_on
        data:
          brightness_pct: 100
          color_name: white
        target:
          entity_id: light.laundry
    - conditions:
      - condition: trigger
        id: presence_off
      sequence:
      - service: light.turn_off
        data: {}
        target:
          entity_id: light.laundry
  mode: single

Anything in the logs that might be useful for us?

Logger: homeassistant.components.automation.laundry_light_off_2
Source: components/homekit_controller/connection.py:616
Integration: Automation (documentation, issues)
First occurred: October 10, 2022 at 7:55:28 PM (2 occurrences)
Last logged: October 10, 2022 at 7:55:28 PM

Laundry Light - Off: Error executing script. Unexpected error for call_service at pos 1: Decryption of PDU POST response failed
While executing automation automation.laundry_light_off_2
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 117, in _decrypt_response
    return self.decrypt(response.payload)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 94, in decrypt
    dec_data = self.recv_ctx.decrypt(
  File "/usr/local/lib/python3.10/site-packages/cryptography/hazmat/primitives/ciphers/aead.py", line 62, in decrypt
    return aead._decrypt(backend, self, nonce, data, [associated_data], 16)
  File "/usr/local/lib/python3.10/site-packages/cryptography/hazmat/backends/openssl/aead.py", line 207, in _decrypt
    raise InvalidTag
cryptography.exceptions.InvalidTag

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 451, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 684, in _async_call_service_step
    await service_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1738, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1775, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 207, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 931, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 557, in async_handle_light_off_service
    await light.async_turn_off(**filter_turn_off_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/light.py", line 161, in async_turn_off
    await self.async_put_characteristics({CharacteristicsTypes.ON: False})
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/entity.py", line 91, in async_put_characteristics
    return await self._accessory.put_characteristics(payload)
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 616, in put_characteristics
    results = await self.pairing.put_characteristics(characteristics)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/pairing.py", line 202, in put_characteristics
    return await self.connection.write_characteristics(characteristics)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 559, in write_characteristics
    pdu_results = await self.enc_ctx.post_all(
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 190, in post_all
    res_pdu = await self.post_bytes(req_pdu)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 176, in post_bytes
    return await self._decrypt_response(response)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 159, in _decrypt_response
    raise EncryptionError("Decryption of PDU POST response failed")
aiohomekit.exceptions.EncryptionError: Decryption of PDU POST response failed

Additional information

No response

homeassistant commented 1 year ago

Hey there @jc2k, @bdraco, mind taking a look at this issue as it has been labeled with an integration (homekit_controller) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)


homekit_controller documentation homekit_controller source (message by IssueLinks)

Jc2k commented 1 year ago

@roysjosh for the thread know how.

HAP over thread uses UDP for comms between the homekit client and the border router, and that means packets can get out of order, duplicated or even dropped completely. But the encryption has a counter (implying it should be in order). The decryption tries to guess when this happens, but it cant always recover. So this is somewhat expected, but we might want to implement retrying if we can do so safely.

Roger-Elmer commented 1 year ago

Thanks @Jc2k, this would be greatly appreciated so that I don't have to write automations that check to make sure the lights are actually off and keep trying until successful.

It sounds to me like what you are saying though is that this could happen with any communication to the bulb and so could happen when turning lights on as well or changing brightness, etc.?

Jc2k commented 1 year ago

Well at this stage we can't rule out that it isn't a bug as well of course, but certainly there is code to try to recover from communication failures at any time yes.

In perfect conditions the packets should all arrive and in order, if they routinely don't then it could be worth checking your network. Making sure there is no packet loss between HA and your border router for example.

Roger-Elmer commented 1 year ago

The network in general seems okay but I will restart everything just to be sure and will report back if I see any improvements. I did some ping tests from HA to the border router and had no packet loss with fairly low latency. I've also got a HomePod Mini situated very close to this particular bulb.

If there is anything else you need from me at any point that might help, let me know. Thanks!

roysjosh commented 1 year ago

I've seen this infrequently myself but it does certainly happen. Most times, at least for my network/gear, it seems as if the Border Router or the A19 lost the association and returns a CoAP 404 before triggering the decryption failure code path. You could check your logs for "CoAP POST returned unexpected code". Ideally aiohomekit should immediately reconnect or bubble up the error in such a way that HA itself retries.

Thanks for the detailed report!

Roger-Elmer commented 1 year ago

@roysjosh Happy to help!

And yes, my log today actually shows quite a lot of those warning messages. Log pasted below shows 105 occurrences over the last 6 hours. Interestingly, I haven’t noticed any issues with the lights today despite those warnings and the decryption errors.

Logger: aiohomekit.controller.coap.connection Source: components/homekit_controller/connection.py:612 First occurred: 1:14:03 PM (105 occurrences) Last logged: 7:45:09 PM

Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1... CoAP POST returned unexpected code <aiocoap.Message at 0x7fb16fce4d60: Type.ACK 4.04 Not Found (MID 18997, token 338c) remote <UDP6EndpointAddress [fd1b:7473:6a6:0:8610:27a:a425:98c7] (locally fdd6:d465:fee7:4a84:a417:5c09:729b:dc2d%enp2s0)>> Decryption failed, desynchronized? Counter=26/27 CoAP POST returned unexpected code <aiocoap.Message at 0x7fb16fce5ff0: Type.ACK 4.04 Not Found (MID 28673, token 95e0) remote <UDP6EndpointAddress [fd1b:7473:6a6:0:df9a:433f:d90f:e0d7] (locally fdd6:d465:fee7:4a84:a417:5c09:729b:dc2d%enp2s0)>> CoAP POST returned unexpected code <aiocoap.Message at 0x7fb16fce5ff0: Type.ACK 4.04 Not Found (MID 9735, token fc9e) remote <UDP6EndpointAddress [fd1b:7473:6a6:0:bb1b:7d01:a748:1187] (locally fdd6:d465:fee7:4a84:a417:5c09:729b:dc2d%enp2s0)>>

daveismith commented 1 year ago

I'm seeing this in the last few days. The only way to get my nanoleaf strips back is to restart home assistant. I can't reload the HomeKit integration.

Logger: aiohomekit.controller.coap.connection
Source: components/homekit_controller/connection.py:612 
First occurred: 6:46:42 PM (24 occurrences) 
Last logged: 8:27:52 PM

Decryption failed, desynchronized? Counter=13/14
Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1...
CoAP POST returned unexpected code <aiocoap.Message at 0x94cfa0e8: Type.ACK 4.04 Not Found (MID 52289, token 8387) remote <UDP6EndpointAddress [fd97:42bc:71e1:0:708:9a9f:558:d680] (locally fdc3:464e:1829:46dc:ea08:f5cd:66b5:a487%eth0)>>
Decryption failed, desynchronized? Counter=46/47
Pair verify timed out

I'll also note that it works for a single command to the strips after a restart and then they become unresponsive. I then see things like what's below. A full hardware restart does seem to rectify the issue (I'm running supervised, so maybe the underlying OS is introducing issues).

Logger: homeassistant.core
Source: components/homekit_controller/connection.py:616
First occurred: 8:23:09 AM (3 occurrences)
Last logged: 8:23:41 AM

Error executing service: <ServiceCall light.turn_on (c:01GFR34GN5X5SDA4KZERTF5SFY): entity_id=['light.nanoleaf_strip_2p8h_nanoleaf_light_strip'], params=color_temp=198>
Error executing service: <ServiceCall light.turn_on (c:01GFR34J6EQFKTH6Y6TT2QRAVH): entity_id=['light.nanoleaf_strip_2p8h_nanoleaf_light_strip'], params=color_temp=153>
Error executing service: <ServiceCall light.turn_off (c:01GFR34K650AVYTVYWN0K58G7A): entity_id=['light.nanoleaf_strip_2p8h_nanoleaf_light_strip'], params=>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 167, in post_bytes
    response = await asyncio.wait_for(
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 1756, in catch_exceptions
    await coro_or_task
  File "/usr/src/homeassistant/homeassistant/core.py", line 1775, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 207, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 931, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 546, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/light.py", line 157, in async_turn_on
    await self.async_put_characteristics(characteristics)
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/entity.py", line 91, in async_put_characteristics
    return await self._accessory.put_characteristics(payload)
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 616, in put_characteristics
    results = await self.pairing.put_characteristics(characteristics)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/pairing.py", line 202, in put_characteristics
    return await self.connection.write_characteristics(characteristics)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 559, in write_characteristics
    pdu_results = await self.enc_ctx.post_all(
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 190, in post_all
    res_pdu = await self.post_bytes(req_pdu)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 171, in post_bytes
    raise AccessoryDisconnectedError("Request timeout")
aiohomekit.exceptions.AccessoryDisconnectedError: Request timeout

During startup, I also see:

Logger: homeassistant.components.websocket_api.http.connection
Source: components/homekit_controller/connection.py:616
Integration: Home Assistant WebSocket API (documentation, issues)
First occurred: 8:22:53 AM (1 occurrences)
Last logged: 8:22:53 AM

[2510992192] Request timeout
OSError: [Errno 113] received through errqueue

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 167, in post_bytes
    response = await asyncio.wait_for(
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/usr/local/lib/python3.10/site-packages/aiocoap/protocol.py", line 597, in _run_outer
    await cls._run(app_request, response, weak_observation, protocol, log)
  File "/usr/local/lib/python3.10/site-packages/aiocoap/protocol.py", line 656, in _run
    blockresponse = await blockrequest.response
aiocoap.error.NetworkError: [Errno 113] received through errqueue

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 202, in handle_call_service
    await hass.services.async_call(
  File "/usr/src/homeassistant/homeassistant/core.py", line 1738, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1775, in _execute_service
    await cast(Callable[[ServiceCall], Awaitable[None]], handler.job.target)(
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 207, in handle_service
    await service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 678, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 931, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 715, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/light/__init__.py", line 546, in async_handle_light_on_service
    await light.async_turn_on(**filter_turn_on_params(light, params))
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/light.py", line 157, in async_turn_on
    await self.async_put_characteristics(characteristics)
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/entity.py", line 91, in async_put_characteristics
    return await self._accessory.put_characteristics(payload)
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 616, in put_characteristics
    results = await self.pairing.put_characteristics(characteristics)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/pairing.py", line 202, in put_characteristics
    return await self.connection.write_characteristics(characteristics)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 559, in write_characteristics
    pdu_results = await self.enc_ctx.post_all(
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 190, in post_all
    res_pdu = await self.post_bytes(req_pdu)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 171, in post_bytes
    raise AccessoryDisconnectedError("Request timeout")
aiohomekit.exceptions.AccessoryDisconnectedError: Request timeout
Logger: aiohomekit.controller.coap.pdu
Source: components/homekit_controller/connection.py:198
First occurred: 8:10:18 AM (3 occurrences)
Last logged: 8:12:06 AM

Transaction 0 failed with error 6 (Invalid request
Logger: aiohomekit.controller.coap.connection
Source: components/homekit_controller/connection.py:198
First occurred: 8:11:54 AM (1 occurrences)
Last logged: 8:11:54 AM

Pair verify timed out
daveismith commented 1 year ago

While a reboot initially solved the issue, it seemed to return after about 18 hours. Looking at the mDNS entry, I see the device name has been incremented significantly since yesterday (it's at ... 2802, and it was under 1000 yesterday).

@Roger-Elmer, do you happen to see the mDNS name changing on your devices? I'm wondering if there's network instability happening and causing the device to change it's IP frequently.

Roger-Elmer commented 1 year ago

@daveismith I'm not sure how to look up the mDNS name. Is this something I would look up in HA or in my router? I'm using an Edge Router X.

I am seeing the same issue though where rebooting the host solved the issue initially but I woke up this morning to find most of the thread devices are unavailable again so it sounds like it could be the same thing you are seeing.

Update: I think I figured this out by using a Bonjour browser app on my iPhone. I'll keep an eye on the names over the next couple of days and will report back what I see.

daveismith commented 1 year ago

I use an app called discovery on my phone or Mac. You can also look into how to use the avahi command line tools.

Roger-Elmer commented 1 year ago

Thanks, that’s the app I found and started using. I haven’t seen any name changes yet but next time the bulbs become unavailable I’ll check and report back.

roysjosh commented 1 year ago

I see the same hang about once a week. I've added some debug locally to try to pin it down but it may take a bit since it occurs so rarely for me.

Roger-Elmer commented 1 year ago

Thanks @roysjosh, I hope the debugging goes well.

@daveismith, I have yet to see any mDNS name changes since I started monitoring. I haven't really had much trouble though since installing 2022.10.5 and I'm not sure if that's just a coincidence or not. Since then, the only time I've had to restart my router and then subsequently the host is after I reboot the Apple TV when it's acting as the "Connected" hub. Saw this again this morning after updating it and the HomePod Minis to 16.1. I then saw this again after one of the HomePods randomly took over as the "Connected" hub an hour or so later. This was unusual for me because usually the Apple TV will stay "Connected" unless it gets rebooted so hopefully this is not some new issue popping up with 16.1.

issue-triage-workflows[bot] commented 1 year ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.