home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
74.21k stars 31.16k forks source link

Homekit Controller - Async sometimes not activated #33511

Closed MartianMH1 closed 4 years ago

MartianMH1 commented 4 years ago

The problem

I have an Aquara Hub and a Doorsensor, for which the async implementation is amazing. But I have the problem, that the HomekitController doesn't always activate it. So sometimes I have the normal sync interval of 1' for my Door-Sensor and when I'm lucky async is activated and HA gets State-Changes almost instantly (~1s). When I check it via the Aqara-App i can see the Hub itself sees the changes, but the new state is not reflected in HA.

Log-File:

2020-03-31 19:48:37 INFO (MainThread) [homeassistant.helpers.entity_registry] Registered new binary_sensor.homekit_controller entity: binary_sensor.door_and_window_sensor_106a
2020-03-31 19:48:37 INFO (MainThread) [homeassistant.components.sensor] Setting up sensor.homekit_controller
2020-03-31 19:48:37 INFO (MainThread) [homeassistant.helpers.entity_registry] Registered new sensor.homekit_controller entity: sensor.temperature_and_humidity_sensor_b5bb_temperature
2020-03-31 19:48:37 INFO (MainThread) [homeassistant.helpers.entity_registry] Registered new sensor.homekit_controller entity: sensor.temperature_and_humidity_sensor_b5bb_humidity
2020-03-31 19:48:37 INFO (MainThread) [homeassistant.helpers.entity_registry] Registered new sensor.homekit_controller entity: sensor.temperature_and_humidity_sensor_0edb_temperature
2020-03-31 19:48:37 INFO (MainThread) [homeassistant.helpers.entity_registry] Registered new sensor.homekit_controller entity: sensor.temperature_and_humidity_sensor_0edb_humidity
2020-03-31 19:48:43 INFO (MainThread) [aiohomekit.controller.ip.pairing] This device does not support push, so only polling operations will be supported during this session 

2020-04-01 23:00:56 INFO (MainThread) [aiohomekit.controller.ip.pairing] This device does not support push, so only polling operations will be supported during this session

Without it async seems to be activated.

Is there something I can do to make the async-mode permanent? How does the integration "know" if async is supported by the device?

BR

Environment

Problem-relevant configuration.yaml

Added via the Integration-Option in the GUI, so no manual configuration from my side

Traceback/Error logs

Additional information

probot-home-assistant[bot] commented 4 years ago

Hey there @Jc2k, mind taking a look at this issue as its been labeled with a integration (homekit_controller) you are listed as a codeowner for? Thanks!

Jc2k commented 4 years ago

So this is a workaround for Tado devices that is intermittently tripping for you as well.

With Tado devices the "subscribe to event" message breaks the connection every single time. At the moment we haven't been able to trace what is different about the messages we send vs the messages iOS itself sends. As this broke devices that had been working in 0.106.0 I added a quick and simple safeguard - if we detect a connection reset after setting up a secure connection but before the first poll happens (i.e. the window where we are subscribing to changes) then events are disabled until the next power cycle. This allows Tado to work as in 0.106.0 and most other devices to get event support.

So to make it permanent there are only 2 things we can do:

Ideally, we'd do both.

It's late here now but in principal if I asked you to apply a change to some code inside the HA container, would you be able to do it?

MartianMH1 commented 4 years ago

Hi Jc2k

Yes of course, I'd love to help where I can.

Jc2k commented 4 years ago

Hi @MartianMH1 - we we've found the root cause for the Tado issue - the enable events endpoint does not tolerate spaces in JSON payloads! So step (1) on my list will get done probably in 0.109.0 at this point (need to make sure it doesnt cause any regressions). I'd like to see if we can stop the problems you are having with the connection dropping at all in the mean time.

First of all can you turn on debug logging.

logger:
  logs:
    homeassistant.components.homekit_controller: debug
    aiohomekit: debug

Then can you replace aiohomekit/controller/ip/connection.py with https://gist.github.com/Jc2k/11c2a53ebb1de1810655bb8758dbb132? It will dump the raw HTTP requests and responses into the log so we can see why your connection is sometimes dropping.

(I can't give more specific information on the location of that file without knowing more about how you install HA - hope thats enough of a clue to find it!)

When you've done all that restart HA until you get the log message you reported:

2020-04-01 23:00:56 INFO (MainThread) [aiohomekit.controller.ip.pairing] This device does not support push, so only polling operations will be supported during this session

Then i need the full log from when you started HA to when you got that message.

Thanks!

MartianMH1 commented 4 years ago

Hi @Jc2k

Thank you very much for helping me. I uploaded the Log here in Github and gave you access to the repository. Hope I did it correclty. I removed sensitive information, but I still dont want it public.

Regards

Jc2k commented 4 years ago

Brilliant. So 2 things stand out:

So i'm going to be filtering dupes out for another ticket and I hope that this will fix things for you as well...

MartianMH1 commented 4 years ago

Thanks for analyzing my logs.

Even if the logs say otherwise I wanted to be sure the problem is not the power of the raspberry pi itself. But like you said, it doesn't seem this is the case.

I'm grateful I can at least get it to work by "just" restarting it until async is activated. Thanks for your support and let's Hope :)

marcandre83 commented 4 years ago

Yes you were right, this is after I completely rebooted my Pi, the sensor was not working:

2020-05-05 22:00:14 DEBUG (MainThread) [aiohomekit.controller.ip.connection] Secure connection to 192.168.1.17:4567 established
2020-05-05 22:00:14 INFO (MainThread) [aiohomekit.controller.ip.pairing] This device does not support push, so only polling operations will be supported during this session
2020-05-05 22:00:14 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'GET /characteristics?id=1.65795,1.65796,1.65797,1.65798,1.66307,1.66308,2.131075 HTTP/1.1\r\nHost: 192.168.1.17\r\n\r\n'
2020-05-05 22:00:14 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw response: bytearray(b'{"characteristics":[{"aid":1,"iid":65795,"value":false},{"aid":1,"iid":65796,"value":0},{"aid":1,"iid":65797,"value":0},{"aid":1,"iid":65798,"value":100},{"aid":1,"iid":66307,"value":3},{"aid":1,"iid":66308,"value":3},{"aid":2,"iid":131075,"value":1}]}')

then I restarted Home Assistant and the sensor was working, this should be the same section of that HA boot:

2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] Secure connection to 192.168.1.17:4567 established
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 288\r\n\r\n{"characteristics": [{"aid": 1, "iid": 65795, "ev": true}, {"aid": 1, "iid": 66307, "ev": true}, {"aid": 1, "iid": 66308, "ev": true}, {"aid": 1, "iid": 65796, "ev": true}, {"aid": 1, "iid": 65797, "ev": true}, {"aid": 1, "iid": 65798, "ev": true}, {"aid": 2, "iid": 131075, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw response: bytearray(b'')
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 173\r\n\r\n{"characteristics": [{"aid": 1, "iid": 65795, "ev": true}, {"aid": 1, "iid": 65796, "ev": true}, {"aid": 1, "iid": 65797, "ev": true}, {"aid": 1, "iid": 65798, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 97\r\n\r\n{"characteristics": [{"aid": 1, "iid": 66307, "ev": true}, {"aid": 1, "iid": 66308, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 60\r\n\r\n{"characteristics": [{"aid": 2, "iid": 131075, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 288\r\n\r\n{"characteristics": [{"aid": 1, "iid": 65795, "ev": true}, {"aid": 1, "iid": 65796, "ev": true}, {"aid": 1, "iid": 65797, "ev": true}, {"aid": 1, "iid": 65798, "ev": true}, {"aid": 1, "iid": 66307, "ev": true}, {"aid": 1, "iid": 66308, "ev": true}, {"aid": 2, "iid": 131075, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw response: bytearray(b'')
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] Connection HomeKitConnection(host='192.168.1.17', port=4567) lost.
2020-05-05 22:00:13 DEBUG (MainThread) [homeassistant.components.homekit_controller.connection] Starting HomeKit controller update
Jc2k commented 4 years ago

So you should see a flurry of PUT messages like that for the case where it does work AND in the case where it doesn't. In the case where it doesn't it will retry and you will see the "the device does not support push" in additiion to those PUTs. I expect to see a bunch of PUT logs followed by

2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] Connection HomeKitConnection(host='192.168.1.17', port=4567) lost.

in the case where it fails.

In the case where it worked i'd expect it to start polling immediately instead.

In your second example there are 5 PUT requests against /characteristics. What i want to compare is a time it works to a time it doesn't. What was the difference? I think theres a race that means sometimes that same data gets sent in a different way. What is it that makes some variants ok and some bad? Does it not send duplicates sometimes? Does it send smaller packets sometimes?

marcandre83 commented 4 years ago

I am using the sensor to trigger my doorbell, so it is quite important for me to be stable. Is there any possibility to temporarily change some things to the integration and remove the workaround for the tado devices, so homekit controller always activates async for the sensor? I don't use any tado devices with Home Assistant, so that doesn't matter.

EDIT: let me reconstruct both cases and give you more log details, if that helps you?

Jc2k commented 4 years ago

More logs would be great, thanks :-)

I guess the workaround is triggering because async is unstable for aqara as well, the difference is without the workaround it might eventually get a connection through retrying. I don't know for sure that it will help, though Aqara seems stable for some HA users so it's worth a shot.

This is the workaround in full:

https://github.com/Jc2k/aiohomekit/commit/3ced7af3b3f09a568c1ae56819b3eba0b512dea4#diff-435ec871bcd9f6dbe68965dedee5374f

If you delete those lines from your installation it will hopefully retrying until turning on events works. Put lets see if your logs reveal something obvious - there might be a proper fix you can test instead.

marcandre83 commented 4 years ago

Thank you very much :). I'd be really glad to help sorting this out for everyone.

Here are some more PUT messages of a boot where async failed:

2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] Secure connection to 192.168.1.17:4567 established
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 288\r\n\r\n{"characteristics": [{"aid": 1, "iid": 65795, "ev": true}, {"aid": 1, "iid": 66307, "ev": true}, {"aid": 1, "iid": 66308, "ev": true}, {"aid": 1, "iid": 65796, "ev": true}, {"aid": 1, "iid": 65797, "ev": true}, {"aid": 1, "iid": 65798, "ev": true}, {"aid": 2, "iid": 131075, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw response: bytearray(b'')
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 173\r\n\r\n{"characteristics": [{"aid": 1, "iid": 65795, "ev": true}, {"aid": 1, "iid": 65796, "ev": true}, {"aid": 1, "iid": 65797, "ev": true}, {"aid": 1, "iid": 65798, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 97\r\n\r\n{"characteristics": [{"aid": 1, "iid": 66307, "ev": true}, {"aid": 1, "iid": 66308, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 60\r\n\r\n{"characteristics": [{"aid": 2, "iid": 131075, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw request: b'PUT /characteristics HTTP/1.1\r\nHost: 192.168.1.17\r\nContent-Type: application/hap+json\r\nContent-Length: 288\r\n\r\n{"characteristics": [{"aid": 1, "iid": 65795, "ev": true}, {"aid": 1, "iid": 65796, "ev": true}, {"aid": 1, "iid": 65797, "ev": true}, {"aid": 1, "iid": 65798, "ev": true}, {"aid": 1, "iid": 66307, "ev": true}, {"aid": 1, "iid": 66308, "ev": true}, {"aid": 2, "iid": 131075, "ev": true}]}'
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] 192.168.1.17: raw response: bytearray(b'')
2020-05-05 22:00:13 DEBUG (MainThread) [aiohomekit.controller.ip.connection] Connection HomeKitConnection(host='192.168.1.17', port=4567) lost.
2020-05-05 22:00:13 DEBUG (MainThread) [homeassistant.components.homekit_controller.connection] Starting HomeKit controller update

I can also send you the complete log as I have seen some more content containing the IP of my hub (nothing with PUT messages), but I prefer to do that in a pm since there is some private content in it.

I can also remove the tado workaround lines from the code and do some testing if that helps you.

Jc2k commented 4 years ago

The HA discord or forums are probably the easiest place to DM me.

I want to see if I can get a theory for the root cause without taking out tado workaround as possible, so complete log probably first avenue. But if i can't it might be an interesting data point to throw into the mix.

marcandre83 commented 4 years ago

ok great, I just sent you a DM on the HA discord.

MartianMH1 commented 4 years ago

@marcandre83 A little tip in case it's not working with the workaround deleted either. At the moment when I restart HA I just turn on the light on the Aqara-Hub via the App and check if the change is transfered to HA or not. Much easier than walk to the door check it there.

marcandre83 commented 4 years ago

@jc2k fixed it! seems like it is working like a charm now, i have no problems and the state changes get reflected very quick. here's what he has done to the code to make it work:

homeassistant/components/homekit_controller/connection.py changes:

around line 128 replace

    def add_watchable_characteristics(self, characteristics):
        """Add (aid, iid) pairs that we need to poll."""
        self.watchable_characteristics.extend(characteristics)
        self.hass.async_create_task(self.pairing.subscribe(characteristics))

with

    async def _subscribe_helper(self, characteristics):
        async with self.pairing_lock:
            await self.pairing.subscribe(characteristics)

    def add_watchable_characteristics(self, characteristics):
        """Add (aid, iid) pairs that we need to poll."""
        self.watchable_characteristics.extend(characteristics)
        self.hass.async_create_task(self._subscribe_helper(characteristics))

around line 187 replace

            await self.pairing.subscribe(self.watchable_characteristics)

with

            await self._subscribe_helper(self.watchable_characteristics)

around line 212 replace

    async def async_refresh_entity_map(self, config_num):
        """Handle setup of a HomeKit accessory."""
        try:
            self.accessories = await self.pairing.list_accessories_and_characte$
        except AccessoryDisconnectedError:
            # If we fail to refresh this data then we will naturally retry
            # later when Bonjour spots c# is still not up to date.
            return False

with

    async def async_refresh_entity_map(self, config_num):
        """Handle setup of a HomeKit accessory."""
        async with self.pairing_lock:
            try:
                self.accessories = await self.pairing.list_accessories_and_characteristics()
            except AccessoryDisconnectedError:
                # If we fail to refresh this data then we will naturally retry
                # later when Bonjour spots c# is still not up to date.
                return False
Jc2k commented 4 years ago

Thanks for writing this up @marcandre83! I have a patch in aiohomekit for this which i'm going to try and get into 0.110.x if im not too late.

Jc2k commented 4 years ago

0.110.2 should have this fix - it would be great if you could test it works for you @marcandre83 / @MartianMH1.

l4ci commented 4 years ago

Working for me like a charm now 🥳 thanks for your hard work!

MartianMH1 commented 4 years ago

Seems to work now, Thank you very much for the effort!

Jc2k commented 4 years ago

Good to hear everyone! Thanks for letting me know :-)