home-assistant / core

Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Devices discovered by zeroconf list link-local address as first ip, homekit_controller can't connect to them #54500

Closed. OmegaKill closed this issue 2 years ago

OmegaKill commented 3 years ago

The problem

Bosch Smart Home doesn't work after upgrading from 2021.6.5 to 2021.8.6

What version of Home Assistant Core has the issue?

2021.8.6

What was the last working version of Home Assistant Core?

2021.6.5

What type of installation are you running?

Home Assistant OS

Integration causing the issue

HomeKit Controller

Link to integration documentation on our website

https://www.home-assistant.io/integrations/homekit_controller

Example YAML snippet

No response

Anything in the logs that might be useful for us?

The logs only show that the climate entity for the automation is missing

Additional information

All devices from the HomeKit integration for the Bosch Smart Home Controller are unavailable

probot-home-assistant[bot] commented 3 years ago

homekit_controller documentation homekit_controller source (message by IssueLinks)

probot-home-assistant[bot] commented 3 years ago

Hey there @jc2k, @bdraco, mind taking a look at this issue as it has been labeled with an integration (homekit_controller) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

Jc2k commented 3 years ago

I have a climate entity (not Bosch) that is working on both those releases. So it looks like it's going to be Bosch-specific or some other environmental cause.

Can you attach your .storage/homekit_controller-entity-map?

Can you turn on verbose logging for homekit_controller with these settings and then attach some logs?

logger:
  logs:
    aiohomekit: debug
    homeassistant.components.homekit_controller: debug

Finally, can you attach the output of python3 -m netdisco dump?

OmegaKill commented 3 years ago

The ZIP file contains the homekit_controller-entity-map and the system log. Do you need any other log than that? No new entries appeared in the "normal" log section.

If I run python3 -m netdisco dump in the terminal, it tells me there is no python3. Do I need to use a different terminal than root?

Files.zip

Jc2k commented 3 years ago

The system logs are the logs I needed, thank you.

Sometimes the command is python3 -m netdisco dump. Sometimes it's python -m netdisco dump. It depends on the environment. Sorry about that.

It looks like your thermostat is advertising its address on the 169.254 range, so homekit_controller can't connect to it:

2021-08-18 10:17:41 DEBUG (MainThread) [root] Located Homekit IP accessory {'name': 'Bosch SHC 038ca8._hap._tcp.local.', 'address': '169.254.142.92', 'port': 45090, 'c#': '3', 'id': '1D:5E:66:D3:4F:1F', 'md': 'SMART_HOME_CONTROLLER', 's#': '14', 'ci': '2', 'sf': '0', 'ff': 2, 'flags': <FeatureFlags.SUPPORTS_SOFTWARE_AUTHENTICATION: 2>, 'pv': '1.1', 'statusflags': 'Accessory has been paired.', 'category': 'Bridge'}
2021-08-18 10:17:41 DEBUG (MainThread) [aiohomekit.controller.ip.connection] Attempting connection to 169.254.142.92:45090

There are a couple of reasons this can happen: sometimes it's the device's fault, sometimes the network's fault. Power cycling the thermostat (or whichever bit is connected to your wifi) should fix that.

If it doesn't, can you add zeroconf to the logs and let's try again:

logger:
  logs:
    aiohomekit: debug
    homeassistant.components.homekit_controller: debug
    zeroconf: debug

Cheers

OmegaKill commented 3 years ago

The python command still won't do anything. I forgot to add that not only the thermostat is unavailable but all Bosch devices, like light switches, smoke detectors and shutter controls. The thermostat doesn't connect to wifi directly; it connects to the Bosch Smart Home Controller, and the controller is connected via ethernet.

The IP of the controller is 10.0.1.2

Log_zeroconf.txt

Jc2k commented 3 years ago

Can you confirm you have power cycled the bridge device?

OmegaKill commented 3 years ago

I power cycled both the bridge device and the Raspberry Pi that Home Assistant is running on.

Log_zeroconf_after_power_cycle.txt

Jc2k commented 3 years ago
2021-08-18 11:36:26 DEBUG (MainThread) [zeroconf] Received from '10.0.1.2':5353 [socket 28 (('10.0.1.1', 5353))]: <DNSIncoming:{id=0, flags=33792, truncated=False, n_q=0, n_ans=2, n_auth=0, n_add=7, questions=[], answers=[record[ptr,in,_http._tcp.local.]=4500/4499,Bosch SHC [64-da-a0-03-8c-a8]._http._tcp.local., record[ptr,in,_hap._tcp.local.]=4500/4499,Bosch SHC 038ca8._hap._tcp.local., record[srv,in-unique,Bosch SHC [64-da-a0-03-8c-a8]._http._tcp.local.]=120/119,shc038ca8.local.:8443, record[txt,in-unique,Bosch SHC [64-da-a0-03-8c-a8]._http._tcp.local.]=4500/4499,b'\x00', record[srv,in-unique,Bosch SHC 038ca8._hap._tcp.local.]=120/119,shc038ca8.local.:45090, record[txt,in-unique,Bosch SHC 038ca8._hap._tcp.local.]=4500/4499,b'\x04c#=3\x04f'..., record[a,in-unique,shc038ca8.local.]=120/119,169.254.142.92, record[a,in-unique,shc038ca8.local.]=120/119,10.0.1.2, record[quada,in-unique,shc038ca8.local.]=120/119,fe80::66da:a0ff:fe03:8ca8]}> (334 bytes) as [b'\x00\x00\x84\x00\x00\x00\x00\x02\x00\x00\x00\x07\x05_http\x04_tcp\x05local\x00\x00\x0c\x00\x01\x00\x00\x11\x94\x00 \x1dBosch SHC [64-da-a0-03-8c-a8]\xc0\x0c\x04_hap\xc0\x12\x00\x0c\x00\x01\x00\x00\x11\x94\x00\x13\x10Bosch SHC 038ca8\xc0H\xc0(\x00!\x80\x01\x00\x00\x00x\x00\x12\x00\x00\x00\x00 \xfb\tshc038ca8\xc0\x17\xc0(\x00\x10\x80\x01\x00\x00\x11\x94\x00\x01\x00\xc0Y\x00!\x80\x01\x00\x00\x00x\x00\x08\x00\x00\x00\x00\xb0"\xc0~\xc0Y\x00\x10\x80\x01\x00\x00\x11\x94\x00[\x04c#=3\x04ff=2\x14id=1D:5E:66:D3:4F:1F\x18md=SMART_HOME_CONTROLLER\x06pv=1.1\x05s#=16\x04sf=0\x04ci=2\x0bsh=I9FzMQ==\xc0~\x00\x01\x80\x01\x00\x00\x00x\x00\x04\xa9\xfe\x8e\\\xc0~\x00\x01\x80\x01\x00\x00\x00x\x00\x04\n\x00\x01\x02\xc0~\x00\x1c\x80\x01\x00\x00\x00x\x00\x10\xfe\x80\x00\x00\x00\x00\x00\x00f\xda\xa0\xff\xfe\x03\x8c\xa8']

For some reason it looks like your device is advertising 169.254.142.92 as the preferred IP to access the device, 10.0.1.2 is the 2nd choice.
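For anyone reading the raw packet, the three address records at the end of it can be decoded with Python's standard ipaddress module. This is only an illustrative sketch: the byte strings are copied from the dump above, and the variable names are made up for the example.

```python
import ipaddress

# Raw rdata of the A and AAAA records for shc038ca8.local., in the order
# they appear in the packet above (copied from the byte dump).
raw_records = [
    b"\xa9\xfe\x8e\\",
    b"\n\x00\x01\x02",
    b"\xfe\x80\x00\x00\x00\x00\x00\x00f\xda\xa0\xff\xfe\x03\x8c\xa8",
]

# ip_address() accepts packed bytes: 4 bytes give IPv4, 16 bytes give IPv6.
decoded = [ipaddress.ip_address(raw) for raw in raw_records]

for addr in decoded:
    kind = "link-local" if addr.is_link_local else "not link-local"
    print(f"{addr} (IPv{addr.version}): {kind}")
```

The first record is the 169.254 link-local address, and only the second one is the 10.0.1.2 address the controller actually has on the LAN.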

I have seen that behaviour before, and when a power cycle didn't fix it, it turned out to be a bridge with multiple interfaces (ethernet and wifi). The user was able to disable the interface they weren't using, so that the bridge never gave itself a 169.254 link-local address in the first place. But they had quite low-level access to that bridge.

The proper fix in HA is to try all the advertised addresses, not just the first one. Ideally in parallel.

OmegaKill commented 3 years ago

This bridge sadly doesn't have a wifi adapter. Also, there seems to be no way to configure anything on the Bosch controller as far as I know.

Jc2k commented 3 years ago

What is confusing is why it ever worked. It's possible that during the upgrade from June to August we noticed our IP for the device was "wrong" and updated it, but we've only ever tried to use the first IP, which strongly suggests that at some point the bridge was returning a working address as its first address.

Does the bridge have any way to configure a static IP? (An actual static IP and not a static DHCP lease.) That's the only workaround I can think of, if it isn't already using a static IP.

I'm not sure I'm going to have time to get this fixed before the September release. And my availability past October is really limited, so I'm nervous about making a large invasive change when I'm not going to be around to pick up any tickets.

@bdraco do you have any ideas on this one?

Can you think of anything else?

OmegaKill commented 3 years ago

There seems to be no way to set a static IP directly. The app has no setting for it and there is no web interface either. Is there anything else you need from me, such as other logs? If not, I would revert Home Assistant to 2021.6.5 so it works again. If the problem currently only affects me, you don't need to stress yourself. I can live with it.

Jc2k commented 3 years ago

I think that zeroconf log is all we need, cheers!

OmegaKill commented 3 years ago

ok cheers

bdraco commented 3 years ago

  • Try all IPs in zeroconf - quite invasive change, limited time to do it myself. This is presumably what iOS does. We'd want a create_connection wrapper that took a list of IPs, connected to them all, waited with asyncio.wait for the first one to connect, then cancelled the other pending connections. Although asyncio.wait seems to be deprecated.

That's similar to what asyncio's loop.create_connection does when it's given a host name that it resolves itself and the lookup returns multiple addresses.

You could probably use: https://github.com/python/cpython/blob/3.8/Lib/asyncio/staggered.py

Which was built for the happy eyeballs delay: https://github.com/python/cpython/blob/3.9/Lib/asyncio/base_events.py#L1047
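A rough sketch of the "connect to them all, take the first one that works" idea, for discussion only. It is not what aiohomekit does today: connect_first is a hypothetical name, and the injectable connect parameter is an assumption added so the sketch can be tried without a network. It starts every attempt at once rather than staggering them the way the Happy Eyeballs code linked above does. As far as I know asyncio.wait itself is not deprecated; what was deprecated (in 3.8) is passing bare coroutines to it, so the sketch wraps them in tasks first.

```python
import asyncio


async def connect_first(addresses, port, connect=asyncio.open_connection):
    """Try every address at once; return (address, result) for the first success.

    `connect` is any coroutine function taking (host, port). It defaults to
    asyncio.open_connection and is injectable purely for illustration.
    """
    tasks = {asyncio.create_task(connect(addr, port)): addr for addr in addresses}
    pending = set(tasks)
    errors = []
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            winner = None
            for task in done:
                exc = task.exception()
                if exc is not None:
                    errors.append(exc)
                elif winner is None:
                    winner = (tasks[task], task.result())
                # Note: a second success in the same batch is simply dropped
                # here; real code would need to close that connection.
            if winner is not None:
                return winner
        raise OSError(f"all connection attempts failed: {errors!r}")
    finally:
        # Cancel attempts that are still in flight once we have a winner
        # (or have been cancelled ourselves).
        for task in pending:
            task.cancel()
```

With the addresses from this issue, the call would look like await connect_first(["169.254.142.92", "10.0.1.2"], 45090), and the attempt against the unreachable link-local address would be cancelled as soon as the other one connects.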

OmegaKill commented 2 years ago

Is there an update to this problem?

Jc2k commented 2 years ago

From my point of view, as discussed previously, I'm not available at the moment for big changes. I don't have the time unfortunately. Especially given the risk of further regressions in this area it would be wise not to rush it. In 4-5 weeks I might have some idea of what my capacity will be like for the next few months.

OmegaKill commented 2 years ago

ok i understand

bdraco commented 2 years ago

We should probably add a helper to the network integration to order the IPs by most likely to work based on default network and prefer ones on the same subnet.

Even if it takes a while to get multi-IP support, at least we have a better chance of connecting to the right one (we would need a prefer-IPv4 flag to make sure we don't make it a worse chance though).
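Something along these lines, maybe. This is just a sketch of the ordering idea using the standard ipaddress module, not an existing helper in the network integration: order_addresses and its parameters are hypothetical names, and the /24 prefix in the usage note is an assumption (the logs only show that the Home Assistant host is 10.0.1.1).

```python
import ipaddress


def order_addresses(candidates, local_networks, prefer_ipv4=True):
    """Return candidates sorted so the most likely reachable address is first."""
    # strict=False lets callers pass an interface address such as 10.0.1.1/24.
    nets = [ipaddress.ip_network(net, strict=False) for net in local_networks]

    def rank(item):
        index, text = item
        addr = ipaddress.ip_address(text)
        on_local_subnet = any(
            addr.version == net.version and addr in net for net in nets
        )
        return (
            not on_local_subnet,                # same subnet as us first
            addr.is_link_local,                 # link-local addresses last
            prefer_ipv4 and addr.version != 4,  # optionally IPv4 before IPv6
            index,                              # otherwise keep advertised order
        )

    return [text for _, text in sorted(enumerate(candidates), key=rank)]
```

For the addresses advertised in this issue, order_addresses(["169.254.142.92", "10.0.1.2", "fe80::66da:a0ff:fe03:8ca8"], ["10.0.1.1/24"]) would put 10.0.1.2 at the front, so even a client that only tries the first address would pick the working one.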