home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
74k stars 31.05k forks source link

KNX connection breaking after upgrade to 2021.2 #46150

Closed splunkenizer closed 3 years ago

splunkenizer commented 3 years ago

The problem

I’m facing a serious issue with KNX after upgrading to the 2021.2 release. The connection to the KNX bus is always breaking down after some time. If I restart HA, it does works again, but after around 30-60 minutes KNX devices stop working. I had to roll back to 2021.1.5 where it does work again.

What is version of Home Assistant Core has the issue?

2021.2.1

What was the last working version of Home Assistant Core?

2021.1.5

What type of installation are you running?

Home Assistant OS (installed in a Proxmox VM on Intel NUC)

Integration causing the issue

KNX

Link to integration documentation on our website

https://www.home-assistant.io/integrations/knx/

Example YAML snippet

IPs masked out...

knx:
  tunneling:
    host: '192.168.x.y'
    port: 3671
    local_ip: '192.168.x.z'

Anything in the logs that might be useful for us?

2021-02-06 14:58:58 WARNING (MainThread) [xknx.log] Tunnel connection closed.
2021-02-06 14:59:00 WARNING (MainThread) [xknx.log] Received DisconnectRequest from tunnelling sever.
2021-02-06 14:59:00 ERROR (MainThread) [homeassistant] Error doing job: Exception in callback _SelectorDatagramTransport._read_ready()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/local/lib/python3.8/asyncio/selector_events.py", line 1021, in _read_ready
    self._protocol.datagram_received(data, addr)
  File "/usr/local/lib/python3.8/site-packages/xknx/io/udp_client.py", line 70, in datagram_received
    self.data_received_callback(data)
  File "/usr/local/lib/python3.8/site-packages/xknx/io/udp_client.py", line 115, in data_received_callback
    self.handle_knxipframe(knxipframe)
  File "/usr/local/lib/python3.8/site-packages/xknx/io/udp_client.py", line 122, in handle_knxipframe
    callback.callback(knxipframe, self)
  File "/usr/local/lib/python3.8/site-packages/xknx/io/tunnel.py", line 273, in _request_received
    self._disconnect_request_received(knxipframe.body)
  File "/usr/local/lib/python3.8/site-packages/xknx/io/tunnel.py", line 321, in _disconnect_request_received
    self._tunnel_lost()
  File "/usr/local/lib/python3.8/site-packages/xknx/io/tunnel.py", line 129, in _tunnel_lost
    raise CommunicationError("Tunnel connection closed.")
xknx.exceptions.exception.CommunicationError: Tunnel connection closed.
2021-02-06 14:59:00 WARNING (MainThread) [xknx.log] Error: KNX bus did not respond in time (2.0 secs) to GroupValueRead request for: 1/2/25

I did have the problem of disconnecting the KNX bus from time to time, because I use the KNX IP gateway with another device. But I moved HA to an exclusive gateway now and it still does not work with 2021.2.1 and breaks after a very short time.

probot-home-assistant[bot] commented 3 years ago

knx documentation knx source (message by IssueLinks)

mkliche commented 3 years ago

I have exactly the same error in my log file. It appears there exactly every 12 hours, around 2:05 p.m. and around 2:05 a.m. I suspect it has something to do with a certain KNX telegram, but I don't yet know which exactly is the cause.

splunkenizer commented 3 years ago

@mkliche is the KNX connection lost as well after you got the message?

mkliche commented 3 years ago

Yes, KNX connection gets lost as well. Sorry, forgot to mention.

farmio commented 3 years ago

@splunkenizer do you have any errors from knx / xknx in the logs when running 2021.1.x in the same intervals? Something like "Heartbeat to KNX bus failed. Reconnecting."?

@mkliche do you use a dedicated tunneling configuration for knx as well or auto-discovery?

splunkenizer commented 3 years ago

@farmio I did not find messages like heartbeat or similar in the logs. Is the bug #46178 also helping for a dedicated config? I wasn't aware there is an auto discovery config available. I have changed it to that one now, which is working as well on my 2021.1.5.

farmio commented 3 years ago

Im afraid not. In dedicated tunneling connections auto_reconnect is already on. This is what I'm not understanding about your bug here. Auto discovery is just not having tunneling: or routing: in the configuration.

Would it be possible for you to turn logging for xknx to debug on 2021.2 until the connection dies and send me these logs? I can't reproduce this bug on my system.

farmio commented 3 years ago

What else do you have configured, besides tunneling:? Are you using

knx: 
  config_file: 

in your configuration?

splunkenizer commented 3 years ago

@farmio Yes, I'm using this option in my settings. It looks like this:

knx:
  config_file: 'xknx.yaml'
  light: !include knx_lights.yaml
  cover: !include knx_covers.yaml
  climate: !include knx_climates.yaml
  switch: !include knx_switches.yaml
  sensor: !include knx_sensors.yaml
  scene: !include knx_scenes.yaml
  binary_sensor: !include knx_binary_sensors.yaml
  weather: !include knx_weather.yaml

and in xknx.yaml there is the tunneling setting (I masked out the IPs here).

tunneling:
  host: '192.168.x.y'
  port: 3671
  local_ip: '192.168.x.z'

'host' is the IP of my tunneling device. At 'local_ip' I specify the IP of my HA instance.

farmio commented 3 years ago

Great! Bug will be fixed in next xknx version https://github.com/XKNX/xknx/pull/600 In the meantime please move the tunneling configuration to the HA config file and remove xknx.yaml. See https://www.home-assistant.io/integrations/knx/#tunneling

You should nevertheless find out why your IP interface sends a DisconnectRequest.

And also, nobody is going to steal your private IP address space ;)

splunkenizer commented 3 years ago

The config without config_file should work on 2021.2 then? I guess the tunneling device is sending disconnect, because I have another IP device talking to the same one, but my KNX IP gateway can only handle one channel. But I switched HA to another IP gateway, which is exclusive for HA, but the issue was the same.

splunkenizer commented 3 years ago

I checked that config_file documentation and saw, that the config I had in my xknx.yaml was wrong. It's does not have the same sections as specified in the knx: section in configuration.yaml. I did understand this wrong and thought it's just to store the same config in an extra file. So, I assume xknx.yml, was ignored and I was on auto_detect always.

farmio commented 3 years ago

Possible. I'm not entirely sure how this behaves with wrong configurations 🙃 But the point still stands: don't use config_file: at all - then your entire configuration is also Schema-checked.

mkliche commented 3 years ago

My config in knx.yaml looks similar to the one of @splunkenizer, but my xknx.yaml looks different:

routing:
  local_ip: '192.168.178.10'
farmio commented 3 years ago

@mkliche this is also a non-working configuration then and same suggestion applies.

splunkenizer commented 3 years ago

I have upgraded my installation again to 2021.2.1 and updated the config to not use the config_file setting. The KNX connection was never lost in the last couple of hours, even if I got some DisconnectRequest requests from the tunnelling server. I will check with the other IP gateway as well. I'm very happy, we found the issue with my configuration and also solved a bug at the same time. Thanks everyone for the great support. This is just an awesome project and community. Keep up the great work.

mkliche commented 3 years ago

Okay, I think I understand the problem now. Don't know if it belongs here, but there is still another problem in my environment which seems related to this one: Only the tunneling connection is working in my environment from Home Assistant, but not the multicast connection.

In auto-discover-mode I got something like this in my log files:

2021-02-09 17:45:33 DEBUG (MainThread) [xknx.knx] Sending: <KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="SEARCH_REQUEST" Reserve="0" TotalLength="14" />
 body="<SearchRequest discovery_endpoint="<HPAI 224.0.23.12:3671 />" />" />
2021-02-09 17:45:33 INFO (MainThread) [xknx.log] Could not connect to an KNX/IP device on vethb09dbb3
2021-02-09 17:45:33 INFO (MainThread) [xknx.log] Could not connect to an KNX/IP device on veth9521bf8
2021-02-09 17:45:33 INFO (MainThread) [xknx.log] Could not connect to an KNX/IP device on vethf59d4f5
2021-02-09 17:45:33 INFO (MainThread) [xknx.log] Could not connect to an KNX/IP device on veth5b23ebd
2021-02-09 17:45:34 DEBUG (MainThread) [xknx.log] Using interface: enp3s0
2021-02-09 17:45:34 DEBUG (MainThread) [xknx.log] Starting tunnel from 192.168.178.10 to 192.168.178.11:3671

It's trying to open a multicast connection, that seems not to work, so it falls back to a tunneling connection. If I set a dedicated routing (multicast) connection in my config like this

routing:
  local_ip: "192.168.178.10"

it does not work at all.

So why isn't it possible to establish a multicast connection from Home Assistant (in a Docker container, host network)? From ETS on my Windows machine the multicast connection works like a charme.

farmio commented 3 years ago

It's not trying to open a routing connection. Tunneling is the default. Your previous config was invalid, so it fell back to auto discovery too -> tunneling.

Auto-discovery sends a multicast packet (SearchRequest) which is answered in Unicast. This seems to work, so I guess only receiving multicast is broken.

Did you set a valid IA for xknx and configure a dummy application in ETS for it?

mkliche commented 3 years ago

Did you set a valid IA for xknx and configure a dummy application in ETS for it?

No, I didn't even know that this may be needed. I just tried after your post, but it didn't work either.

I got this in my log, but the connection to KNX did not work, sensors did not update etc.:

2021-02-09 21:02:15 INFO (MainThread) [xknx.log] XKNX v0.16.2 starting routing connection to KNX bus.
2021-02-09 21:02:15 DEBUG (MainThread) [xknx.log] Starting Routing from 192.168.178.10 as 1.1.240`

I even don't know anything about this dummy application in ETS, I guess I should learn something about KNX routing first ;-) In the meantime I will go for tunneling, looking forward to get rid of the disconnect problems in v2021.2.2.

farmio commented 3 years ago

I guess I should learn something about KNX routing first ;-)

👍

I think this issue can be closed!?

splunkenizer commented 3 years ago

My environment is stable since I corrected the config. Auto reconnect for discovery is also rolled out with 2021.2.2.