Closed arcano81 closed 2 years ago
@josiasmontag Don't know what causes this, but every Frame is listed twice in your capture. It's most probably some capturing configuration issue so I'll ignore this 🤷♂️
The first disconnect is different, the other 2 seem also to be caused by a blocked loop. This picture is showing the 2nd disconnect.
You can see the Telegram 33 was ACKed in 0.000696 seconds. Telegram 34 didn't get an ACK in time (~1 second) and is repeated by the KNX interface. After an additional second the KNX interface sends a DisconnectRequest - as it should. After additional 0.2 seconds xknx is unblocked again and works its queue - it ACKs both Telegrams and answers the DisconnectRequest (which both is pointless since we don't have a connection anymore anyway).
What happened between 33 and 34 remains a mystery 🤷
I see the same problem - interestingly only with one device (kitchen window constantly switching from off to unavailable and back).
Just captured a disconnect with the KNX-only HA instance. It looks different:
You could try to checkout and run https://github.com/home-assistant/core/pull/63766 and see if it logs anything. This should find integrations that block the event loop by using time.sleep
(my assumption is that some integrations lib does this from a discovery flow or something like that - I could be very wrong though).
@josiasmontag Yes it looks like the first disconnect of the other capture. I have no idea why this occurs.
I'd try to find out why you see all packets twice.
https://github.com/XKNX/xknx/pull/847 would also be worth a try - its a more or less experimental version of xknx which puts its connection (acking etc.) logic in a separate thread. It works fine on my system (but then again the current release version also works fine here 🤷).
XKNX/xknx#847 would also be worth a try - its a more or less experimental version of xknx which puts its connection (acking etc.) logic in a separate thread. It works fine on my system (but then again the current release version also works fine here 🤷).
This is a great news! I would like to try it on my test and production system. @farmio Can you explain me how to install the experimental version?
@arcano81 see https://xknx.io/home_assistant.html#running-ha-with-local-xknx-library It should be enough to git clone, checkout the PR and set the environment variable.
It will probably only work with a venv installation - Container will probably be more complicated and HaOS I have no idea.
dev
. It's easier to just try current HA dev branch and see if that logs any warnings.See https://developers.home-assistant.io/docs/development_environment
Small update for the experimental xknx version: We decided to ship this, so we disabled threading by default again. If you want to try manually set threaded=True
in your local clone here https://github.com/XKNX/xknx/blob/main/xknx/io/connection.py
I switched my production installation to dev and updated the Core and the Supervisor. The system is up and running since 10 hours and received 7 Disconnect Request from tunneling server but no warning for usage of time.sleep blocking the main event loop in the register.
Hm... it is not included in 20220111 - was merged after that build - it should be in todays build though.
That said there are other blocking calls than time.sleep
which are not detected by this - it's just picking low-hanging fruit.
Hello all,
since I upgraded my NAS with RAM, the KNX interface is running stable.
-> my NAS no longer needs virtual memory -> when taking a snapshot from my VM I still lose the connection -> If you have a NAS, do not use the shared memory.
Stable with HA 2021.12.9 and 2021.12.10
My problem disappeared when commenting our DWD app. (They were blocking the event loop, causing disconnects which then made some entities to be flaky.)
So a new beta has started yesterday. Please try it and see if the connection problems with KNX disappear - or if there are any log messages regarding "timer_out_of_sync" (homeassistant.core debug) or similar.
Great news !Thankyou. How can I get this beta version for a try?
That depends on your installation method. A quick search at Google or HA community forum will tell you.
I updated my production instance of HA to last beta versione (2022.2.0b3). The problem with knx disconnection became even worse than before.With this beta version diconnections happens every 2/3 seconds. I attach here the log from HA. Log KNX beta version (2022.2.0b3).log
@arcano81: Can you please add the KNX diagnostics to the issue? You can download it on the integrations page after clicking on the three dots on the KNX integration.
Also, please send another log with
debugpy:
active in your configuration.yaml
and add debug logging also to homeassistant.core
please.
@arcano81 thanks for these logs, we found a bug in the route_back logic. Try disabling route_back - you probably don't need it as your HA device and interface are on the same network anyway.
@farmio Disabling the route back solved the problem of disconnection every 2/3 seconds. Still present the usual behaviour with random diconnections. @marvin-w I attach here the KNX diagnostics config_entry-knx-e0f52016e11797923387412866023651.zip and here the HA log with debugpy and homeassistant.core. (The logs start from bootstrap of HA and ends 10 minutes after KNX tunneling disconnection). You can find a disconnection @ 2022-01-30 22:44:46 Ha Log (2022.2.0b3).zip
Last questions: How do I use the individual address in KNX configuration?Which address do I have to use? Thanx
@arcano81
How do I use the individual address in KNX configuration?Which address do I have to use?
You don't need to configure any address here. A tunnelling server should assign an address for a client at connection. Some (rare) don't, yours does: "current_address": "1.0.6"
If you would be using routing you'd need to set it.
What is this entity: entity_id=sensor.ntc_temperature_01
which integration does trigger it and do you have any automations or anything using it? I don't know if this is coincidence, but it changes state near every timer_out_of_sync
event or DisconnectRequest.
It doesn't seem debugpy
is activated in these logs. It is not a logger but an integration, you'd need to add debugpy:
to your configuration.yaml - see https://www.home-assistant.io/integrations/debugpy/
The entity_id=sensor.ntc_temperature_01
comes frome ESPHome integration.
It's a temperature sensor to acquire data from the hot water boiler. It's not used in any automation, just for showing up its value in the dashboard.
I correctly added debugpy:
in configuration.yaml and restarted...
I'll let you have some new logs tomorrow !
Thankyou
@farmio
2022-01-31 00:29:39 WARNING (MainThread) [homeassistant.util.async_] Detected blocking call inside the event loop. This is causing stability issues. Please report issue for debugpy doing blocking calls at homeassistant/components/debugpy/__init__.py, line 49: debugpy.listen((conf[CONF_HOST], conf[CONF_PORT]))
The logs is showing this line and HA has become instable, very slow at bootstrap.
Try to disable debugpy:
and see what happens.
This has just been fixed in beta 4 (#65252). But I think it's expected that HA could run noticeably slower when the debugger is attached.
Upgraded to HA 2022.2.0b4 Now the system boots correctly. It's a bit slower but everything is up and running. The KNX disconnection now happens again every 10 seconds even if rout back is disabled. Here you can find logs from HA and diagnostic form KNX integration. config_entry-knx-e0f52016e11797923387412866023651.json.zip home-assistant (2022.2.0b4)+debugpy.zip
I will roll back to HA 2022.2.0b3 which has a tolerable behaviour with KNX disconnection while waiting for a fix. Thanx
@arcano81 to me it seems your HA instance is just a tick too big for the Raspberry, but I could be totally wrong there. There are just so many timeout occasions and asyncio warnings on different parts of the system.
I'd like to try to enable threading for xknx on your HA to see if it fixes at least knx, if you are interested feel free to ping me at Discord farmio#5918
(or at xknx discord)
Dear Folks,
I have the same issue.
My environment:
HA 32bit Core Version: 2021.12.10 on Raspi 4 4 GB LAN Connection MDT SCN-IP000.03 KNX IP Interface FW Version 3.03. Application Programm Vers. 2.1
Till yesterday this error occurs nearly every 5 -6 minutes After farmio gives me the hint I investigated for a couple of hours.
I Found the following issues:
Node-Red I use it as workflow engine in homeassistant.
Normally I use the ultimate knx plugin which creates on own knx connection next to homeassistant. Unfortunately, you can not set a tunneling group address as well as for xknx
xknx.exceptions.exception.CouldNotParseTelegram: <CouldNotParseTelegram description="payload invalid" destination_address="3/1/5" device_name="LED WHZ Voute Außen" feature_name="State" payload="<GroupValueWrite value="<DPTArray value="[0x3]" />" />" source_address="1.1.12"/>
After that happens the connection got lost.
2022-02-02 22:33:43 WARNING (MainThread) [xknx.log] Received DisconnectRequest from tunnelling sever.
Integrations
I've had a couple of buggy integrations that keep trying to connect to a network endpoint and fail.
2022-02-02 22:33:46 WARNING (MainThread) [homeassistant.components.onvif] Couldn't connect to camera 'test', but will retry later. Error: All connection attempts failed 2022-02-02 22:33:51 ERROR (MainThread) [xknx.log] Unexpected xknx error while processing incoming telegram <Telegram direction="Incoming" source_address="1.1.12" destination_address="3/1/5" payload="<GroupValueWrite value="<DPTArray value="[0x3]" />" />" /> homeassistant.exceptions.HomeAssistantError: Error when calling async_set_default for bulb Yeelight Ceiling10 0x652d1fb at 192.168.1.33: {'code': -5000, 'message': 'general error'}
I fixed that or deactivated the integrations.
For now, it looks much better. The last disconnection was 15 hours ago
--
The problem
After upgrading to HA last version 2021.11.1 the KNX entities continue switching from availble to Not avalable
What version of Home Assistant Core has the issue?
2021.11.1
What was the last working version of Home Assistant Core?
2021.10.9
What type of installation are you running?
Home Assistant OS
Integration causing the issue
KNX
Link to integration documentation on our website
https://www.home-assistant.io/integrations/knx/
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
No response