home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
71.9k stars 30.12k forks source link

ElkM1 integration is repeatedly connecting and disconnecting starting with 2022.3.0b0 #67183

Closed rccoleman closed 2 years ago

rccoleman commented 2 years ago

The problem

After updating to 2022.3.0b0 and continuing with b1, the ElkM1 integration is repeatedly connecting and disconnecting to my 7-year-old M1 Gold panel with M1XEP Ethernet interface. Rolling back to 2022.2.9 immediately fixes the issue and my panel is again fully functional from HA.

What version of Home Assistant Core has the issue?

core-2022.3.0b1

What was the last working version of Home Assistant Core?

2022.2.9

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ElkM1

Link to integration documentation on our website

https://www.home-assistant.io/integrations/elkm1/

Diagnostics information

I don't see a functional diagnostics link on the device page (the one I see says "unavailable"). Here's the other information reported about my panel:

Device info M1 by ELK Products, Inc. Firmware: 5.3.10

Example YAML snippet

The integration was set up in the UI and there's no additional configuration.

Anything in the logs that might be useful for us?

I was asked to provide the logs with debug for the component and the supporting elkm1_lib package, and I'll try to attach the repeating snippet here.

Additional information

No response

probot-home-assistant[bot] commented 2 years ago

elkm1 documentation elkm1 source (message by IssueLinks)

probot-home-assistant[bot] commented 2 years ago

Hey there @gwww, @bdraco, mind taking a look at this issue as it has been labeled with an integration (elkm1) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)

rccoleman commented 2 years ago

elk.log

It seems like the same pattern mostly repeats, but I did a quick diff between two instances and this was the result:

➜  config git:(main) ✗ diff elk-strip.log elk2-strip.log
35c34
<  DEBUG (MainThread) [elkm1_lib.elk] got_data '66LW1101090000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000085'
---
>  DEBUG (MainThread) [elkm1_lib.elk] got_data '66LW1101150000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000088'
563c562
<  DEBUG (MainThread) [elkm1_lib.elk] got_data '16XK2038085240222010006F'
---
>  DEBUG (MainThread) [elkm1_lib.elk] got_data '16XK2149085240222010006C'

I installed v1.1.0 of elkm1_lib and the behavior didn't change. I couldn't go back to 1.0.0 because of the discovery changes in the HA component, so I created a custom_component of the 2022.2.9 version and my Elk integration is working again in 2022.3.0b1.

gwww commented 2 years ago

Thanks for the report and the logs. My first guess is that this is related to the new discovery code that @bdraco added a few weeks ago. I'm looking at the code now.

bdraco commented 2 years ago

@rccoleman Can you also provide a copy of the config entry in .storage/core.config_entries with the username/password removed.

I'm wondering if the new discovery is somehow causing it to reload every time its discovered (should only happen once)

rccoleman commented 2 years ago

Sure. The username and password fields were already blank, BTW.

            {
                "entry_id": "1476c0e8dee74db593dacce749e19a13",
                "version": 1,
                "domain": "elkm1",
                "title": "ElkM1",
                "data": {
                    "host": "elk://192.168.1.229",
                    "area": {
                        "include": [
                            [
                                1,
                                1
                            ]
                        ],
                        "exclude": [],
                        "enabled": true
                    },
                    "zone": {
                        "include": [
                            [
                                1,
                                15
                            ],
                            [
                                17,
                                25
                            ]
                        ],
                        "exclude": [],
                        "enabled": true
                    },
                    "keypad": {
                        "include": [
                            [
                                1,
                                3
                            ]
                        ],
                        "exclude": [],
                        "enabled": true
                    },
                    "counter": {
                        "enabled": false,
                        "exclude": [],
                        "include": []
                    },
                    "output": {
                        "include": [
                            [
                                65,
                                65
                            ]
                        ],
                        "exclude": [],
                        "enabled": true
                    },
                    "task": {
                        "enabled": false,
                        "exclude": [],
                        "include": []
                    },
                    "thermostat": {
                        "enabled": false,
                        "exclude": [],
                        "include": []
                    },
                    "plc": {
                        "enabled": false,
                        "exclude": [],
                        "include": []
                    },
                    "setting": {
                        "enabled": false,
                        "exclude": [],
                        "include": []
                    },
                    "password": "",
                    "prefix": "",
                    "temperature_unit": "\u00b0F",
                    "username": "",
                    "auto_configure": false
                },
                "options": {},
                "pref_disable_new_entities": false,
                "pref_disable_polling": false,
                "source": "import",
                "unique_id": "00:40:9d:8c:86:c8",
                "disabled_by": null
            },
bdraco commented 2 years ago

It looks like discovery is working as expected as the unique_id is populated.

rccoleman commented 2 years ago

I've had it set up via YAML for a few years and this is after adding the 2022.2.9 version as a custom_component and restarting, if that matters. I removed the YAML config after it was imported.

bdraco commented 2 years ago

I've dug through it a bit and a can't find the flaw. I'll make a custom component with additional logging and upload it shortly. Maybe that will help given us an idea what is going on

bdraco commented 2 years ago
mkdir -p /config/custom_components
cd config/custom_components
git clone https://github.com/bdraco/elkm1_custom elkm1
logger:
  default: warning
  logs:
     elkm1_lib: debug
     custom_components.elkm1: debug
rccoleman commented 2 years ago

The additional logs show this:

2022-02-24 10:26:08 DEBUG (MainThread) [elkm1_lib.discovery] discover: ('255.255.255.255', 2362) => b'XEPID'
2022-02-24 10:26:08 DEBUG (MainThread) [elkm1_lib.discovery] discover: ('192.168.1.229', 2362) <= b'M1XEP\x00@\x9d\x8c\x86\xc8\xc0\xa8\x01\xe5\n)M1XEP\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x01'
2022-02-24 10:26:11 DEBUG (MainThread) [custom_components.elkm1.config_flow] Elk discovered from integration discovery: ElkSystem(mac_address='00:40:9D:8C:86:C8', ip_address='192.168.1.229', port=2601)
2022-02-24 10:26:11 DEBUG (MainThread) [custom_components.elkm1.discovery] Unique id is already present from discovery: ElkSystem(mac_address='00:40:9D:8C:86:C8', ip_address='192.168.1.229', port=2601)
2022-02-24 10:26:16 DEBUG (MainThread) [custom_components.elkm1] Timed out waiting for login event

I thought that earlier I was connecting to the non-secure port (2101), but I don't know for sure. Seems to happen consistently, but with some different logs around it. Here's another instance:

2022-02-24 10:27:55 DEBUG (MainThread) [elkm1_lib.elk] got_data '16XK54271052402220100071'
2022-02-24 10:27:57 DEBUG (MainThread) [custom_components.elkm1] Timed out waiting for login event
2022-02-24 10:27:57 DEBUG (MainThread) [elkm1_lib.proto] disconnected callback
bdraco commented 2 years ago

I increased the login timeout to 30s. Please try

cd config/custom_components
git pull

Verify its the latest

git log -1
commit 97cd8af2977b231e3d549843491b40ac7e5ef21f (HEAD -> main, origin/main)
Author: J. Nick Koston <nick@koston.org>
Date:   Thu Feb 24 08:30:53 2022 -1000

Restart

rccoleman commented 2 years ago

Confirmed that I'm using the new version and I see an increased delay, but still seeing the same:

2022-02-24 10:35:05 DEBUG (MainThread) [elkm1_lib.proto] write_data '0Bsd021990052'
2022-02-24 10:35:05 DEBUG (MainThread) [elkm1_lib.elk] got_data '1BSD02199USER 199        008F'
2022-02-24 10:35:05 DEBUG (MainThread) [elkm1_lib.proto] write_data '0Bsd022000063'
2022-02-24 10:35:05 DEBUG (MainThread) [elkm1_lib.elk] got_data '1BSD02000                00A4'
2022-02-24 10:35:05 DEBUG (MainThread) [elkm1_lib.proto] write_data '0Cua0000000037'
2022-02-24 10:35:05 DEBUG (MainThread) [elkm1_lib.elk] got_data '17IC000000000000000030079'
2022-02-24 10:35:05 DEBUG (MainThread) [elkm1_lib.elk] got_data '19UA0000000030309F4941F00C3'
2022-02-24 10:35:14 DEBUG (MainThread) [custom_components.elkm1.config_flow] Elk discovered from dhcp: ElkSystem(mac_address='00409d8c86c8', ip_address='192.168.1.229', port=0)
2022-02-24 10:35:14 DEBUG (MainThread) [custom_components.elkm1.discovery] Unique id is already present from discovery: ElkSystem(mac_address='00409d8c86c8', ip_address='192.168.1.229', port=0)
2022-02-24 10:35:19 INFO (MainThread) [hass_nabucasa.google_report_state] Connected
2022-02-24 10:35:25 DEBUG (MainThread) [elkm1_lib.elk] got_data '16XK24351052402220100075'
2022-02-24 10:35:27 DEBUG (MainThread) [custom_components.elkm1] Timed out waiting for login event
2022-02-24 10:35:27 DEBUG (MainThread) [elkm1_lib.proto] disconnected callback
2022-02-24 10:35:27 WARNING (MainThread) [elkm1_lib.elk] ElkM1 at elk://192.168.1.229 disconnected

Another instance:

2022-02-24 10:36:31 DEBUG (MainThread) [elkm1_lib.elk] got_data '1BSD02000                00A4'
2022-02-24 10:36:31 DEBUG (MainThread) [elkm1_lib.proto] write_data '0Cua0000000037'
2022-02-24 10:36:31 DEBUG (MainThread) [elkm1_lib.elk] got_data '17IC000000000000000030079'
2022-02-24 10:36:31 DEBUG (MainThread) [elkm1_lib.elk] got_data '19UA0000000030309F4941F00C3'
2022-02-24 10:36:43 INFO (MainThread) [homeassistant.components.websocket_api.http.connection] [139850416612784] Connection closed by client
2022-02-24 10:36:55 DEBUG (MainThread) [elkm1_lib.elk] got_data '16XK54361052402220100071'
2022-02-24 10:36:57 DEBUG (MainThread) [custom_components.elkm1] Timed out waiting for login event
2022-02-24 10:36:57 DEBUG (MainThread) [elkm1_lib.proto] disconnected callback
2022-02-24 10:36:57 WARNING (MainThread) [elkm1_lib.elk] ElkM1 at elk://192.168.1.229 disconnected
bdraco commented 2 years ago

So I think the problem is there is no login event because you have no username and password

rccoleman commented 2 years ago

I realized that I hadn't commented out the YAML as I thought I had, so I just did that. Looking through the docs, it looks like username/password are option for non-secure? And I didn't have them in my YAML config. Further, I see this:

Connection string to Elk of the form ://

[:port]. is elk for non-secure connection, elks for secure TLS 1.0 connection, elksv1_2 for secure TLS 1.2 connection, and serial for serial port connection.

Shouldn't it be using the non-secure port?

2022-02-24 10:37:38 INFO (MainThread) [elkm1_lib.elk] Connecting to ElkM1 at elk://192.168.1.229

bdraco commented 2 years ago

elk is non-secure elks is secure

bdraco commented 2 years ago

Please try the latest:

cd config/custom_components
git pull

Verify its the latest

git log -1
commit 4500fa337bda68f6f88d8dd47bca740f6223bcfa (HEAD -> main, origin/main)
Author: J. Nick Koston <nick@koston.org>
Date:   Thu Feb 24 08:47:41 2022 -1000

    Fix elk systems that do not use password auth

Restart

rccoleman commented 2 years ago

Okay, that seems to have fixed it!

bdraco commented 2 years ago

Thanks for testing. We will get this fix in the next beta.