home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
71.89k stars 30.12k forks source link

ELK M1 entities become UNAVAILABLE after Home Assistant Core update #100892

Closed COCottonwood closed 7 months ago

COCottonwood commented 1 year ago

The problem

After updating to a new Home Assistant CORE release, the ELK M1 entities come up as UNAVAILABLE. This causes any automations referencing my ELK M1 entities to fail.

What version of Home Assistant Core has the issue?

core-2023.9.3

What was the last working version of Home Assistant Core?

core-2323.5

What type of installation are you running?

Home Assistant OS

Integration causing the issue

ELK M1

Link to integration documentation on our website

https://www.home-assistant.io/integrations/elkm1#elk-m1xep-version

Diagnostics information

This was first observed around Core-2023.6

I have tried RESTARTING Home Assistant but that doesn't clear it. I have powered OFF and back ON the ELK M1 system and its M1XEP ethernet interface card, but the ELK M1 entities still remain UNAVAILABLE.

The workaround is to REBOOT (restart the system running Home Assistant and all Add-ons) and then communication is restored.

Example YAML snippet

n/a

Anything in the logs that might be useful for us?

home-assistant.log

2023-09-25 20:08:25.556 ERROR (MainThread) [elkm1_lib.connection] Invalid message 'b'D6ZS33333333333333330049\r\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00''
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/elkm1_lib/connection.py", line 103, in _read_stream
    decoded = decode(line)
              ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elkm1_lib/message.py", line 71, in decode
    raise ValueError(error_msg)
ValueError: Incorrect message length, expected D6, got 16. Msg D6ZS33333333333333330049
2023-09-25 20:08:25.599 ERROR (MainThread) [elkm1_lib.connection] Invalid message 'b'1BSD00001Front Door      0028\r\n''
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/elkm1_lib/connection.py", line 103, in _read_stream
    decoded = decode(line)
              ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/elkm1_lib/message.py", line 71, in decode
    raise ValueError(error_msg)
ValueError: Message invalid

Additional information

ELK M1EXP version 2.0.48 on TLS 1.2

System Information

version core-2023.9.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.11.5
os_name Linux
os_version 6.1.21-v8
arch aarch64
timezone America/New_York
config_dir /config
Home Assistant Cloud logged_in | false -- | -- can_reach_cert_server | ok can_reach_cloud_auth | ok can_reach_cloud | ok
Home Assistant Supervisor host_os | Home Assistant OS 10.5 -- | -- update_channel | stable supervisor_version | supervisor-2023.09.2 agent_version | 1.5.1 docker_version | 23.0.6 disk_total | 58.0 GB disk_used | 7.2 GB healthy | true supported | true board | rpi4-64 supervisor_api | ok version_api | ok installed_addons | Samba Backup (5.2.0), Duck DNS (1.15.0), File editor (5.6.0), NGINX Home Assistant SSL proxy (3.5.0), Samba share (10.0.2), Z-Wave JS (0.1.90)
Dashboards dashboards | 6 -- | -- resources | 0 views | 21 mode | storage
Recorder oldest_recorder_run | September 13, 2023 at 2:06 AM -- | -- current_recorder_run | September 25, 2023 at 8:27 PM estimated_db_size | 153.98 MiB database_engine | sqlite database_version | 3.41.2
home-assistant[bot] commented 1 year ago

Hey there @gwww, @bdraco, mind taking a look at this issue as it has been labeled with an integration (elkm1) you are listed as a code owner for? Thanks!

Code owner commands Code owners of `elkm1` can trigger bot actions by commenting: - `@home-assistant close` Closes the issue. - `@home-assistant rename Awesome new title` Renames the issue. - `@home-assistant reopen` Reopen the issue. - `@home-assistant unassign elkm1` Removes the current integration label and assignees on the issue, add the integration domain after the command.

(message by CodeOwnersMention)


elkm1 documentation elkm1 source (message by IssueLinks)

gwww commented 1 year ago

I will need a debug log. See this post for how to configure: https://community.home-assistant.io/t/elk-m1-interface/4461/983

COCottonwood commented 1 year ago

Its not an issue right now so I am not sure what DEBUG logging will get you today. The problem clears if I REBOOT the entire system. It only happens with a Home Assistant UPDATE, and not every update triggers it. I did edit my issue to add the only logfile entries I found about ELK (see logs entry in the issue section).

I have ENABLED the ELK M1 DEBUG logging so hopefully we can catch it with the upcoming 2023.10 update.

COCottonwood commented 11 months ago

It happened again with today's Update (1 Nov2023) [Core 2023.11.0 / Supervisor 2023.10.1 / OS 11.1 / Frontend 20231030.1 ]. I didn't catch the failure for several hours... and then i re-booted without dumping the logs first, sorry.

Here is what I found in "home-assistant.log.1" file in the config folder... (I censored the Elk IP ):

2023-11-01 21:38:42.285 ERROR (MainThread) [elkm1_lib.connection] Invalid message 'b'D6ZS33333333333333330049\r\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'' Traceback (most recent call last): File "/usr/local/lib/python3.11/site-packages/elkm1_lib/connection.py", line 103, in _read_stream decoded = decode(line) ^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/elkm1_lib/message.py", line 71, in decode raise ValueError(error_msg) ValueError: Incorrect message length, expected D6, got 16. Msg D6ZS33333333333333330049 2023-11-01 21:38:42.479 ERROR (MainThread) [elkm1_lib.connection] Invalid message 'b'1BSD00001Front Door 0028\r\n'' Traceback (most recent call last): File "/usr/local/lib/python3.11/site-packages/elkm1_lib/connection.py", line 103, in _read_stream decoded = decode(line) ^^^^^^^^^^^^ File "/usr/local/lib/python3.11/site-packages/elkm1_lib/message.py", line 71, in decode raise ValueError(error_msg) ValueError: Message invalid 2023-11-01 22:30:16.233 WARNING (MainThread) [elkm1_lib.connection] ElkM1 at elksv1_2:// disconnecting 2023-11-01 22:30:34.667 WARNING (MainThread) [elkm1_lib.connection] Error connecting to ElkM1 ([Errno 113] Connect call failed ('', 2601)). Retrying in 1 seconds 2023-11-01 22:30:36.250 WARNING (MainThread) [elkm1_lib.connection] ElkM1 at elksv1_2:// disconnecting 2023-11-01 22:30:38.731 WARNING (MainThread) [elkm1_lib.connection] Error connecting to ElkM1 ([Errno 113] Connect call failed ('', 2601)). Retrying in 2 seconds 2023-11-01 22:30:41.803 WARNING (MainThread) [elkm1_lib.connection] Error connecting to ElkM1 ([Errno 113] Connect call failed ('', 2601)). Retrying in 4 seconds 2023-11-01 22:30:41.805 WARNING (MainThread) [elkm1_lib.connection] Error connecting to ElkM1 ([Errno 113] Connect call failed ('', 2601)). Retrying in 1 seconds 2023-11-01 22:30:44.875 WARNING (MainThread) [elkm1_lib.connection] Error connecting to ElkM1 ([Errno 113] Connect call failed ('', 2601)). Retrying in 2 seconds 2023-11-01 22:30:47.946 WARNING (MainThread) [elkm1_lib.connection] Error connecting to ElkM1 ([Errno 113] Connect call failed ('', 2601)). Retrying in 8 seconds 2023-11-01 22:30:47.948 WARNING (MainThread) [elkm1_lib.connection] Error connecting to ElkM1 ([Errno 113] Connect call failed ('', 2601)). Retrying in 4 seconds

gwww commented 11 months ago

Anything is possible, however I suspect the integration is not broken. Maybe something to do with your network?

I’m guessing a proper log won’t reveal more than what you’ve shared already but if you can post one I’ll take a look at it.

Something else you can test out is use an unsecured connection. It could be something funky with the TLS code.

Maybe something in your setup is off?? Wrong version of some library perhaps. But that seems unlikely since you’re running hassio.

The logs seem clear. A message is received from the Elk that is invalid. Odd to see the load the hex 00 at the end of the message.

It’s also odd that this only happens on an upgrade. Theoretically it’s no different than a reboot. If you can try to replicate other ways that might provide a clue. Maybe try unplugging and plugging in your HA box.

Given there’s hundreds of people running this I think there’s something about your setup that’s different.

COCottonwood commented 11 months ago

Thank you so much for taking the time to provide that advice. I feel your suspicion of the network connection may be in the right direction. How would I get HA and ELK to try to re-establish its LAN connection like it does with a re-boot? Just simply disconnecting the LAN cable from my ELK (how long should I leave it unplugged before reconnecting?)?

My home network is controlled by a Unifi Secure Gateway.

Robert Apodaca, Please excuse my phone's autocorrect nonsense.

-------- Original message -------- From: Glenn Waters @.> Date: 11/1/23 10:29 PM (GMT-07:00) To: home-assistant/core @.> Cc: COCottonwood @.>, Author @.> Subject: Re: [home-assistant/core] ELK M1 entities become UNAVAILABLE after Home Assistant Core update (Issue #100892)

Anything is possible, however I suspect the integration is not broken. Maybe something to do with your network?

I’m guessing a proper log won’t reveal more than what you’ve shared already but if you can post one I’ll take a look at it.

Something else you can test out is use an unsecured connection. It could be something funky with the TLS code.

Maybe something in your setup is off?? Wrong version of some library perhaps. But that seems unlikely since you’re running hassio.

The logs seem clear. A message is received from the Elk that is invalid. Odd to see the load the hex 00 at the end of the message.

It’s also odd that this only happens on an upgrade. Theoretically it’s no different than a reboot. If you can try to replicate other ways that might provide a clue. Maybe try unplugging and plugging in your HA box.

Given there’s hundreds of people running this I think there’s something about your setup that’s different.

— Reply to this email directly, view it on GitHubhttps://github.com/home-assistant/core/issues/100892#issuecomment-1790046392, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AWOTKEBRNQSCGLVZHNW3HFTYCMOS7AVCNFSM6AAAAAA5G5CRGSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOJQGA2DMMZZGI. You are receiving this because you authored the thread.Message ID: @.***>

gwww commented 11 months ago

Reboot your HA box. Settings… System… power button in top right… advanced…. Reboot

Or unplug power to the box and plug in again.

issue-triage-workflows[bot] commented 8 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.