home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Matter Server: All device offline all of a sudden #126136

Open 3oris opened 2 weeks ago

3oris commented 2 weeks ago

The problem

After about 5 days of operation, all Matter devices become unavailable. The devices are still online in the other (Google Home) fabric, though.

The devices are still pingable from the device info page, and if I ping a device, that specific device comes back online.

Doing this manually is not feasible, though, with over 90 Matter devices in the system.
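
Pinging each node by hand does not scale to 90 devices. As a rough sketch, the same ping could be scripted against the Matter Server's WebSocket API. The message shape (`message_id`, `command`, `args`) and the `ping_node` command name are assumptions about the python-matter-server API and should be checked against its documentation; `build_command` and `ping_commands` are hypothetical helper names.

```python
import json


def build_command(message_id, command, **args):
    """Serialize one command message (assumed shape: message_id, command, args)."""
    return json.dumps(
        {"message_id": str(message_id), "command": command, "args": args}
    )


def ping_commands(node_ids, first_id=1):
    """Build one `ping_node` message per node ID, each with its own message_id."""
    return [
        build_command(i, "ping_node", node_id=node_id)
        for i, node_id in enumerate(node_ids, start=first_id)
    ]


if __name__ == "__main__":
    # Node IDs are placeholders; real IDs come from the Matter Server UI.
    for message in ping_commands([1, 2, 3]):
        print(message)
```

Each string would then be sent over a WebSocket connection to the Matter Server with any WebSocket client. The endpoint (often `ws://<host>:5580/ws` for the add-on) is likewise an assumption.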

Matter devices

Border routers

What version of Home Assistant Core has the issue?

core-2024.9.1

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Matter

Link to integration documentation on our website

No response

Diagnostics information

core_matter_server_2024-09-17T15-42-59.844Z.log matter-c921cb8346a353e6865401775d822fe4-Essentials GU10-80fecbd596935ee1f84171a5c0aac88b.json

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

3oris commented 2 weeks ago

Restarting the matter server fixes it (after some time).

tornenen commented 2 weeks ago

Same problem for me. Using Home Assistant OS and core-2024.9.2

deveylder commented 2 weeks ago

Verify your network settings in Home Assistant. Mine had changed to something completely different. Setting a static address solved the issue.

tornenen commented 2 weeks ago

> Verify your network settings in Home Assistant. Mine had changed to something completely different. Setting a static address solved the issue.

No, still the same for me.

agners commented 1 week ago

@3oris (and others) when the devices go unavailable, does reloading the integration help? Settings -> Devices & services -> Matter -> Three dot menu -> Reload.

What Home Assistant OS and Matter Server add-on version are you using?

home-assistant[bot] commented 1 week ago

Hey there @home-assistant/matter, mind taking a look at this issue as it has been labeled with an integration (matter) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `matter` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Renames the issue.
- `@home-assistant reopen` Reopen the issue.
- `@home-assistant unassign matter` Removes the current integration label and assignees on the issue, add the integration domain after the command.
- `@home-assistant add-label needs-more-information` Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
- `@home-assistant remove-label needs-more-information` Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


matter documentation matter source (message by IssueLinks)

3oris commented 1 week ago

> @3oris (and others) when the devices go unavailable, does reloading the integration help? Settings -> Devices & services -> Matter -> Three dot menu -> Reload.

@agners: will check as soon as it happens again (probably tomorrow or Saturday). Restarting the add-on does help, at least.

> What Home Assistant OS and Matter Server add-on version are you using?

3oris commented 1 week ago

@agners -- Also, I was wondering if it might be a regression in 6.5.1 https://github.com/home-assistant-libs/python-matter-server/pull/882 , but you probably will know anyways.

marcelveldt commented 1 week ago

> @agners -- Also, I was wondering if it might be a regression in 6.5.1 home-assistant-libs/python-matter-server#882 , but you probably will know anyways.

You would have a SEVERE issue with mdns if that cleanup is causing your nodes now to be offline.

What is the state of the nodes within the Matter Server's own UI ?

ThomasKoppensteiner commented 1 week ago

Hello, I think I have a similar issue with the 6.5.1 matter server. I don't see any nodes in the Web UI.

[Two screenshots attached, taken 2024-09-19 at 22:02]

marcelveldt commented 1 week ago

> Hello, I think I have a similar issue with the 6.5.1 matter server. I don't see any nodes in the Web UI.

Well, that is another issue. Maybe you (accidentally) reinstalled the whole Matter integration? You need to restore a backup to get your nodes back as the data is stored in the matter addon data.

3oris commented 1 week ago

> @3oris (and others) when the devices go unavailable, does reloading the integration help? Settings -> Devices & services -> Matter -> Three dot menu -> Reload.

@agners -- So, it happened again. I restarted the integration and the devices came back very, very slowly. Only a few minutes after they were all back, they all disappeared again, and the Matter server was once again in the state of https://github.com/home-assistant/core/issues/124647, which I hadn't seen since the upgrade to 6.5.0b2.

Before I restarted the Matter server I took the logs: matter-server.log

tornenen commented 1 week ago

I guess my problem just went away. After having this issue three times and restarting the Matter server afterwards, it has now been running for 2 days without problems.

3oris commented 1 week ago

> @agners -- Also, I was wondering if it might be a regression in 6.5.1 home-assistant-libs/python-matter-server#882 , but you probably will know anyways.
>
> You would have a SEVERE issue with mdns if that cleanup is causing your nodes now to be offline.
>
> What is the state of the nodes within the Matter Server's own UI ?

@marcelveldt -- Will tell next time it happens.

ThomasKoppensteiner commented 1 week ago

> You need to restore a backup to get your nodes back as the data is stored in the matter addon data.

@marcelveldt yes, I reinstalled the matter integration, but why does a reinstall not create a new node? Isn't this an issue?

If so should I create a new github issue?

ThomasKoppensteiner commented 1 week ago

Resetting my Home Assistant VM to a previous state fixed the problem for me. Now I see the nodes again. Running version 6.4.1 now.

marcelveldt commented 1 week ago

> @marcelveldt yes, I reinstalled the matter integration, but why does a reinstall not create a new node? Isn't this an issue?

If you reinstall the Matter integration, all data gets reset. So you basically destroyed your Matter network by uninstalling Matter from HA.

agners commented 1 week ago

> Resetting my Home Assistant VM to a previous state fixed the problem for me. Now I see the nodes again. Running version 6.4.1 now.

If you do a regular update, the nodes should not get lost. Can you try updating the add-on (again)? Worst case you should be able to restore 6.4.1.

That said, while the outcome of your issue is similar to the original poster, I don't think you suffer the same problem: In your case the store on the Matter Server lost all devices. If this happens with the second update attempt again, can you open a separate issue for this? This would be some type of add-on update issue :thinking:

agners commented 1 week ago

> @agners -- So, it happened again. I restarted the integration and the devices came back very, very slowly. Only a few minutes after they were all back, they all disappeared again, and the Matter server was once again in the state of #124647, which I hadn't seen since the upgrade to 6.5.0b2.

Hm, that sounds like your whole system is completely overwhelmed somehow. I guess the Matter Server doesn't respond in time for the Core, so the Core gives up communicating. I wonder if the Matter Server gets itself into a state where things just go awry.

There are some messages I haven't seen so far, which sound as if the message got corrupted :thinking:

2024-09-20 05:56:18.928 (Dummy-2) CHIP_ERROR [chip.native.EM] Dropping unexpected message of type 0x5 with protocolId (0, 1) and MessageCounter:141254017 on exchange 44431i with Node: <00000000000000E2, 1>

From what I can tell you run this on a Raspberry Pi 3? :thinking: Maybe this is just a bit too much for it to handle :cry:
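
The dropped-message line quoted above identifies the peer as `<00000000000000E2, 1>`, while the Matter Server's own log lines use the form `<Node:N>`. A minimal sketch for matching the two, assuming the bracketed pair is a 16-digit hexadecimal node ID followed by a fabric index; `decode_scoped_node` is a hypothetical helper name:

```python
import re

# Assumed layout: <16 hex digits of node ID, decimal fabric index>
SCOPED_NODE_RE = re.compile(r"<([0-9A-Fa-f]{16}), (\d+)>")


def decode_scoped_node(line):
    """Return (node_id, fabric_index) as integers, or None if no pair is found."""
    match = SCOPED_NODE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2))


LINE = (
    "2024-09-20 05:56:18.928 (Dummy-2) CHIP_ERROR [chip.native.EM] "
    "Dropping unexpected message of type 0x5 with protocolId (0, 1) and "
    "MessageCounter:141254017 on exchange 44431i with Node: <00000000000000E2, 1>"
)

if __name__ == "__main__":
    print(decode_scoped_node(LINE))
```

Under that assumption, hexadecimal `E2` is decimal 226, so the message would belong to the node listed as 226.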

marcelveldt commented 1 week ago

> Resetting my Home Assistant VM to a previous state fixed the problem for me. Now I see the nodes again. Running version 6.4.1 now.
>
> If you do a regular update, the nodes should not get lost. Can you try updating the add-on (again)? Worst case you should be able to restore 6.4.1.
>
> That said, while the outcome of your issue is similar to the original poster, I don't think you suffer the same problem: In your case the store on the Matter Server lost all devices. If this happens with the second update attempt again, can you open a separate issue for this? This would be some type of add-on update issue 🤔

He removed the Matter integration (to reinstall it), but that also removed the Matter add-on along with its configuration. That is how his nodes got lost. It reminds me that we should probably add a confirmation to HA when someone tries to remove Matter, Z-Wave or Zigbee, warning that this may lead to loss of data without a backup.

3oris commented 6 days ago

> @agners -- Also, I was wondering if it might be a regression in 6.5.1 home-assistant-libs/python-matter-server#882 , but you probably will know anyways.
>
> You would have a SEVERE issue with mdns if that cleanup is causing your nodes now to be offline. What is the state of the nodes within the Matter Server's own UI ?
>
> @marcelveldt -- Will tell next time it happens.

@marcelveldt -- they just all show offline in the Matter server add-on UI

3oris commented 6 days ago

> @agners -- So, it happened again. I restarted the integration and the devices came back very, very slowly. Only a few minutes after they were all back, they all disappeared again, and the Matter server was once again in the state of #124647, which I hadn't seen since the upgrade to 6.5.0b2.
>
> Hm, that sounds like your whole system is completely overwhelmed somehow. I guess the Matter Server doesn't respond in time for the Core, so the Core gives up communicating. I wonder if the Matter Server gets itself into a state where things just go awry.
>
> There are some messages I haven't seen so far, which sound as if the message got corrupted 🤔
>
> 2024-09-20 05:56:18.928 (Dummy-2) CHIP_ERROR [chip.native.EM] Dropping unexpected message of type 0x5 with protocolId (0, 1) and MessageCounter:141254017 on exchange 44431i with Node: <00000000000000E2, 1>
>
> From what I can tell you run this on a Raspberry Pi 3? 🤔 Maybe this is just a bit too much for it to handle 😢

@agners -- no, this is Home Assistant running on HA Green. What I run on the RPi3 is the OTBR, which I run isolated from HA and compile myself in order to have some observability into the Thread network via the CLI: channel monitor, TREL connectivity, child node distribution, link quality and so on. This also let me choose a Thread channel with literally no Wi-Fi interference (as far as I can tell). Also, it makes no difference to the Matter fabric if I take the OTBR or any one of the Nest hubs out of the Thread network. (I cannot take two or more TBRs out of the network, though, because then total coverage is too low and the Thread network gets overloaded.)

The points I am trying to make here:

ThomasKoppensteiner commented 4 days ago

> If you do a regular update, the nodes should not get lost. Can you try updating the add-on (again)? Worst case you should be able to restore 6.4.1.
>
> That said, while the outcome of your issue is similar to the original poster, I don't think you suffer the same problem: In your case the store on the Matter Server lost all devices. If this happens with the second update attempt again, can you open a separate issue for this? This would be some type of add-on update issue 🤔

Hey, I did another upgrade to 6.5.1 and this time it works as expected. The old nodes were visible right after the update and were also available soon afterwards. Additionally, I was able to add new Matter devices as well (this was also not working before).

My issue is fixed. Thank you for the support.

AndreasMouskos commented 4 days ago

I have the same issue with my Eve Matter devices (motion, door, energy), starting at the exact time I updated my iPhone to iOS 18 and my HomePod to the latest version. Matter server is also 6.5.1; I have no pending updates on anything in HA, and HA is also on the latest version. My Eve devices work in the Eve app and in the Home app. I also cannot re-add them; for some reason it keeps failing. Here are my logs:

2024-09-27 18:45:23.435 (MainThread) WARNING [matter_server.server.device_controller] <Node:2> Setup for node failed: Unable to establish CASE session with Node 2
2024-09-27 18:45:23.435 (MainThread) INFO [matter_server.server.device_controller] <Node:2> Retrying node setup in 60 seconds...
2024-09-27 18:45:27.963 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964488 on exchange 28264i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:45:34.630 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:45:37.635 (MainThread) INFO [matter_server.server.sdk] <Node:3> Attempting to establish CASE session... (attempt 2 of 2)
2024-09-27 18:46:19.261 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964489 on exchange 28265i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:46:23.438 (MainThread) INFO [matter_server.server.device_controller] <Node:2> Setting-up node...
2024-09-27 18:46:26.609 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:46:26.611 (MainThread) WARNING [matter_server.server.device_controller] <Node:3> Setup for node failed: Unable to establish CASE session with Node 3
2024-09-27 18:46:26.611 (MainThread) INFO [matter_server.server.device_controller] <Node:3> Retrying node setup in 60 seconds...
2024-09-27 18:47:04.691 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964490 on exchange 28266i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:47:12.163 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:47:26.613 (MainThread) INFO [matter_server.server.device_controller] <Node:3> Setting-up node...
2024-09-27 18:47:54.767 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964491 on exchange 28267i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:48:00.684 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:48:03.689 (MainThread) INFO [matter_server.server.sdk] <Node:2> Attempting to establish CASE session... (attempt 2 of 2)
2024-09-27 18:48:07.901 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964492 on exchange 28268i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:48:15.340 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:48:45.746 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964493 on exchange 28269i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:48:52.418 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:48:56.840 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964494 on exchange 28270i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:49:03.867 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:49:06.872 (MainThread) INFO [matter_server.server.sdk] <Node:3> Attempting to establish CASE session... (attempt 2 of 2)
2024-09-27 18:49:32.548 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964495 on exchange 28271i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:49:40.942 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:49:40.944 (MainThread) WARNING [matter_server.server.device_controller] <Node:2> Setup for node failed: Unable to establish CASE session with Node 2
2024-09-27 18:49:40.945 (MainThread) INFO [matter_server.server.device_controller] <Node:2> Retrying node setup in 60 seconds...
2024-09-27 18:49:47.108 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964496 on exchange 28272i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:49:55.714 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:49:55.716 (MainThread) WARNING [matter_server.server.device_controller] <Node:3> Setup for node failed: Unable to establish CASE session with Node 3
2024-09-27 18:49:55.716 (MainThread) INFO [matter_server.server.device_controller] <Node:3> Retrying node setup in 60 seconds...
2024-09-27 18:50:40.947 (MainThread) INFO [matter_server.server.device_controller] <Node:2> Setting-up node...
2024-09-27 18:50:55.719 (MainThread) INFO [matter_server.server.device_controller] <Node:3> Setting-up node...
2024-09-27 18:51:21.878 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964497 on exchange 28273i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:51:29.677 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:51:38.724 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964498 on exchange 28274i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:51:44.442 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:52:12.310 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964499 on exchange 28275i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:52:18.203 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:52:21.207 (MainThread) INFO [matter_server.server.sdk] <Node:2> Attempting to establish CASE session... (attempt 2 of 2)
2024-09-27 18:52:26.424 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964500 on exchange 28276i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:52:32.970 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:52:35.976 (MainThread) INFO [matter_server.server.sdk] <Node:3> Attempting to establish CASE session... (attempt 2 of 2)
2024-09-27 18:53:01.039 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964501 on exchange 28277i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:53:09.931 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:53:17.511 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964502 on exchange 28278i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:53:24.817 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:53:24.819 (MainThread) WARNING [matter_server.server.device_controller] <Node:3> Setup for node failed: Unable to establish CASE session with Node 3
2024-09-27 18:53:24.820 (MainThread) WARNING [matter_server.server.device_controller] <Node:3> Node setup not completed after 30 minutes, giving up.
2024-09-27 18:53:50.917 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964503 on exchange 28279i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:53:58.447 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:53:58.449 (MainThread) WARNING [matter_server.server.device_controller] <Node:2> Setup for node failed: Unable to establish CASE session with Node 2
2024-09-27 18:53:58.450 (MainThread) INFO [matter_server.server.device_controller] <Node:2> Retrying node setup in 60 seconds...
2024-09-27 18:54:58.457 (MainThread) INFO [matter_server.server.device_controller] <Node:2> Setting-up node...
2024-09-27 18:55:42.408 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964504 on exchange 28280i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:55:47.179 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:56:30.409 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964505 on exchange 28281i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:56:35.708 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:56:38.713 (MainThread) INFO [matter_server.server.sdk] <Node:2> Attempting to establish CASE session... (attempt 2 of 2)
2024-09-27 18:57:22.536 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964506 on exchange 28282i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:57:27.442 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:58:09.472 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964507 on exchange 28283i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:58:15.960 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:58:15.962 (MainThread) WARNING [matter_server.server.device_controller] <Node:2> Setup for node failed: Unable to establish CASE session with Node 2
2024-09-27 18:58:15.962 (MainThread) WARNING [matter_server.server.device_controller] <Node:2> Node setup not completed after 30 minutes, giving up.
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service matter-server: stopping
2024-09-27 19:15:30.842 (MainThread) WARNING [aiorun] Stopping the loop
2024-09-27 19:15:30.842 (MainThread) INFO [aiorun] Entering shutdown phase.
2024-09-27 19:15:30.842 (MainThread) INFO [aiorun] Executing provided shutdown_callback.
2024-09-27 19:15:30.842 (MainThread) INFO [matter_server.server.server] Stopping the Matter Server...
2024-09-27 19:15:30.843 (MainThread) INFO [matter_server.server.client_handler] [139977044284496] Connection closed by client
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
2024-09-27 19:15:30.848 (MainThread) INFO [matter_server.server.stack] Shutting down the Matter stack...
2024-09-27 19:15:30.848 (MainThread) CHIP_ERROR [chip.native.CTL] Shutting down the stack...
2024-09-27 19:15:30.850 (MainThread) CHIP_ERROR [chip.native.DIS] Failed to advertise records: src/inet/UDPEndPointImplSockets.cpp:416: OS Error 0x02000065: Network is unreachable
2024-09-27 19:15:30.853 (MainThread) CHIP_ERROR [chip.native.DIS] Failed to advertise records: src/lib/dnssd/minimal_mdns/Server.cpp:344: CHIP Error 0x00000046: No endpoint was available to send the message
2024-09-27 19:15:30.854 (MainThread) CHIP_ERROR [chip.native.DL] Inet Layer shutdown
2024-09-27 19:15:30.854 (MainThread) CHIP_ERROR [chip.native.DL] BLE shutdown
2024-09-27 19:15:30.854 (MainThread) CHIP_ERROR [chip.native.DL] System Layer shutdown
2024-09-27 19:15:30.855 (MainThread) INFO [aiorun] Waiting for executor shutdown.
2024-09-27 19:15:30.855 (MainThread) INFO [aiorun] Shutting down async generators
2024-09-27 19:15:30.855 (MainThread) INFO [aiorun] Closing the loop.
2024-09-27 19:15:30.855 (MainThread) INFO [aiorun] Leaving. Bye!
[16:15:31] INFO: matter-server service exited with code 0 (by signal 0).
s6-rc: info: service matter-server successfully stopped
s6-rc: info: service banner: stopping
s6-rc: info: service banner successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service banner: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started 
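
The log above is long, and its repeating pattern is easier to see once it is summarized per node. A minimal sketch, assuming the line layout visible in the log (`timestamp (thread) LEVEL [logger] message`); `summarize` and the regular expressions are illustrative names, not part of the Matter Server:

```python
import re
from collections import Counter

# Layout assumed from the log lines: "<date> <time> (<thread>) <LEVEL> [<logger>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\((?P<thread>[^)]+)\) (?P<level>\w+) \[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)
NODE_RE = re.compile(r"<Node:(\d+)>")


def summarize(log_text):
    """Count setup failures per node, nodes given up on, and CHIP_ERROR lines per logger."""
    failures = Counter()
    gave_up = set()
    chip_errors = Counter()
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # s6-rc and other supervisor lines do not follow the layout
        msg = match["msg"]
        if match["level"] == "CHIP_ERROR":
            chip_errors[match["logger"]] += 1
        node = NODE_RE.search(msg)
        if node and "Setup for node failed" in msg:
            failures[int(node[1])] += 1
        if node and "giving up" in msg:
            gave_up.add(int(node[1]))
    return failures, gave_up, chip_errors


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
2024-09-27 18:45:23.435 (MainThread) WARNING [matter_server.server.device_controller] <Node:2> Setup for node failed: Unable to establish CASE session with Node 2
2024-09-27 18:45:27.963 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:217964488 on exchange 28264i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-09-27 18:45:34.630 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-09-27 18:53:24.820 (MainThread) WARNING [matter_server.server.device_controller] <Node:3> Node setup not completed after 30 minutes, giving up.
s6-rc: info: service matter-server: stopping
"""

if __name__ == "__main__":
    failures, gave_up, chip_errors = summarize(SAMPLE)
    print("setup failures per node:", dict(failures))
    print("nodes given up on:", sorted(gave_up))
    print("CHIP_ERROR lines per logger:", dict(chip_errors))
```

Run against the full log, this kind of summary shows at a glance which nodes never complete CASE session setup and whether the errors come mainly from the exchange manager (`chip.native.EM`) or the secure channel (`chip.native.SC`).
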
jvmahon commented 1 day ago

I've had this happen, but I concluded that the issue wasn't HA (or at least that the issue also involved other equipment). I found that to bring devices back online, I needed to reboot my Google Wifi Pro 6e WiFi routers (which also include my OTBRs).

Also, I have both Nest OTBRs and 3 Apple TV OTBRs and have found that if I leave the Nest enabled (and unplug the Apple TVs), all seems OK and stable, but if I add more than 1 Apple OTBR, it can cause instability.

I'm thinking there may be something going on when you have a mix of OTBRs from different vendors. In my case, it particularly seems to happen when Apple OTBRs and Nest OTBRs try to join a single Thread network. As long as Apple and Google Nest maintain separate Thread networks, it's more stable. None of this really makes much sense, but it points to issues that may be beyond HA. Also, the entire setup destabilizes if I use Matter 1.0 devices (hello Eve!).

AndreasMouskos commented 1 day ago

> I've had this happen, but I concluded that the issue wasn't HA (or at least that the issue also involved other equipment). I found that to bring devices back online, I needed to reboot my Google Wifi Pro 6e WiFi routers (which also include my OTBRs).
>
> Also, I have both Nest OTBRs and 3 Apple TV OTBRs and have found that if I leave the Nest enabled (and unplug the Apple TVs), all seems OK and stable, but if I add more than 1 Apple OTBR, it can cause instability.
>
> I'm thinking there may be something going on when you have a mix of OTBRs from different vendors. In my case, it particularly seems to happen when Apple OTBRs and Nest OTBRs try to join a single Thread network. As long as Apple and Google Nest maintain separate Thread networks, it's more stable. None of this really makes much sense, but it points to issues that may be beyond HA. Also, the entire setup destabilizes if I use Matter 1.0 devices (hello Eve!).

Maybe your case is different because, as I mentioned, everything was fine for 1 year until I upgraded to HomePod OS 18 and iOS 18. The devices work in all my other apps except Home Assistant. I am also no longer able to re-add them; it keeps failing.