home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
72.73k stars 30.46k forks source link

Matter devices connected over Thread regularly become unavailable #127759

Open richardvinger opened 2 weeks ago

richardvinger commented 2 weeks ago

The problem

I have 10 Matter devices in HA that are connected over Thread, and they regularly become Unavailable. After some time, practically speaking often the next day, they might be available again, while other devices become Unavailable.

For example, I have 8x Eve Motionblinds Motor CM-08-E, and at the time of writing, 3 of them are unavailable. They might become available again tomorrow, while some of the others might go Unavailable.

Currently, my Nuki Smart Lock Pro is also Unavailable, which also regularly happens.

It is worth noting that the Nuki Smart Lock Pro otherwise works as expected, i.e. while using the physical controls on the device or using their dedicated app that connects over Bluetooth. The aforementioned motionblinds also open and close as expected when using their remote or when pulling the manual chain on each blind.

I also noticed that the Motionblinds are reachable over Ping, even when HA says they're Unavailable, indicating that they're still connected to the network.

The only Matter over Thread device I have that never appears to be Unavailable is the Aqara Door and Window Sensor, but it is unclear to me if that's a coincidence.

All devices connect to the network using Thread through the six border routers in the house, all Google devices.

Rebooting HA doesn't appear to help, e.g. after a reboot, the same devices that were Unavailable pre-reboot are still Unavailable post-reboot. It is unclear to me what triggers the devices to become Unavailable or available again.

What version of Home Assistant Core has the issue?

core-2024.10.1

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Matter (BETA)

Link to integration documentation on our website

https://www.home-assistant.io/integrations/matter/

Diagnostics information

config_entry-matter-87b01261c5c4665a59877b07ae8fa3a1.json config_entry-thread-1e3d9227b84cff78d91c311e5cd9fedd.json

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

home-assistant[bot] commented 2 weeks ago

Hey there @home-assistant/matter, mind taking a look at this issue as it has been labeled with an integration (matter) you are listed as a code owner for? Thanks!

Code owner commands Code owners of `matter` can trigger bot actions by commenting: - `@home-assistant close` Closes the issue. - `@home-assistant rename Awesome new title` Renames the issue. - `@home-assistant reopen` Reopen the issue. - `@home-assistant unassign matter` Removes the current integration label and assignees on the issue, add the integration domain after the command. - `@home-assistant add-label needs-more-information` Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue. - `@home-assistant remove-label needs-more-information` Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


matter documentation matter source (message by IssueLinks)

Schubacker commented 2 weeks ago

I have similar behavior with my Eve devices. They are connected to Homekit as the primary instance. When I turn off my Home Assistant, the devices are immediately available in HomeKit, as soon as I turn the Home Assistant back on, the devices are not available with either HK or HA. My layman's impression suspects a problem with Thread/Matter, as the Hue Bridge is still controllable via HK. I've only had the system in operation for two days; strangely enough, everything worked perfectly the days before.

olingerc commented 2 weeks ago

I was just about to create an issue for the same symptom. I can add my situation here, but maybe it is unrealted.

I have 3 Nanoleaf LED strips that I connect using a skyconnect dongle. 2 of the 3 never have issues, the third one keep dropiing out but it comes back on its own. It also always comes back when I do "Reinterview" via the Matter Web UI. Here a 24h graph of the availability using the "update" entity image

What might be relevant is: the 2 working are directly connected to the dongle, while the not-working one uses one of the other 2 as bridge. Maybe the issue is with bridging? The working ones are called "TV" and "Dummy". The problematic one is called "Studio" and the bridging goes via "dummy". Distance between dummy and Studio: 5m.

Here a recent matter log right after doing "reinterview" the matter log

2024-10-07 07:54:02.859 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 07:54:02.860 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 07:54:02.863 (MainThread) INFO [matter_server.server.device_controller] <Node:5> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 11
2024-10-07 07:58:43.816 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 07:58:43.817 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 07:58:43.819 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 07:58:43.820 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 07:58:43.821 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 07:58:43.822 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 07:58:43.823 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 07:58:43.824 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 07:58:43.825 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 07:58:43.825 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 07:58:43.826 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 07:58:43.827 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 07:58:43.828 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 07:58:43.828 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 07:58:43.829 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 07:58:43.830 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:05:21.554 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:213965163 on exchange 10141i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 08:05:24.450 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 08:05:34.442 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:213965164 on exchange 10142i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 08:05:37.734 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 08:05:47.514 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:213965165 on exchange 10143i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 08:05:51.014 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 08:06:00.847 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:213965166 on exchange 10144i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 08:06:04.302 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 08:06:14.318 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:213965167 on exchange 10145i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 08:06:17.590 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 08:06:17.591 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 08:06:17.592 (MainThread) INFO [matter_server.server.device_controller] <Node:5> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 12
2024-10-07 08:06:17.593 (MainThread) INFO [matter_server.server.device_controller] <Node:5> Node considered offline, shutdown subscription
2024-10-07 08:12:47.774 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:12:47.776 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:12:47.777 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:12:47.778 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:12:47.779 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:12:47.780 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:12:47.781 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:12:47.782 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:12:47.783 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:12:47.784 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:12:47.784 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:12:47.785 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:12:47.786 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:12:47.786 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:12:47.787 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:12:47.787 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:26:51.561 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:26:51.564 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:26:51.565 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:26:51.566 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:26:51.567 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:26:51.568 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:26:51.569 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:26:51.570 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:26:51.571 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:26:51.571 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:26:51.572 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:26:51.572 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:26:51.573 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:26:51.573 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:26:51.574 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:26:51.575 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:40:55.470 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:40:55.471 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:40:55.472 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:40:55.473 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:40:55.473 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:40:55.474 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:40:55.475 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:40:55.475 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:40:55.476 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:40:55.477 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:40:55.478 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:40:55.478 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:40:55.479 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:40:55.479 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:40:55.480 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:40:55.480 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:43:15.862 (MainThread) INFO [matter_server.server.device_controller] Interviewing node: 5
2024-10-07 08:43:16.370 (MainThread) INFO [matter_server.server.device_controller.mdns] <Node:5> Discovered on mDNS
2024-10-07 08:43:16.371 (MainThread) INFO [matter_server.server.device_controller] <Node:5> Setting-up node...
2024-10-07 08:43:17.984 (MainThread) INFO [matter_server.server.device_controller] <Node:5> Setting up attributes and events subscription.
2024-10-07 08:43:19.855 (MainThread) INFO [matter_server.server.device_controller] <Node:5> Subscription succeeded with report interval [0, 60]
2024-10-07 08:54:59.388 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:54:59.390 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:54:59.391 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:54:59.392 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:54:59.393 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:54:59.394 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:54:59.395 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:54:59.396 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:54:59.397 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:54:59.398 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:54:59.400 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:54:59.401 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:54:59.401 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:54:59.402 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)
2024-10-07 08:54:59.403 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for SRV records)
2024-10-07 08:54:59.404 (Dummy-2) CHIP_ERROR [chip.native.DIS] DNSSD packet parsing failed (for non-srv records)

Fully updated Home Assistant OS

Core 2024.10.1
Supervisor 2024.09.1
Operating System 13.1
Frontend 20241002.2
richardvinger commented 2 weeks ago

It also always comes back when I do "Reinterview" via the Matter Web UI.

It looks like I cannot do "Reinterview" on Matter devices that are Unavailable, so perhaps our issues are still somewhat distinct. If I'm mistaken, I'd love to hear how to reinterview in this scenario.

agners commented 2 weeks ago

I also noticed that the Motionblinds are reachable over Ping, even when HA says they're Unavailable, indicating that they're still connected to the network.

This is a pattern which is very similar to what folks in #123835 describe, but mostly with Apple BR.

The problem appears to be related to when RF links are not strong enough. In theory, this should be alleviated by TREL, but this seems not to work (correctly) in practice of some sort.

How far are the MotionBlinds apart? Can you check how the Thread network behaves if you scale it down such that you are quite certain that the devices can reach each other (and then from there back up?)

Do you have the devices also joined to Google Home? It would be interesting how device availability behaves on Google Home (compared to Home Assistant).

agners commented 2 weeks ago

It looks like I cannot do "Reinterview" on Matter devices that are Unavailable, so perhaps our issues are still somewhat distinct. If I'm mistaken, I'd love to hear how to reinterview in this scenario.

Can you share the Matter Server add-on logs? I expect lots of Failed to Send CHIP MessageCounter:xx on exchange xyi with Node: <0000000000000000, 0> sendCount: 4 max retries: 4. Which essentially means that Matter tried to build a Matter CASE session but there was no response from the device.

This matches with the behavior other folks are seeing. The Matter CASE session cannot be built up (even though the device is seemingly on the network). Without a working CASE session, no commands, attribute reads or subscription work. So the device appears from a Matter perspective completely non-responsive.

That is until a different Thread router is being picked. This is what changes over time: Your host OS picks a different Thread border router.

richardvinger commented 2 weeks ago

Can you share the Matter Server add-on logs?

Follows:

2024-10-07 12:50:26.028 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Setting-up node...
2024-10-07 12:50:26.974 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Setting up attributes and events subscription.
2024-10-07 12:50:45.060 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription succeeded with report interval [0, 600]
2024-10-07 12:52:21.828 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:201213362 on exchange 30289r with Node: <000000000000000C, 1> sendCount: 4 max retries: 4
2024-10-07 13:02:12.158 (Dummy-2) CHIP_ERROR [chip.native.DMG] Subscription Liveness timeout with SubscriptionID = 0x00e963d8, Peer = 01:000000000000000C
2024-10-07 13:02:12.170 (MainThread) INFO [matter_server.server.device_controller] <Node:12> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 0
2024-10-07 13:02:27.291 (MainThread) INFO [matter_server.server.device_controller] <Node:12> Re-Subscription succeeded
2024-10-07 13:15:30.635 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:216967164 on exchange 29476r with Node: <0000000000000007, 1> sendCount: 4 max retries: 4
2024-10-07 13:25:20.821 (Dummy-2) CHIP_ERROR [chip.native.DMG] Subscription Liveness timeout with SubscriptionID = 0xa546b4ee, Peer = 01:0000000000000007
2024-10-07 13:25:20.828 (MainThread) INFO [matter_server.server.device_controller] <Node:7> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 0
2024-10-07 13:25:41.127 (MainThread) INFO [matter_server.server.device_controller] <Node:7> Re-Subscription succeeded
2024-10-07 15:35:16.153 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:137338153 on exchange 50820r with Node: <000000000000000F, 1> sendCount: 4 max retries: 4
2024-10-07 15:45:05.807 (Dummy-2) CHIP_ERROR [chip.native.DMG] Subscription Liveness timeout with SubscriptionID = 0xa73461b7, Peer = 01:000000000000000F
2024-10-07 15:45:05.812 (MainThread) INFO [matter_server.server.device_controller] <Node:15> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 0
2024-10-07 15:45:35.052 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16636i with Node: <000000000000000F, 1>
2024-10-07 15:45:35.055 (MainThread) INFO [matter_server.server.device_controller] <Node:15> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 1
2024-10-07 15:46:16.823 (MainThread) INFO [matter_server.server.device_controller] <Node:15> Re-Subscription succeeded
2024-10-07 17:50:51.792 (Dummy-2) CHIP_ERROR [chip.native.DMG] Subscription Liveness timeout with SubscriptionID = 0x3f69f4df, Peer = 01:000000000000000D
2024-10-07 17:50:51.801 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 0
2024-10-07 17:51:15.056 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878069 on exchange 16639i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 17:51:19.381 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 17:51:43.176 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878070 on exchange 16640i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 17:51:46.765 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 17:51:46.767 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 17:51:46.770 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 1
2024-10-07 17:52:15.493 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878071 on exchange 16641i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 17:52:19.953 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 17:52:42.965 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878072 on exchange 16642i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 17:52:47.338 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 17:52:47.340 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 17:52:47.343 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 2
2024-10-07 17:52:47.343 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Marked node as unavailable
2024-10-07 17:53:13.343 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878073 on exchange 16643i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 17:53:18.894 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 17:53:41.120 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878074 on exchange 16644i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 17:53:46.280 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 17:53:46.281 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 17:53:46.283 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 3
2024-10-07 17:54:14.443 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878075 on exchange 16645i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 17:54:19.896 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 17:54:42.338 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878076 on exchange 16646i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 17:54:47.281 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 17:54:47.282 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 17:54:47.285 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 4
2024-10-07 17:55:38.051 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:241291063 on exchange 16648i with Node: <000000000000000D, 1> sendCount: 4 max retries: 4
2024-10-07 17:55:47.407 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16648i with Node: <000000000000000D, 1>
2024-10-07 17:55:47.410 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 5
2024-10-07 17:56:34.800 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:12523156 on exchange 16650i with Node: <000000000000000D, 1> sendCount: 4 max retries: 4
2024-10-07 17:56:39.985 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16650i with Node: <000000000000000D, 1>
2024-10-07 17:56:39.988 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 6
2024-10-07 17:57:46.635 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Re-Subscription succeeded
2024-10-07 18:58:38.339 (MainThread) INFO [matter_server.server.device_controller] <Node:4> No new update found.
2024-10-07 18:58:38.351 (MainThread) INFO [matter_server.server.device_controller] <Node:13> No new update found.
2024-10-07 18:58:38.354 (MainThread) INFO [matter_server.server.device_controller] <Node:10> No new update found.
2024-10-07 18:58:38.356 (MainThread) INFO [matter_server.server.device_controller] <Node:12> No new update found.
2024-10-07 18:58:38.360 (MainThread) INFO [matter_server.server.device_controller] <Node:9> No new update found.
2024-10-07 18:58:38.361 (MainThread) INFO [matter_server.server.device_controller] <Node:15> No new update found.
2024-10-07 18:58:38.363 (MainThread) INFO [matter_server.server.device_controller] <Node:7> No new update found.
2024-10-07 18:58:38.364 (MainThread) INFO [matter_server.server.device_controller] <Node:8> No new update found.
2024-10-07 18:58:38.381 (MainThread) INFO [matter_server.server.device_controller] <Node:2> No new update found.
2024-10-07 19:57:51.796 (Dummy-2) CHIP_ERROR [chip.native.DMG] Subscription Liveness timeout with SubscriptionID = 0xf79baa7d, Peer = 01:000000000000000D
2024-10-07 19:57:51.801 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 0
2024-10-07 19:58:13.968 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878083 on exchange 16653i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 19:58:19.374 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 19:58:42.249 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878084 on exchange 16654i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 19:58:46.759 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 19:58:46.760 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 19:58:46.763 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 1
2024-10-07 19:59:17.117 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878085 on exchange 16655i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 19:59:22.621 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 19:59:45.341 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878086 on exchange 16656i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 19:59:50.006 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 19:59:50.007 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 19:59:50.009 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 2
2024-10-07 19:59:50.010 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Marked node as unavailable
2024-10-07 20:00:16.575 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878087 on exchange 16657i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 20:00:22.156 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 20:00:44.515 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:193878088 on exchange 16658i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 20:00:49.537 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 4
2024-10-07 20:00:49.538 (Dummy-2) CHIP_ERROR [chip.native.DMG] Failed to establish CASE for re-subscription with error 'src/protocols/secure_channel/CASESession.cpp:560: CHIP Error 0x00000032: Timeout'
2024-10-07 20:00:49.541 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 3
2024-10-07 20:01:48.875 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16660i with Node: <000000000000000D, 1>
2024-10-07 20:01:48.877 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 4
2024-10-07 20:02:44.911 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16662i with Node: <000000000000000D, 1>
2024-10-07 20:02:44.914 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 5
2024-10-07 20:03:36.839 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16664i with Node: <000000000000000D, 1>
2024-10-07 20:03:36.842 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 6
2024-10-07 20:04:52.206 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16666i with Node: <000000000000000D, 1>
2024-10-07 20:04:52.207 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 7
2024-10-07 20:06:42.717 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16668i with Node: <000000000000000D, 1>
2024-10-07 20:06:42.720 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 8
2024-10-07 20:09:49.684 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16670i with Node: <000000000000000D, 1>
2024-10-07 20:09:49.687 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 9
2024-10-07 20:13:11.683 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16672i with Node: <000000000000000D, 1>
2024-10-07 20:13:11.685 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 10
2024-10-07 20:21:59.431 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16674i with Node: <000000000000000D, 1>
2024-10-07 20:21:59.434 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 11
2024-10-07 20:28:14.935 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:153270039 on exchange 16676i with Node: <000000000000000D, 1> sendCount: 4 max retries: 4
2024-10-07 20:28:25.629 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 16676i with Node: <000000000000000D, 1>
2024-10-07 20:28:25.632 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Subscription failed with CHIP Error 0x00000032: Timeout, resubscription attempt 12
2024-10-07 20:28:25.632 (MainThread) INFO [matter_server.server.device_controller] <Node:13> Node considered offline, shutdown subscription
richardvinger commented 2 weeks ago

How far are the MotionBlinds apart? Can you check how the Thread network behaves if you scale it down such that you are quite certain that the devices can reach each other (and then from there back up?)

7 of the motionblinds are in the kitchen, all adjacent to each other, hardly 10 centimeters between each.

As a test, I put my phone next to the motionblind that is farthest from the border router and did a speedtest on 5 GHz Wi-Fi, which resulted in 537 mbit down and 387 mbit up. It feels safe to assume that Thread coverage (on 2.4 Ghz) should be excellent in the same area.

The Nuki device is 2 meters away from the border router, and it also regularly appears as Unavailable in HA.

Do you have the devices also joined to Google Home? It would be interesting how device availability behaves on Google Home (compared to Home Assistant).

I have not, but will try to add some to see if there's a difference.

richardvinger commented 2 weeks ago

Do you have the devices also joined to Google Home? It would be interesting how device availability behaves on Google Home (compared to Home Assistant).

I just added 5 of the 8 Motionblinds to Google Home. Coincidentally, one of them became unavailable a short while later in HA.

While unavailable in HA, Google Home had no problem controlling the device, e.g. opening or closing the blind.

richardvinger commented 2 weeks ago

That is until a different Thread router is being picked. This is what changes over time: Your host OS picks a different Thread border router.

I took this to assume that rebooting the entire Nest Wifi Pro network would make all Matter nodes available again in HA (thinking that rebooting will force the host OS to pick a different Thread border router since they all go offline temporarily), but after that reboot, the number of unavailable devices in HA only increased.

Currently, 6 out of 10 Matter devices are marked as unavailable.

Sensibly, the reboot led to multiple Timeout waiting for mDNS resolution messages in the Matter add-on log file.

richardvinger commented 2 weeks ago

FWIW, after upgrading the Matter add-on to 6.6.0, all Matter devices became Unavailable in HA.

This is still the case 5 minutes after the add-on started again.

-----------------------------------------------------------
 Add-on: Matter Server
 Matter WebSocket Server for Home Assistant Matter support.
-----------------------------------------------------------
 Add-on version: 6.6.0
 You are running the latest version of this add-on.
 System: Home Assistant OS 13.1  (amd64 / generic-x86-64)
 Home Assistant Core: 2024.10.1
 Home Assistant Supervisor: 2024.10.0
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service matter-server: starting
s6-rc: info: service matter-server successfully started
s6-rc: info: service legacy-services: starting
[22:03:11] INFO: Starting Matter Server...
s6-rc: info: service legacy-services successfully started
[22:03:12] INFO: Using 'enp2s0' as primary network interface.
[22:03:12] INFO: Successfully send discovery information to Home Assistant.
2024-10-07 22:03:15.150 (MainThread) INFO [matter_server.server.stack] Initializing CHIP/Matter Logging...
2024-10-07 22:03:15.150 (MainThread) INFO [matter_server.server.stack] Initializing CHIP/Matter Controller Stack...
[1728331395.238248][126:126] CHIP:CTL: Setting attestation nonce to random value
[1728331395.238627][126:126] CHIP:CTL: Setting CSR nonce to random value
[1728331395.241955][126:126] CHIP:DL: ChipLinuxStorage::Init: Using KVS config file: /tmp/chip_kvs
[1728331395.242426][126:126] CHIP:DL: writing settings to file (/tmp/chip_kvs-CjlQeK)
[1728331395.242516][126:126] CHIP:DL: renamed tmp file to file (/tmp/chip_kvs)
[1728331395.242813][126:126] CHIP:DL: ChipLinuxStorage::Init: Using KVS config file: /data/chip_factory.ini
[1728331395.243621][126:126] CHIP:DL: ChipLinuxStorage::Init: Using KVS config file: /data/chip_config.ini
[1728331395.243869][126:126] CHIP:DL: ChipLinuxStorage::Init: Using KVS config file: /data/chip_counters.ini
[1728331395.244308][126:126] CHIP:DL: writing settings to file (/data/chip_counters.ini-lt3DEJ)
[1728331395.244852][126:126] CHIP:DL: renamed tmp file to file (/data/chip_counters.ini)
[1728331395.245125][126:126] CHIP:DL: NVS set: chip-counters/reboot-count = 82 (0x52)
[1728331395.245923][126:126] CHIP:DL: Got Ethernet interface: enp2s0
[1728331395.246371][126:126] CHIP:DL: Found the primary Ethernet interface:enp2s0
[1728331395.246648][126:126] CHIP:DL: Got WiFi interface: wlp1s0
[1728331395.246700][126:126] CHIP:DL: Failed to reset WiFi statistic counts
2024-10-07 22:03:15.247 (MainThread) INFO [chip.storage] Initializing persistent storage from file: /data/chip.json
2024-10-07 22:03:15.247 (MainThread) INFO [chip.storage] Loading configuration from /data/chip.json...
2024-10-07 22:03:15.372 (MainThread) INFO [chip.CertificateAuthority] Loading certificate authorities from storage...
2024-10-07 22:03:15.373 (MainThread) INFO [chip.CertificateAuthority] New CertificateAuthority at index 1
2024-10-07 22:03:15.374 (MainThread) INFO [chip.CertificateAuthority] Loading fabric admins from storage...
2024-10-07 22:03:15.374 (MainThread) INFO [chip.FabricAdmin] New FabricAdmin: FabricId: 0x0000000000000002, VendorId = 0x134B
2024-10-07 22:03:15.375 (MainThread) INFO [matter_server.server.stack] CHIP Controller Stack initialized.
2024-10-07 22:03:15.375 (MainThread) INFO [matter_server.server.server] Starting the Matter Server...
2024-10-07 22:03:15.380 (MainThread) INFO [matter_server.server.helpers.paa_certificates] Fetching the latest PAA root certificates from DCL.
2024-10-07 22:03:17.370 (MainThread) INFO [matter_server.server.helpers.paa_certificates] Fetched 63 PAA root certificates from DCL.
2024-10-07 22:03:17.375 (MainThread) INFO [matter_server.server.helpers.paa_certificates] Fetching the latest PAA root certificates from Git.
2024-10-07 22:03:17.797 (MainThread) INFO [matter_server.server.helpers.paa_certificates] Fetched 2 PAA root certificates from Git.
2024-10-07 22:03:17.799 (MainThread) INFO [chip.FabricAdmin] Allocating new controller with CaIndex: 1, FabricId: 0x0000000000000002, NodeId: 0x000000000001B669, CatTags: []
2024-10-07 22:03:18.028 (Dummy-2) CHIP_ERROR [chip.native.DIS] Failed to advertise records: src/inet/UDPEndPointImplSockets.cpp:416: OS Error 0x02000065: Network is unreachable
2024-10-07 22:03:18.034 (Dummy-2) CHIP_ERROR [chip.native.DL] Long dispatch time: 232 ms, for event type 2
2024-10-07 22:03:18.047 (MainThread) INFO [matter_server.server.vendor_info] Loading vendor info from storage.
2024-10-07 22:03:18.054 (MainThread) INFO [matter_server.server.vendor_info] Loaded 246 vendors from storage.
2024-10-07 22:03:18.054 (MainThread) INFO [matter_server.server.vendor_info] Fetching the latest vendor info from DCL.
2024-10-07 22:03:18.218 (MainThread) INFO [matter_server.server.vendor_info] Fetched 247 vendors from DCL.
2024-10-07 22:03:18.219 (MainThread) INFO [matter_server.server.vendor_info] Saving vendor info to storage.
2024-10-07 22:03:18.279 (MainThread) INFO [matter_server.server.device_controller] Loaded 10 nodes from stored configuration
2024-10-07 22:03:18.294 (MainThread) INFO [matter_server.server.server] Matter Server successfully initialized.
2024-10-07 22:03:19.022 (MainThread) INFO [matter_server.server.device_controller.mdns] <Node:9> Discovered on mDNS
2024-10-07 22:03:19.022 (MainThread) INFO [matter_server.server.device_controller] <Node:9> Setting-up node...
2024-10-07 22:03:42.405 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:234229883 on exchange 8012i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 22:03:46.607 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-10-07 22:04:00.138 (MainThread) INFO [matter_server.server.device_controller] <Node:8> Node could not be discovered on the network, returning cached IP's
2024-10-07 22:04:00.143 (MainThread) INFO [matter_server.server.device_controller] <Node:8> Node could not be discovered on the network, returning cached IP's
2024-10-07 22:04:03.394 (MainThread) INFO [matter_server.server.device_controller.mdns] <Node:4> Discovered on mDNS
2024-10-07 22:04:03.395 (MainThread) INFO [matter_server.server.device_controller] <Node:4> Setting-up node...
2024-10-07 22:04:03.473 (MainThread) INFO [matter_server.server.device_controller] <Node:10> Node could not be discovered on the network, returning cached IP's
2024-10-07 22:04:03.481 (MainThread) INFO [matter_server.server.device_controller] <Node:10> Node could not be discovered on the network, returning cached IP's
2024-10-07 22:04:10.812 (MainThread) INFO [matter_server.server.device_controller] <Node:4> Setting up attributes and events subscription.
2024-10-07 22:04:13.990 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-10-07 22:04:16.993 (MainThread) INFO [matter_server.server.sdk] <Node:9> Attempting to establish CASE session... (attempt 2 of 2)
2024-10-07 22:04:24.876 (MainThread) INFO [matter_server.server.device_controller] <Node:9> Setting up attributes and events subscription.
2024-10-07 22:04:41.973 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:34315384 on exchange 8017i with Node: <0000000000000009, 1> sendCount: 4 max retries: 4
2024-10-07 22:04:52.496 (Dummy-2) CHIP_ERROR [chip.native.DMG] Time out! failed to receive report data from Exchange: 8017i with Node: <0000000000000009, 1>
2024-10-07 22:04:52.498 (MainThread) WARNING [matter_server.server.device_controller] <Node:9> Unable to subscribe to Node: src/app/ReadClient.cpp:682: CHIP Error 0x00000032: Timeout
2024-10-07 22:04:52.498 (MainThread) INFO [matter_server.server.device_controller] <Node:9> Retrying node setup in 60 seconds...
2024-10-07 22:04:56.880 (MainThread) INFO [matter_server.server.device_controller.mdns] <Node:8> Discovered on mDNS
2024-10-07 22:04:56.880 (MainThread) INFO [matter_server.server.device_controller] <Node:8> Setting-up node...
2024-10-07 22:05:04.354 (MainThread) WARNING [matter_server.server.device_controller] <Node:4> Unable to subscribe to Node: IM Error 0x00000580: General error: 0x80 (INVALID_ACTION)
2024-10-07 22:05:04.355 (MainThread) INFO [matter_server.server.device_controller] <Node:4> Retrying node setup in 60 seconds...
2024-10-07 22:05:16.146 (MainThread) INFO [matter_server.server.device_controller.mdns] <Node:7> Discovered on mDNS
2024-10-07 22:05:16.147 (MainThread) INFO [matter_server.server.device_controller] <Node:7> Setting-up node...
2024-10-07 22:05:20.225 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:234229892 on exchange 8018i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 22:05:24.465 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-10-07 22:05:38.806 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:234229893 on exchange 8019i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 22:05:43.732 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
2024-10-07 22:05:46.825 (Dummy-2) CHIP_ERROR [chip.native.EM] Failed to Send CHIP MessageCounter:234229894 on exchange 8020i with Node: <0000000000000000, 0> sendCount: 4 max retries: 4
2024-10-07 22:05:51.849 (Dummy-2) CHIP_ERROR [chip.native.SC] CASESession timed out while waiting for a response from the peer. Current state was 1
agners commented 2 weeks ago

What hardware/board are you using to run Home Assistant on? What Home Assistant OS version are you using?

If you have HAOS shell access (using login on the system console) you can list the routes using

ip -6 route

There should be one entry with 6 border routers. What you can try is fetch the IPv6 address of one of the non working (but still pingable) devices, and remote that BR from the list using:

ip route get <ip>
ip route del <network> <via from get command>
richardvinger commented 2 weeks ago

What hardware/board are you using to run Home Assistant on? What Home Assistant OS version are you using?

It's a Minisforum PC with a Celeron CPU (so x64), running the latest HA OS (13.1) and Core (2024.10.1).

If you have HAOS shell access

I don't, AFAIK. I guess the easiest way to get there is by connecting the host computer to a physical screen and keyboard?

richardvinger commented 1 week ago

FWIW, Google Home too is now reporting some of the motionblinds to be offline, while HA sees some of them as online. It looks then like both are hitting the same issue, just not at the same time. Not sure how to advance to get this right. I'm not even sure if this should be considered an HA issue anymore.

Leo2442926161 commented 1 week ago

image

I have 7 Heiman thread devices with Apple board router, it didn't become unavailable.

my core of HA is 2024.10.0b1

tcarmelveilleux commented 1 week ago

I would recommend that there be a diagnostics mode in HA that tries to do pings of 1024 bytes (instead of 32 bytes) for Thread. This would allow seeing the actual latency when large amounts of fragmentation are included.

SolimanGit commented 1 day ago

Well, here to tell my story as well x) Same here thread devices becoming unavailable randomly, when using but I can see a pattern where every night, it becomes unavailable.

Got Yellow with on board thread chip and additionnal SonOff chip for zigbee. Got 5 thread devices. Everything is up to date and i was hoping to have a good experience to be fair since thread and matter are at the center of Yellow.

Screen is matter server during the black out

image