Restart the add-on and it will very likely start up OK. It's a known problem where mDNS eats 100% of one CPU; it has always been in the code, but the root cause has still not been found.
Yes, restarting the add-on definitely solves the issue. Thanks.
As expected, restarting the add-on is only a temporary fix.
It's kind of strange: it seems to happen every time I have other add-ons being updated. I don't know if that makes sense.
I just wanted to provide an update that I'm still seeing this issue from time to time. Strangely, it seems that every time I start or restart other add-ons, this add-on suddenly shows elevated CPU usage.
For example, today I was reconfiguring the Piper add-on (changing the voice settings). I stopped and restarted Piper and noticed my CPU was still elevated even after several minutes, and it wasn't because of Piper; it was the Silicon Labs Multiprotocol add-on. I had to restart the Silicon Labs Multiprotocol add-on and the CPU calmed down again.
Again, there just seems to be a bizarre link with other add-ons that causes this add-on to have elevated CPU usage.
Why would restarting other add-ons affect the Silicon Labs Multiprotocol add-on?
Just wanted to mention another instance of this happening: there were updates for four of my other add-ons (i.e. Z-Wave, AdGuard, SQLite, etc.), and again I noticed that after those updates were done my CPU was elevated, and it was the Silicon Labs Multiprotocol add-on that was elevated. After a restart of the add-on, it went back down.
Does anybody know why this is happening? Is there some communication that breaks during updates that the multiprotocol add-on does not recover from?
When using the CLI on the Linux OS, it's the mDNS in the OS that is having problems, as if something is locked; every process that uses it can trigger the issue, and since the add-on looks to be using it the most, it is the one most likely to run into the problem.
That's an interesting point. One idea might be:
"maybe the Supervisor is doing some type of USB scan which is causing the adapter to malfunction"
I do have logs, but I suspect these logs were created because I restarted the Silicon Labs add-on. Next time I will keep an eye on the logs before restarting that add-on, to confirm.
Logger: bellows.uart
Source: runner.py:179
First occurred: August 4, 2023 at 6:45:10 PM (4 occurrences)
Last logged: 11:06:20 AM
Lost serial connection: IndexError('index out of range')
Lost serial connection: ConnectionResetError('Remote server closed connection')
Logger: bellows.ezsp
Source: runner.py:179
First occurred: August 4, 2023 at 6:45:10 PM (3 occurrences)
Last logged: 11:06:20 AM
NCP entered failed state. Requesting APP controller restart
Logger: bellows.zigbee.application
Source: /usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py:643
First occurred: August 6, 2023 at 9:08:21 AM (10 occurrences)
Last logged: 11:06:42 AM
ControllerApplication reset unsuccessful: ConnectionRefusedError(111, "Connect call failed ('172.30.32.1', 9999)")
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 640, in _reset_controller_loop
await self._reset_controller()
File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 661, in _reset_controller
await self.connect()
File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 133, in connect
self._ezsp = await bellows.ezsp.EZSP.initialize(self.config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 164, in initialize
await ezsp.connect(use_thread=zigpy_config[conf.CONF_USE_THREAD])
File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 181, in connect
self._gw = await bellows.uart.connect(self._config, self, use_thread=use_thread)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/bellows/uart.py", line 414, in connect
protocol, _ = await _connect(config, application)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/bellows/uart.py", line 385, in _connect
transport, protocol = await zigpy.serial.create_serial_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/zigpy/serial.py", line 31, in create_serial_connection
transport, protocol = await loop.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1085, in create_connection
raise exceptions[0]
File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1069, in create_connection
sock = await self._connect_sock(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/base_events.py", line 973, in _connect_sock
await self.sock_connect(sock, address)
File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 628, in sock_connect
return await fut
^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 668, in _sock_connect_cb
raise OSError(err, f'Connect call failed {address}')
ConnectionRefusedError: [Errno 111] Connect call failed ('172.30.32.1', 9999)
Logger: bellows.ezsp.protocol
Source: runner.py:179
First occurred: August 6, 2023 at 9:08:49 AM (2 occurrences)
Last logged: 11:06:48 AM
Couldn't set EzspConfigId.CONFIG_PACKET_BUFFER_COUNT=255 configuration value: EzspStatus.ERROR_INVALID_ID
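For what it's worth, my reading of the traceback above (an interpretation, not something stated in the logs) is that ZHA could not reach the zigbeed socket the add-on exposes at 172.30.32.1:9999. A quick way to check whether that socket is accepting connections, assuming a netcat binary is available in the shell you are using, would be:
# succeeds only if something is listening on the zigbeed port
nc -z -v 172.30.32.1 9999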
I can see you have updated the add-on; the last error is new with the latest version, but it's not critical. The latest Zigbee stack has more fixes for communication and for resetting the RCP, which can help the communication with the RCP work better. I don't think it's directly connected to USB problems, because I was having it on a NanoPi Neo where I wasn't using USB, only internal COM ports.
You're right. I found this upcoming update for the last error:
https://github.com/home-assistant/core/issues/97967
Hopefully this will solve the issue. Thanks.
@MattWestb @Anto79-ops I have a build of the add-on with a new mDNSResponder version (1790.80.10). The mDNSResponder needed quite some patching, since the new upstream version of mDNSResponder was buggy, so I am not 100% confident. Also, the bump to this new version hasn't been merged in upstream OpenThread yet (see https://github.com/openthread/ot-br-posix/pull/1919).
However, I have had it running stably in my production setup for a couple of days here.
You can try this "preview" with access to the OS shell (on port 22222, or the keyboard/screen terminal) using:
docker pull agners/aarch64-addon-silabs-multiprotocol:2.3.2
docker tag agners/aarch64-addon-silabs-multiprotocol:2.3.2 homeassistant/aarch64-addon-silabs-multiprotocol:2.3.1
Then stop and start the add-on (do not use restart, as that doesn't recreate the container).
Note: this is a hack; essentially it overwrites the current version with a different one :blush: The image will get overwritten the next time the add-on receives an upgrade, but it works for testing.
Would be interesting to see if this resolves the mDNS CPU spikes for you.
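If it helps, here is a minimal sketch of the verification and stop/start steps from the OS shell. It assumes the ha CLI is available there and that the add-on slug is core_silabs_multiprotocol; both are assumptions on my part, not something stated above.
# check that the re-tagged image is now in place
docker images | grep addon-silabs-multiprotocol
# stop and start (not restart) so the container gets recreated from the new image
ha addons stop core_silabs_multiprotocol
ha addons start core_silabs_multiprotocol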
Hi @agners
Thank you very much for this opportunity to try out the patch.
I'd be happy to install it.
Just a quick question about the installation: I have root access both via SSH on port 2222 and via the keyboard/screen terminal add-on.
Do I simply type the two lines in whatever directory I'm in when I log in, i.e. pwd is /root?
Thanks
Do I simply type the two lines in whatever directory I'm in when I log in, i.e. pwd is /root?
Yes, it doesn't matter in which directory.
Thanks!
The add-on still says it's 2.3.1, but I suspect that is expected; just wanted to confirm.
Here is the CLI output for the process.
I stopped and started the add-on.
The 64-bit one is only running zigbeed, but the logs look like this:
[18:22:02] INFO: The otbr-agent is disabled.
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
cont-init: info: running /etc/cont-init.d/check-cpcd-shm.sh
cont-init: info: /etc/cont-init.d/check-cpcd-shm.sh exited 0
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service banner: starting
-----------------------------------------------------------
Add-on: Silicon Labs Multiprotocol
Zigbee and OpenThread multiprotocol add-on
-----------------------------------------------------------
Add-on version: 2.3.1
You are running the latest version of this add-on.
System: Armbian 23.08.0-trunk jammy (aarch64 / raspberrypi3-64)
Home Assistant Core: 2023.8.0
Home Assistant Supervisor: 2023.07.1
-----------------------------------------------------------
Please, share the above information when looking for help
or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[18:22:11] INFO: Flashing firmware is disabled
s6-rc: info: service universal-silabs-flasher successfully started
s6-rc: info: service cpcd-config: starting
[18:22:15] INFO: Generating cpcd configuration.
s6-rc: info: service cpcd-config successfully started
s6-rc: info: service cpcd: starting
[18:22:17] INFO: Starting cpcd...
WARNING in function 'main' in file /usr/src/cpc-daemon/main.c at line #186 : Running CPCd as 'root' is not recommended. Proceed at your own risk.
s6-rc: info: service cpcd successfully started
s6-rc: info: service zigbeed: starting
s6-rc: info: service zigbeed successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
[18:22:18] INFO: Starting zigbeed...
[18:22:17:286153] Info : [CPCd v4.3.1.0] [Library API v3] [RCP Protocol v4]
[18:22:17:286342] Info : Git commit: 133b29678b3d0bc7578e098d2f46b4d5bcd2ebb4 / branch:
[18:22:17:286349] Info : Sources hash: ff8300587e7e4ab1def7a89a272c0baef32f9eb3bff9b0ba06b94e655d652367
[18:22:17:286379] WARNING : In function 'main' in file /usr/src/cpc-daemon/main.c at line #186 : Running CPCd as 'root' is not recommended. Proceed at your own risk.
[18:22:17:286431] Info : Reading cli arguments
[18:22:17:286443] Info : /usr/local/bin/cpcd
[18:22:17:287730] Info : Reading configuration
[18:22:17:287752] Info : file_path = /usr/local/etc/cpcd.conf
[18:22:17:287757] Info : instance_name = cpcd_0
[18:22:17:287763] Info : socket_folder = /dev/shm
[18:22:17:287768] Info : operation_mode = MODE_NORMAL
[18:22:17:287771] Info : use_encryption = false
[18:22:17:287776] Info : binding_key_file = /etc/binding-key.key
[18:22:17:287781] Info : stdout_tracing = false
[18:22:17:287786] Info : file_tracing = false
[18:22:17:287791] Info : lttng_tracing = false
[18:22:17:287796] Info : enable_frame_trace = false
[18:22:17:287800] Info : traces_folder = /dev/shm/cpcd-traces
[18:22:17:287806] Info : bus = UART
[18:22:17:287811] Info : uart_baudrate = 230400
[18:22:17:287816] Info : uart_hardflow = true
[18:22:17:287821] Info : uart_file = /dev/ttyUSB0
[18:22:17:287826] Info : fu_recovery_pins_enabled = false
[18:22:17:287831] Info : fu_connect_to_bootloader = false
[18:22:17:287836] Info : fu_enter_bootloader = false
[18:22:17:287841] Info : restart_cpcd = false
[18:22:17:287846] Info : application_version_validation = false
[18:22:17:287850] Info : print_secondary_versions_and_exit = false
[18:22:17:287855] Info : use_noop_keep_alive = false
[18:22:17:287860] Info : reset_sequence = true
[18:22:17:287864] Info : stats_interval = 0
[18:22:17:287869] Info : rlimit_nofile = 2000
[18:22:17:287873] Info : ENCRYPTION IS DISABLED
[18:22:17:287878] Info : Starting daemon in normal mode
[18:22:17:300198] Info : Connecting to Secondary...
[18:22:17:383456] Info : RX capability is 256 bytes
[18:22:17:383488] Info : Connected to Secondary
[18:22:17:388492] Info : Secondary Protocol v4
[18:22:17:398550] Info : Secondary CPC v4.3.1
[18:22:17:403432] Info : Secondary bus bitrate is 230400
[18:22:17:413512] Info : Secondary APP vUNDEFINED
[18:22:17:413777] Info : Daemon startup was successful. Waiting for client connections
[18:22:19:159791] Info : New client connection using library v4.3.1.0
[18:22:19:165697] Info : Opened connection socket for ep#12
[18:22:19:165935] Info : Endpoint socket #12: Client connected. 1 connections
Listening on port 9999 for connection...
Accepting connection.
Accepted connection 7.
I shall try updating the two 32-bit ones as well, but I must find the URL for doing it, so I need to look in Docker for it.
Edit: I did a restart first, so it was not OK; I have now done a stop and start and updated the log above.
I was trying:
root@rk322x-box:~# docker pull agners/armv7-addon-silabs-multiprotocol:2.3.2
Error response from daemon: manifest for agners/armv7-addon-silabs-multiprotocol:2.3.2 not found: manifest unknown: manifest unknown
So either I have done it wrong, or the armv7 image is not built in the test repo.
The add-on still says it's 2.3.1, but I suspect that is expected; just wanted to confirm.
Yes, that is expected. The add-on itself reads the version from the container image tag.
Awesome. By the way, there was an update for Z-Wave JS UI today (another add-on that I have), and I did not experience elevated CPU after updating that add-on.
So far so good 👍
So either I have done it wrong, or the armv7 image is not built in the test repo.
The armv7 image was missing. I just built and pushed it.
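For the 32-bit boards the same two-step pull/re-tag should then apply; the exact target tag below (homeassistant/armv7-addon-silabs-multiprotocol:2.3.1) is my assumption by analogy with the aarch64 commands above, so treat this as a sketch rather than confirmed instructions:
docker pull agners/armv7-addon-silabs-multiprotocol:2.3.2
# assumed target tag, mirroring the aarch64 example
docker tag agners/armv7-addon-silabs-multiprotocol:2.3.2 homeassistant/armv7-addon-silabs-multiprotocol:2.3.1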
The RK3228 box on Armbian Ubuntu (32-bit) is up and running and looks to be working OK:
[07:28:03] INFO: Starting cpcd...
WARNING in function 'main' in file /usr/src/cpc-daemon/main.c at line #186 : Running CPCd as 'root' is not recommended. Proceed at your own risk.
s6-rc: info: service cpcd successfully started
s6-rc: info: service zigbeed: starting
s6-rc: info: service otbr-agent: starting
s6-rc: info: service zigbeed successfully started
[07:28:03:151076] Info : [CPCd v4.3.1.0] [Library API v3] [RCP Protocol v4]
[07:28:03:151533] Info : Git commit: 133b29678b3d0bc7578e098d2f46b4d5bcd2ebb4 / branch:
[07:28:03:151555] Info : Sources hash: ff8300587e7e4ab1def7a89a272c0baef32f9eb3bff9b0ba06b94e655d652367
[07:28:03:151647] WARNING : In function 'main' in file /usr/src/cpc-daemon/main.c at line #186 : Running CPCd as 'root' is not recommended. Proceed at your own risk.
[07:28:03:152535] Info : Reading cli arguments
[07:28:03:152621] Info : /usr/local/bin/cpcd
[07:28:03:163435] Info : Reading configuration
[07:28:03:163494] Info : file_path = /usr/local/etc/cpcd.conf
[07:28:03:163510] Info : instance_name = cpcd_0
[07:28:03:163522] Info : socket_folder = /dev/shm
[07:28:03:163533] Info : operation_mode = MODE_NORMAL
[07:28:03:163543] Info : use_encryption = false
[07:28:03:163553] Info : binding_key_file = /etc/binding-key.key
[07:28:03:163564] Info : stdout_tracing = false
[07:28:03:163574] Info : file_tracing = false
[07:28:03:163584] Info : lttng_tracing = false
[07:28:03:163593] Info : enable_frame_trace = false
[07:28:03:163603] Info : traces_folder = /dev/shm/cpcd-traces
[07:28:03:163613] Info : bus = UART
[07:28:03:163623] Info : uart_baudrate = 230400
[07:28:03:163633] Info : uart_hardflow = true
[07:28:03:163643] Info : uart_file = /dev/ttyUSB0
[07:28:03:163654] Info : fu_recovery_pins_enabled = false
[07:28:03:163664] Info : fu_connect_to_bootloader = false
[07:28:03:163675] Info : fu_enter_bootloader = false
[07:28:03:163684] Info : restart_cpcd = false
[07:28:03:163694] Info : application_version_validation = false
[07:28:03:163704] Info : print_secondary_versions_and_exit = false
[07:28:03:163714] Info : use_noop_keep_alive = false
[07:28:03:163723] Info : reset_sequence = true
[07:28:03:163733] Info : stats_interval = 0
[07:28:03:163743] Info : rlimit_nofile = 2000
[07:28:03:163754] Info : ENCRYPTION IS DISABLED
[07:28:03:163763] Info : Starting daemon in normal mode
[07:28:03:178122] Info : Connecting to Secondary...
[07:28:03:267244] Info : RX capability is 256 bytes
[07:28:03:267319] Info : Connected to Secondary
[07:28:03:272287] Info : Secondary Protocol v4
[07:28:03:282216] Info : Secondary CPC v4.3.1
[07:28:03:287313] Info : Secondary bus bitrate is 230400
[07:28:03:297293] Info : Secondary APP vUNDEFINED
[07:28:03:297758] Info : Daemon startup was successful. Waiting for client connections
[07:28:04] INFO: Starting zigbeed...
[07:28:05:469119] Info : New client connection using library v4.3.1.0
[07:28:05:475796] Info : Opened connection socket for ep#12
[07:28:05:476362] Info : Endpoint socket #12: Client connected. 1 connections
[07:28:07] INFO: Setup OTBR firewall...
[07:28:07] INFO: Starting otbr-agent...
otbr-agent[328]: [NOTE]-AGENT---: Running 0.3.0
otbr-agent[328]: [NOTE]-AGENT---: Thread version: 1.3.0
otbr-agent[328]: [NOTE]-AGENT---: Thread interface: wpan0
otbr-agent[328]: [NOTE]-AGENT---: Radio URL: spinel+cpc://cpcd_0?iid=2&iid-list=0
otbr-agent[328]: [NOTE]-ILS-----: Infra link selected: eth0
otbr-agent[328]: 55d.12:58:13.748 [C] Platform------: mCpcBusSpeed = 115200
[07:28:07:838472] Info : New client connection using library v4.3.1.0
[07:28:07:844997] Info : Endpoint socket #12: Client connected. 2 connections
Listening on port 9999 for connection...
Accepting connection.
otbr-agent[328]: 00:00:00.127 [N] RoutingManager: BR ULA prefix: fd3e:8467:1471::/48 (loaded)
otbr-agent[328]: 00:00:00.127 [N] RoutingManager: Local on-link prefix: fdff:1111:1122:22ee::/64
otbr-agent[328]: 00:00:00.252 [N] Mle-----------: Role disabled -> detached
otbr-agent[328]: 00:00:00.277 [N] Platform------: [netif] Changing interface state to up.
s6-rc: info: service otbr-agent successfully started
s6-rc: info: service otbr-agent-rest-discovery: starting
s6-rc: info: service otbr-web: starting
s6-rc: info: service otbr-web successfully started
[07:28:10] INFO: Starting otbr-web...
otbr-web[413]: [INFO]-WEB-----: Running 0.3.0
otbr-web[413]: [INFO]-WEB-----: Border router web started on wpan0
otbr-agent[328]: 00:00:00.998 [N] Mle-----------: Role detached -> router
otbr-agent[328]: 00:00:01.001 [N] Mle-----------: Partition ID 0x3eb383d7
otbr-agent[328]: [NOTE]-BBA-----: BackboneAgent: Backbone Router becomes Primary!
otbr-agent[328]: 00:00:01.621 [W] Platform------: [netif] Failed to process request#9: Unknown error -95
otbr-agent[328]: 00:00:01.625 [W] Platform------: [netif] Failed to process request#10: Unknown error -95
Accepted connection 7.
otbr-agent[328]: 00:00:03.581 [N] MeshForwarder-: Dropping IPv6 UDP msg, len:138, chksum:14c7, ecn:no, sec:no, prio:net
otbr-agent[328]: 00:00:03.582 [N] MeshForwarder-: src:[fe80:0:0:0:9847:1086:d2e0:5228]:19788
otbr-agent[328]: 00:00:03.582 [N] MeshForwarder-: dst:[ff02:0:0:0:0:0:0:1]:19788
[07:28:15] INFO: Successfully sent discovery information to Home Assistant.
s6-rc: info: service otbr-agent-rest-discovery successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
otbr-agent[328]: 00:00:07.976 [W] Platform------: radio tx timeout
otbr-agent[328]: 00:00:07.976 [W] Platform------: RCP failure detected
otbr-agent[328]: 00:00:07.977 [W] Platform------: Trying to recover (1/100)
otbr-agent[328]: 00:00:09.806 [N] Platform------: RCP recovery is done
otbr-agent[328]: 00:00:09.883 [N] MeshForwarder-: Dropping IPv6 UDP msg, len:138, chksum:929c, ecn:no, sec:no, prio:net
otbr-agent[328]: 00:00:09.884 [N] MeshForwarder-: src:[fe80:0:0:0:9847:1086:d2e0:5228]:19788
otbr-agent[328]: 00:00:09.885 [N] MeshForwarder-: dst:[ff02:0:0:0:0:0:0:1]:19788
otbr-agent[328]: 00:00:10.002 [W] Platform------: [netif] Failed to process request#11: Unknown error -17
otbr-agent[328]: 00:00:12.133 [W] Platform------: [netif] Failed to process request#12: Unknown error -17
otbr-agent[328]: 00:00:13.926 [N] MeshForwarder-: Dropping (reassembly queue) IPv6 UDP msg, len:171, chksum:6882, ecn:no, sec:yes, error:ReassemblyTimeout, prio:net, rss:-50.0
otbr-agent[328]: 00:00:13.928 [N] MeshForwarder-: src:[fe80:0:0:0:bcf1:e126:7fd8:ce92]:19788
otbr-agent[328]: 00:00:13.929 [N] MeshForwarder-: dst:[ff02:0:0:0:0:0:0:1]:19788
I will update the NanoPi Neo after restarting my laptop; it's like doing a Windows update. Shall I activate the OTBR in it for testing? I have it disabled so it doesn't break the routing in HA, which was not working OK when more than one OTBR was up and running.
Edit: bringing up one more OTBR makes HA lose contact with the Thread devices, as expected, so it's a no-go.
Hi,
I just wanted to provide an update: since using the "hacked" version of this add-on, I have not experienced elevated CPU readings, including when a few add-ons have been updated.
I'd be willing to close this issue, but perhaps let's wait a few more days or a few more add-on updates just to make sure. So far this is going really well. 👍 Thanks @agners
On my production instance, I've noticed that the mDNS announcements for the _meshcop._udp
(OpenThread) service stopped suddenly. This made the "Silicon Labs Multiprotocol" Thread border router instance disappear from the Thread panel. There was nothing unusual in the logs other than the occasional Failed to send IPv6 UDP msg, len:90, chksum:cba6, ecn:no, to:0xffff, sec:no, error:NoAck, prio:net
every couple of hours (this doesn't seem to be a problem for the Thread communication, as my Thread devices all still worked just fine). However, my logs only went back ~3 days or so, so maybe a more serious issue was reported earlier and I did not notice.
So while I did not have the 100% CPU issue, this "lost discovery" is a bit concerning to me. I've restarted the add-on, the Thread instance appears again, and the DNS-SD messages are sent out again. I'll keep monitoring it.
Anyone else seen this?
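One way to check from the outside whether the _meshcop._udp announcements are still being sent is to browse for the service from another Linux machine on the same network. This is a generic mDNS diagnostic (assuming the avahi-utils package is installed on that machine), not something specific to the add-on:
# lists all _meshcop._udp (Thread border router) services currently announced via mDNS
avahi-browse --resolve --terminate _meshcop._udp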
Hi,
I don't think I've experienced this. Checking my logs I do see the Failed to send IPv6 UDP msg, len:90, chksum:cba6, ecn:no, to:0xffff, sec:no, error:NoAck, prio:net messages,
but my SkyConnect has always been listed as a router in the Thread panel. Unless it's intermittent?
Let me know if you would like me to dig deeper.
Just wanted to provide another update, and again more good news: there have been more add-on updates, more time has passed, and I have not experienced elevated CPU usage since installing the hacked version of the Silicon Labs Multiprotocol add-on.
As far as the CPU utilization concern goes, I'd consider it solved. However, I have not been able to reproduce the issue that you've experienced, so I'll leave it up to you whether to close this issue or keep it open.
Thanks again @agners for your help, and please let me know how I can contribute more if needed.
Hi @agners -- I just found this report after some googling while trying to understand why my border router kept disappearing from my Thread panel in HA. I've been experiencing the exact same cluster of symptoms (border router isn't showing up in the Thread panel, currently configured Matter over Thread devices continue functioning, multiprotocol add-on consuming one full CPU core).
Is there any benefit to you having another person pull your custom docker image for testing? I see that you marked #3169 as ready for review a few hours ago so if there's no need for additional testing I'm happy to just wait.
Thanks!
Hi all, I just wanted to follow up on the issue of the disappearing Thread border router device. When it happens, do you have to do anything to get it back, like a reload or something like that?
Maybe the closest thing I've experienced is that when I open the Thread panel it takes a second or two for the border router devices to populate, but they have always repopulated every time I've done it. The SkyConnect was always in the Thread panel.
@itpeters I just merged that commit, so it will soon appear as an add-on update.
As for the disappearing Thread border router from the Thread panel: this seems to be a different issue. I first thought it was a regression from #3169 (since I was running a version with that change), but since you were running the version before (and others reported observing the same while running the old version), it doesn't seem to be a regression.
The disappearing Thread border router warrants another issue. If anyone catches the add-on behaving like that, please post a new issue along with the add-on logs.
I am closing this issue, since it is about the 100% CPU issue, which is expected to be resolved by #3169.
@agners I opened #3189 with what info I have, as I'm able to reproduce the disappearing border router consistently.
I've just had the 100% CPU case in the pure OpenThread Border Router add-on. It seems that in this case the mdnsd is really out of order:
otbr-agent[15369]: [ERR ]-MDNS----: Failed to publish service Home Assistant OpenThread Border Router #5129._meshcop._udp for mdnssd error: Timeout!
otbr-agent[15369]: [WARN]-BA------: Result of publish meshcop service Home Assistant OpenThread Border Router #5129._meshcop._udp.local: MDNS error
Is the add-on using its own mDNS, or is it using the OS one? If I remember right, htop was showing only one mDNS thread/process using 100% CPU. Or is it a version mismatch between the add-on and the OS that is doing it?
The add-on brings its own mDNS support; it uses the mDNSResponder from Apple. If htop was showing mdnsd, then that was the mDNSResponder.
Home Assistant OS itself has systemd-resolved, which performs mDNS and DNS-SD operations, but we don't really make use of that.
Home Assistant Core brings its own mDNS implementation, provided by https://github.com/python-zeroconf/python-zeroconf.
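For anyone trying to pin down which of these is the one spinning, a rough check from the OS shell is to look at the processes inside the add-on container; the container name addon_core_silabs_multiprotocol below is an assumption based on the usual Supervisor naming scheme, so verify it with the first command:
# find the actual multiprotocol add-on container name
docker ps --format '{{.Names}}' | grep -i multiprotocol
# list the processes running inside that container; a runaway mdnsd here points at the bundled mDNSResponder
docker top addon_core_silabs_multiprotocol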
Describe the issue you are experiencing
Hello,
I just recently enabled multi-pan on my SkyConnect on HAOS/RPi and ZHA.
Home Assistant 2023.7.2
Supervisor 2023.07.1
Operating System 10.3
Frontend 20230705.1 - latest
Silicon Labs Multiprotocol: 2.2.0
ZHA
Open Thread Border Router
Thread
Matter 4.8.3
I originally didn't have any Thread or Matter devices (I tried to install a Nanoleaf, but it simply would not work); I have 2 Thread devices now. I do have 5 Google Nest Hubs acting as Thread border routers, and I added my SkyConnect (firmware 4.3) to that network too, so I have a single Thread network with 6 Thread border routers.
I had to change my ZHA channel from 20 to 16.
Every once in a while (almost every 2 days), I notice increased CPU usage of HA, and Glances tells me it's coming from the Silicon Labs Multiprotocol add-on (see towards the end of this CPU plot).
The logs for the add-on (or at least what I could copy and paste) are full of this:
Stopping the add-on and restarting it solves the issue.
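As a side note, when the CPU spikes, a quick way to confirm (independently of Glances) which container the load is coming from is per-container CPU from the host shell; this is just a generic Docker check, and the grep pattern below is only a guess at the container name:
# per-container CPU usage; the runaway add-on container should stand out
docker stats --no-stream
# map the container name back to the add-on (name pattern is an assumption)
docker ps --format '{{.Names}}' | grep -i multiprotocol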
What type of installation are you running?
Home Assistant OS
Which operating system are you running on?
Home Assistant Operating System
Which add-on are you reporting an issue with?
Silicon Labs Multiprotocol
What is the version of the add-on?
2.2.0
Steps to reproduce the issue
...
System Health information
System Information
Home Assistant Community Store
GitHub API | ok
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 5000
Installed Version | 1.32.1
Stage | running
Available Repositories | 1337
Downloaded Repositories | 20

Home Assistant Cloud
logged_in | true
subscription_expiration | August 21, 2023 at 6:00 PM
relayer_connected | true
relayer_region | us-east-1
remote_enabled | true
remote_connected | true
alexa_enabled | false
google_enabled | true
remote_server | us-east-1-7.ui.nabu.casa
certificate_status | ready
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor
host_os | Home Assistant OS 10.3
update_channel | beta
supervisor_version | supervisor-2023.07.1
agent_version | 1.5.1
docker_version | 23.0.6
disk_total | 457.7 GB
disk_used | 23.7 GB
healthy | true
supported | true
board | rpi4-64
supervisor_api | ok
version_api | ok
installed_addons | Advanced SSH & Web Terminal (15.0.3), Home Assistant Google Drive Backup (0.111.1), File editor (5.6.0), Samba share (10.0.2), AdGuard Home (4.8.11), Z-Wave JS UI (1.14.0), Glances (0.19.2), AppDaemon (0.13.1), Grafana (9.0.2), ESPHome (2023.6.5), Whisper (0.2.1), Piper (0.1.3), SQLite Web (3.9.0), Matter Server (4.8.3), Silicon Labs Multiprotocol (2.2.0)

Dashboards
dashboards | 11
resources | 14
views | 24
mode | storage

Recorder
oldest_recorder_run | July 5, 2023 at 1:12 PM
current_recorder_run | July 16, 2023 at 11:03 PM
estimated_db_size | 1271.98 MiB
database_engine | sqlite
database_version | 3.41.2

Spotify
api_endpoint_reachable | ok

Anything in the Supervisor logs that might be useful for us?
No response
Anything in the add-on logs that might be useful for us?
No response
Additional information
No response