home-assistant / addons

Z-Wave JS failure #3234

Closed ablyes closed 11 months ago

ablyes commented 1 year ago

Describe the issue you are experiencing

Since updating to the latest version of Home Assistant, my Z-Wave devices are not working.

Home Assistant 2023.9.3
Supervisor 2023.09.2
Operating System 10.5
Frontend 20230911.0 - latest
Z-Wave JS current version: 0.1.93

I have this error in the logs:

2023-09-27T16:30:01.702Z CNTRLR   [Node 026] ping failed: The node did not acknowledge the command (ZW0204)
2023-09-27T16:30:04.424Z DRIVER     no handlers registered!
2023-09-27T16:31:09.423Z CNTRLR   The controller is unresponsive
2023-09-27T16:31:09.427Z DRIVER   Attempting to recover unresponsive controller...
2023-09-27T16:31:09.518Z CNTRLR   The controller does not support soft reset or the soft reset feature has been 
                                  disabled with a config option or the ZWAVEJS_DISABLE_SOFT_RESET environment va
                                  riable.
2023-09-27T16:31:09.521Z DRIVER   Recovering unresponsive controller failed. Restarting the driver...
Error in driver ZWaveError: Recovering unresponsive controller failed. Restarting the driver... (ZW0100)
    at Driver.destroyWithMessage (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:2769:17)
    at fail (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:3484:14)
    at /usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:3533:5
    at runNextTicks (node:internal/process/task_queues:60:5)
    at processTimers (node:internal/timers:509:9) {
  code: 100,
  context: undefined,
  transactionSource: undefined
}
Shutting down
Closing server...
2023-09-27T16:31:09.541Z CNTRLR   [Node 017] Assigning SUC return route failed: Timeout while waiting for a call
                                  back from the controller (ZW0200)
Client disconnected
Code 1000: 
Server closed
[16:31:10] WARNING: Halt add-on
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
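
One way to see where the time goes in a log like the one above is to diff consecutive timestamps. The following is a minimal Python sketch, not part of Z-Wave JS: the helper names `parse_line` and `gaps` and the 10-second reporting threshold are illustrative assumptions, and the sample lines are copied from the log above.

```python
from datetime import datetime

# Timestamped lines copied from the add-on log above.
LOG = """\
2023-09-27T16:30:01.702Z CNTRLR   [Node 026] ping failed: The node did not acknowledge the command (ZW0204)
2023-09-27T16:30:04.424Z DRIVER     no handlers registered!
2023-09-27T16:31:09.423Z CNTRLR   The controller is unresponsive
2023-09-27T16:31:09.427Z DRIVER   Attempting to recover unresponsive controller...
"""


def parse_line(line):
    """Return (timestamp, message) or None if the line has no leading ISO timestamp."""
    stamp, _, rest = line.partition(" ")
    try:
        ts = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        # Stack traces, wrapped lines, and s6 messages carry no timestamp.
        return None
    return ts, rest.strip()


def gaps(log_text, min_seconds=10.0):
    """List (seconds, message_before, message_after) for every long silence.

    min_seconds is a hypothetical reporting threshold, not a driver setting.
    """
    entries = [e for e in map(parse_line, log_text.splitlines()) if e]
    found = []
    for (t0, msg0), (t1, msg1) in zip(entries, entries[1:]):
        delta = (t1 - t0).total_seconds()
        if delta >= min_seconds:
            found.append((delta, msg0, msg1))
    return found


for seconds, before, after in gaps(LOG):
    print(f"{seconds:.3f}s between '{before}' and '{after}'")
```

On these four lines the only long silence is the roughly 65 seconds between "no handlers registered!" and "The controller is unresponsive".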

What type of installation are you running?

Home Assistant Supervised

Which operating system are you running on?

Home Assistant Operating System

Which add-on are you reporting an issue with?

Z-Wave JS

What is the version of the add-on?

0.1.93

Steps to reproduce the issue

System Health information

Message (translated from French): "No fix is currently available."

Anything in the Supervisor logs that might be useful for us?

No response

Anything in the add-on logs that might be useful for us?

No response

Additional information

No response

mirkin-pixel commented 1 year ago

Hmm, ok. I removed the failed node (as test) and now it works perfectly.

valorp commented 1 year ago

I'm glad to find some friends with my same issue.

Here are my logs, very similar, but nothing I've tried seems to fix it. It was working fine until the upgrade to 2023.9.3.

2023-09-30 00:37:03.204 DEBUG SOCKET: Event INITED emitted to T95e3UyTCYRSy28_AAAH
2023-09-30 00:37:04.111 INFO Z-WAVE: Controller status: Controller is unresponsive
2023-09-30 00:37:04.134 INFO Z-WAVE: Controller status: Driver: Recovering unresponsive controller failed. Restarting the driver... (ZW0100)
2023-09-30 00:37:04.140 INFO Z-WAVE: Restarting client in 1 seconds, retry 1
2023-09-30 00:37:05.166 INFO Z-WAVE-SERVER: Client disconnected
2023-09-30 00:37:05.198 INFO Z-WAVE-SERVER: Server closed
2023-09-30 00:37:05.205 INFO Z-WAVE: Client closed
2023-09-30 00:37:05.242 INFO Z-WAVE: Connecting to /dev/serial/by-id/usb-Silicon_Labs_HubZ_Smart_Home_Controller_C13021C0-if00-port0
2023-09-30 00:37:05.264 INFO Z-WAVE: Setting user callbacks
Logging to file:
/data/store/logs/zwavejs_2023-09-30.log
2023-09-30 00:37:11.710 INFO Z-WAVE: Z-Wave driver is ready
2023-09-30 00:37:11.735 INFO Z-WAVE: Controller status: Driver ready

I just see this repeated over and over until the add-on crashes.

HA: 2023.9.3 ZUI: 2.0.1 USB Stick: Nortek USBZB1, running v 4.35 or similar

AlCalzone commented 1 year ago

https://github.com/zwave-js/node-zwave-js/issues/6341

cadwizzard commented 1 year ago

I've reverted to a backup now (everything is working perfectly again), but when I saw this behaviour (Aeotec 5+ stick), I also tried replugging the stick and disabling soft reset in Z-Wave JS UI. No change. I do have one known dead node on my network as well, but it has never caused an issue before and has been that way for months. (It's a Yale lock on an external remote gate, which is a pain to remove. If these locks aren't removed with network exclusion, it's a known issue that they can't be rejoined, hence it's still there and not deleted from HA for now.)

indi81 commented 1 year ago

Same issue here. Aeotec Z-Stick Gen5 with FW 6.07 (Gen5+). Started happening sometime last week.

Fixed this for now by moving over to the Z-Wave JS UI add-on and disabling soft reset. I'm also seeing multiple reset issues with the Zigbee adapter, so my guess is that something is broken in the core here.

https://www.home-assistant.io/integrations/zwave_js/#how-do-i-switch-between-the-official-z-wave-js-add-on-and-the-z-wave-js-ui-add-on

How do I switch between the official Z-Wave JS add-on and the Z-Wave JS UI add-on? Switching does not require renaming your devices.

  1. Disable the Z-Wave integration. Do not remove the Z-Wave integration or you will lose all device and entity naming. This will automatically stop the official Z-Wave JS add-on.
  2. Note your network security keys from the official add-on.
  3. Install and configure the Z-Wave JS UI add-on, including setting the location of your Z-Wave device and the network security keys.
  4. Add the Z-Wave integration again (even though it is still installed), and uncheck the “Use the Z-Wave JS Supervisor add-on”. Enter the correct address for the community add-on in the URL field in the next step.
  5. Uninstall the official Z-Wave JS add-on.
  6. Enable the Z-Wave integration.

psychogun commented 1 year ago

I am experiencing the same issue. I was running the official Z-Wave JS add-on, with HA on an XCP-ng installation. I have the AEON Labs Z-Stick Gen5 USB Controller. I figured I had to go that route and install the Z-Wave JS UI add-on instead, as it gave me some hope that disabling soft reset would do the trick. It did not. I then updated the firmware on the stick, as that was also reported to work. The stick now reports FW: v1.2, SDK: v6.81.6; however, I am still experiencing the same issue.

Driver ready. Error: Driver: Recovering unresponsive controller failed. Restarting the driver... (ZW0100)

Why is it saying ZW0100, when the Product code is listed as ZW090? It is the Aeotec Gen5, I think, not the +?

AEON Labs | Z‐Stick Gen5 USB Controller | ZW090

AlCalzone commented 1 year ago

@psychogun I'll need more complete driver logs (level debug) then.

Why is it saying ZW0100

That's Z-Wave JS's error code, has nothing to do with the device label.

bgarderhagen commented 1 year ago

What is the best way to remove a Z-Wave device from the network, now that it's this unstable? I would like to wait for a future patch and leave everything on current versions, but I need to move my front door from the Z-Wave module in my ID Lock to a Zigbee module I have available.

Should I delete the device in Home Assistant, move the Z-Wave dongle to a different computer, and do a controlled removal from the network configuration kept in the dongle?

What do any of you suggest?

indi81 commented 1 year ago

@psychogun

Did you also follow the switchover instructions? You have to disable the core ZWave JS server and use the one that comes with ZWave JS UI. Having them both active and contending for the serial port doesn't help.

psychogun commented 1 year ago

Yes; I disabled ZWAVE JS etc. by following the instructions.

Everything is back to normal now: reverting addon_core_zwave_js to 0.1.89 fixed it for me, at least.

Home Assistant 2023.9.3
Supervisor 2023.09.2
Operating System 10.5
Frontend 20230911.0 - latest

valorp commented 1 year ago

zwave-js/node-zwave-js#6341

Thanks for the link. I tried all of those things with no luck.

The only thing that resolved it was restoring from a backup to Core 2023.08 and ZWaveJS v1.87.

AlCalzone commented 1 year ago

I tried all of those things with no luck.

Got driver logs?

hedrickbt commented 1 year ago

I had the same issues with the same versions and original gen5 z-stick.

The only thing that worked for me.

  1. Updated z-stick firmware
  2. Disable zwave js add on
  3. Install zwave js ui add on - transfer configuration by hand from zwave js
  4. Point zwave integration to zwave js ui host and port.

As a bonus I have had issues when restarting/upgrading that I would have to reinterview about half of my devices (maybe 12) every single time. Zwave js ui doesn't seem to have that issue. Also much more full featured.

Spazpeker commented 1 year ago

My system was totally unusable. I managed to update my Z-Stick 7 to FW v7.19.4; I got the firmware from the Silicon Labs GitHub, and it seems to be working now.

Quarco commented 1 year ago

Struggling here to get my Z-Wave network up and running again. It started with the last update. Updating to 9.0.1 didn't change a thing. I'm using an Aeotec Gen5 Z-Stick. I ordered a Gen7 stick, but just read the comment from @Spazpeker, so I'm afraid I'll be living in hell for the coming weeks getting everything back to normal again :(

Spazpeker commented 1 year ago

https://github.com/SiliconLabs/gecko_sdk/tree/gsdk_4.2/protocol/z-wave/Apps/bin/gbl for version 7.19.4. Mine didn't brick, but yours might.

akajester commented 1 year ago

Can you fill in the blanks on how to do steps 3 (manual config transfer) and 4 from @hedrickbt's list? I'm running HASSIO. Every time I update, Z-Wave stops working now. I did the firmware upgrade on my Gen5 stick, but I doubt that alone will fix it. Thanks.

Home Assistant 2023.5.4
Supervisor 2023.09.2
Operating System 10.5
Frontend 20230503.3 - latest

valorp commented 1 year ago

Can you fill in the blanks on how to do step 3 (manual config transfer)

This guide takes care of all that for you https://community.home-assistant.io/t/switching-z-wave-js-addons-with-minimal-downtime-z-wave-js-official-to-z-wave-js-ui-community/409904

FWIW, ZWave JS 1.9x and ZWave JS UI v2.0.1 both have the same issue for me. I had to revert to HA2023.8 and ZWaveJS 1.87

valorp commented 1 year ago

I tried all of those things with no luck.

Got driver logs?

Yes, attached. HA 2023.9 + ZUI 2.0.1--FULL SILLY.txt HA 2023.9 + ZUI 2.0.1.txt

hedrickbt commented 1 year ago

Step 3: Stop Z-Wave JS and install the Z-Wave JS UI add-on. Grab the security keys from the Z-Wave JS settings and copy them to Z-Wave JS UI. Make sure the Z-Wave device is selected. Now make sure Z-Wave JS UI can start.

Step 4: Under Devices & Services, find Z-Wave. You will want to configure it, uncheck the "use the supervisor" option, and update ws://hostname:3000 to use the hostname listed in the Z-Wave JS UI configuration.

tboudri commented 1 year ago

I am running HAOS 10.5 in a UTM VM with Supervisor 2023.09.3. When I try to install the latest version of Z-Wave JS (0.1.93), the VM crashes. After rebooting, there is nothing in the log. When the VM crashes, I get the following message:

QEMU error: QEMU exited from an error: Assertion failed: (p->stream || QTAILQ_FIRST(&ep->queue)==p), function usb_packet_complete_one, file core.c, line 470.

I use an Aeotec Gen5+ USB stick. When the USB stick is not connected, the upgrade succeeds, but as soon as I try to use the Z-Wave stick, the VM crashes. I have tried this many times, even with older VM backups, but always with the same result.

thenoid commented 1 year ago

Seems I'm being bitten by this same bug, and I'm trying to decipher which Docker container to roll back to. I tried to update my firmware, but Windows 11 Pro doesn't seem compatible with the Windows drivers :(

ablyes commented 1 year ago

The last version 0.1.94 doesn’t resolve the issue.

psychogun commented 1 year ago

The last version 0.1.94 doesn’t resolve the issue.

That is my experience, as well!

tboudri commented 1 year ago

No, it still does not work. I'm not sure if the problem is Z-Wave JS. It could also be QEMU related. I have tried installing HAOS on Parallels Desktop. Although the VM did not crash, I found USB not very reliable on Parallels; maybe that's because I was not able to install Parallels Tools.

kpine commented 1 year ago

The last version 0.1.94 doesn’t resolve the issue.

Did you disable soft-reset? If so, can you provide driver logs of the error?

mariuszlabedzki commented 1 year ago

I had the same issues and tried everything, but only after a firmware upgrade did everything start working (Aeotec stick).

ablyes commented 1 year ago

I'll try to provide the debug logs soon. I don't want to upgrade the stick. Aeotec are clowns: they release an update and say that if you brick the stick, that's on you. I'll keep version 0.1.90 for a while, I think.

AlCalzone commented 1 year ago

@valorp

Doesn't look like soft-reset is your problem. Your stick is just slow. Z-Wave JS tries to assign a route from a node back to the controller. This is a 3-step process:

  1. ZJS -> Controller: Assign a route
  2. Controller -> ZJS: "I started"
  3. later: Controller -> ZJS: "I'm done"

From 1. to 2. normally takes in the order of <500 ms, often much less (~10 ms), 2. to 3. is allowed to take up to a minute. In your case, 1. to 2. takes roughly 15 seconds (!), but Z-Wave JS thinks the controller is unresponsive after 10 seconds and tries to restart it.

This timeout is configurable (up to 20s), but I'm not sure HA exposes a way to do this.
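
To make the timing above concrete, here is a toy Python model of the two waiting periods. It is only a sketch of the numbers quoted in this comment (10 s before the controller counts as unresponsive, a 20 s upper bound for that setting, up to a minute for the final callback). The constant names and the `classify` function are hypothetical and are not the Z-Wave JS implementation or its option names.

```python
# Values taken from the explanation above; the names are illustrative only.
RESPONSE_TIMEOUT_S = 10.0      # wait for "I started" (step 1 -> 2)
MAX_RESPONSE_TIMEOUT_S = 20.0  # upper bound mentioned for the configurable timeout
CALLBACK_TIMEOUT_S = 60.0      # wait for "I'm done" (step 2 -> 3)


def classify(response_delay_s, callback_delay_s,
             response_timeout_s=RESPONSE_TIMEOUT_S):
    """Classify one 'assign route' exchange in this simplified model.

    response_delay_s: seconds between the command and the controller's "I started".
    callback_delay_s: seconds between "I started" and "I'm done".
    """
    if not 0 < response_timeout_s <= MAX_RESPONSE_TIMEOUT_S:
        raise ValueError("response timeout must be in (0, 20] seconds")
    if response_delay_s > response_timeout_s:
        # In the thread, this is the point where the driver tries to recover
        # the controller.
        return "unresponsive"
    if callback_delay_s > CALLBACK_TIMEOUT_S:
        return "callback timeout"
    return "ok"


# A stick that needs about 15 s to acknowledge trips the 10 s default ...
print(classify(15.0, 5.0))
# ... but not a timeout raised to the 20 s maximum.
print(classify(15.0, 5.0, response_timeout_s=20.0))
```

In this model a 15-second acknowledgement is classified as unresponsive with the default and as ok once the timeout is raised to 20 seconds, which is why a configurable timeout can work around a slow stick.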

tboudri commented 1 year ago

This is becoming a serious problem. I upgraded today to the latest core and supervisor. After upgrading, I was not able to use Home Assistant with Z-Wave JS, although Z-Wave JS is still at version 0.1.90. UTM crashes immediately.

ablyes commented 1 year ago

Now that i moved from jeedom... i'm afraid and don't want to update/upgrade my installation. Zwave is the heart of my installation. I can't trust any update now.

valorp commented 1 year ago

@AlCalzone -- many thanks. This has never been a problem, and v1.87 of the official HA ZWaveJS add-on works fine. I just ordered a new Zooz Z-Wave stick, so we'll see if that resolves things.

Thanks again.

tboudri commented 1 year ago

@ablyes,

Make a copy of your VM before updating. (do not clone the VM because that's just a reference)

cadwizzard commented 1 year ago

After trying all sorts (including disabling soft reset, unplugging the stick, and updating to JS UI 2.0.2), I'm now pretty convinced that if you have a dead node on your network, you can still suffer constant driver restarts. https://community.home-assistant.io/t/upgrades-today-have-caused-constant-unresponsive-warnings-resulting-in-repetitive-driver-restarts-and-z-wave-interruptions/619424/75?u=cadwizzard

irgendwer112 commented 1 year ago

After trying all sorts (including disabling soft reset, unplugging the stick, and updating to JS UI 2.0.2), I'm now pretty convinced that if you have a dead node on your network, you can still suffer constant driver restarts. https://community.home-assistant.io/t/upgrades-today-have-caused-constant-unresponsive-warnings-resulting-in-repetitive-driver-restarts-and-z-wave-interruptions/619424/75?u=cadwizzard

Agree. Problems occur after recognizing a dead node. And even with 0.1.90 now the error occurs. With HA core_2023.9.3 Z-Wave JS 0.1.90 is still working.

tboudri commented 1 year ago

I also had some dead nodes. But after removing these nodes, UTM still crashes after upgrading to core 2023.10.0 or Z-Wave JS 0.1.94. Another thing I tried: I have a second Aeotec Gen5+ stick as a backup. I reset it to factory defaults and tried to add it to Home Assistant as a new device. But then the VM crashes immediately, even with the configuration that was working with my regular Aeotec Gen5+ stick.

I don't think UTM is to blame, because I also tried Parallels Desktop and found it did not work very reliably. In fact, the problem was the same, although the VM did not crash: as soon as I tried to use the Z-Wave stick, all USB devices that were connected to the VM disappeared.

AlCalzone commented 1 year ago

@valorp We're adding a configuration option to the zwave addon, which allows you to increase that timeout and work around the issue.

@cadwizzard Do you have a driver log of that behavior on loglevel debug?

cadwizzard commented 1 year ago

@cadwizzard Do you have a driver log of that behavior on loglevel debug?

Sorry, I don't have a debug log, only what I posted here, but I still have the full log at that level; it was just too large to post: https://community.home-assistant.io/t/upgrades-today-have-caused-constant-unresponsive-warnings-resulting-in-repetitive-driver-restarts-and-z-wave-interruptions/619424/75?u=cadwizzard

This is the log from pre softreset disable and version 2.0.1. It looks like there are maybe 2 seconds elapsed between everything fine and the driver gets restarted. Behaviour never occurs pre 2.0.0:

2023-10-04T18:28:45.516Z CNTRLR « [Node 024] ping successful
2023-10-04T18:28:48.666Z CNTRLR « [Node 004] ping successful
2023-10-04T18:28:51.092Z CNTRLR « [Node 113] ping successful
2023-10-04T18:28:51.477Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                                  n 100 ms.
2023-10-04T18:29:00.207Z DRIVER     no handlers registered!
2023-10-04T18:30:05.216Z CNTRLR   The controller is unresponsive
2023-10-04T18:30:05.221Z DRIVER   Attempting to recover unresponsive controller...
2023-10-04T18:30:05.292Z CNTRLR   Performing soft reset...
2023-10-04T18:30:05.324Z CNTRLR   Waiting for the controller to reconnect...
2023-10-04T18:30:06.825Z CNTRLR   Re-opening serial port...
2023-10-04T18:30:07.832Z CNTRLR   Waiting for the Serial API to start...
2023-10-04T18:30:08.201Z CNTRLR   Serial API started
2023-10-04T18:30:08.202Z CNTRLR   The controller is no longer unresponsive
2023-10-04T18:30:08.518Z DRIVER     no handlers registered!
2023-10-04T18:31:13.518Z DRIVER   Controller is still timing out. Restarting the driver...
2023-10-04T18:31:13.528Z CNTRLR   [Node 014] Assigning SUC return route failed: Timeout while waiting for a call
                                  back from the controller (ZW0200)
2023-10-04T18:31:14.611Z DRIVER   ███████╗ ██╗    ██╗  █████╗  ██╗   ██╗ ███████╗             ██╗ ███████╗
                                  ╚══███╔╝ ██║    ██║ ██╔══██╗ ██║   ██║ ██╔════╝             ██║ ██╔════╝
                                    ███╔╝  ██║ █╗ ██║ ███████║ ██║   ██║ █████╗   █████╗      ██║ ███████╗
                                   ███╔╝   ██║███╗██║ ██╔══██║ ╚██╗ ██╔╝ ██╔══╝   ╚════╝ ██   ██║ ╚════██║
                                  ███████╗ ╚███╔███╔╝ ██║  ██║  ╚████╔╝  ███████╗        ╚█████╔╝ ███████║
                                  ╚══════╝  ╚══╝╚══╝  ╚═╝  ╚═╝   ╚═══╝   ╚══════╝         ╚════╝  ╚══════╝
2023-10-04T18:31:14.612Z DRIVER   version 12.0.0

AlCalzone commented 1 year ago

@cadwizzard which controller do you have? Aeotec Gen5? It looks like you're running into this firmware bug where the controller responds with the wrong command, triggering Z-Wave JS's new "unresponsive controller" detection.

cadwizzard commented 1 year ago

Exactly, Gen5+

Is there currently some mitigation plan in mind for this?

AlCalzone commented 1 year ago

Yeah I'm working on it. Hope to get the fix released today or tomorrow depending on how long the kids sleep later 😇

cadwizzard commented 1 year ago

Yeah I'm working on it. Hope to get the fix released today or tomorrow depending on how long the kids sleep later 😇

Brilliant. Thought you might be ;) Once released and installed i'll report back here quickly

(Massive thanks for the work you have put into z wave js btw. I moved over from mi casa verde (Vera), and wouldn't have without z-wave support. 95% of my smarthome is z wave).

Two things. Not everyone with the same controller seems affected by this bug, so I wonder what's different (except that I have a dead node): https://community.home-assistant.io/t/upgrades-today-have-caused-constant-unresponsive-warnings-resulting-in-repetitive-driver-restarts-and-z-wave-interruptions/619424/79?u=cadwizzard

And I wonder if this firmware glitch is at all related to this too (because it's a glitch somewhere outside ZJS or ZUI): https://community.home-assistant.io/t/smartstart-creating-multiple-nodes-but-not-including-properly/622100/2

AlCalzone commented 1 year ago

Two things, not everyone with the same controller seems affected by this bug, so wonder whats different (except i have a dead node):

So far I've only seen this happen when the command fails, so the dead node might be the reason.

cadwizzard commented 1 year ago

That would tie in with people rebuilding their networks from scratch and the problem going away, despite everything else hardware wise being the same. All well and good with a few nodes..... not so much with around 60, and many hours of automations built around them :D

sptr112 commented 1 year ago

Same issue with a dead node causing a reboot loop. Using a RazBerry 2 (unclear on exact model) HAT on a Pi 3B. The error started when upgrading to the latest version of Home Assistant.

Home Assistant 2023.9.3 Supervisor 2023.10.0 Operating System 10.5 Frontend 20230911.0 - latest

Driver version: 12.0.2 Server version: 1.32.1 Z-Wave JS: 0.1.94

2023-10-05T15:35:49.173Z CNTRLR « [Node 015] ping successful
New client
2023-10-05T15:35:59.894Z CNTRLR [Node 014] The node did not respond after 1 attempts, it is presumed dead
2023-10-05T15:35:59.898Z CNTRLR [Node 014] The node is dead.
2023-10-05T15:35:59.899Z CNTRLR All nodes are ready to be used
2023-10-05T15:35:59.912Z CNTRLR [Node 014] ping failed: The node did not acknowledge the command (ZW0204)
2023-10-05T15:36:06.616Z DRIVER no handlers registered!
2023-10-05T15:37:11.620Z CNTRLR The controller is unresponsive
2023-10-05T15:37:11.627Z DRIVER Attempting to recover unresponsive controller...
2023-10-05T15:37:11.734Z CNTRLR Performing soft reset...
2023-10-05T15:37:11.759Z CNTRLR Waiting for the controller to reconnect...
2023-10-05T15:37:13.263Z CNTRLR Waiting for the Serial API to start...
2023-10-05T15:37:18.265Z CNTRLR Did not receive notification that Serial API has started, checking if it responds...
2023-10-05T15:37:18.294Z CNTRLR Serial API responded
2023-10-05T15:37:18.295Z CNTRLR The controller is no longer unresponsive
2023-10-05T15:37:18.315Z CNTRLR Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T15:37:19.551Z DRIVER no handlers registered!
2023-10-05T15:38:24.549Z DRIVER Controller is still timing out. Restarting the driver...
Error in driver ZWaveError: Controller is still timing out. Restarting the driver... (ZW0100)
    at Driver.destroyWithMessage (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:2769:17)
    at fail (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:3484:14)
    at Driver.handleUnresponsiveController (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:3493:4)
    at Driver.handleFailedTransaction (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:5521:13)
    at Driver.drainTransactionQueue (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:4613:10) {
  code: 100,
  context: undefined,
  transactionSource: undefined
}
Shutting down
Closing server...
2023-10-05T15:38:24.589Z CNTRLR [Node 018] Assigning SUC return route failed: Timeout while waiting for a callback from the controller (ZW0200)
Client disconnected
Code 1000:
Server closed
[15:38:24] WARNING: Halt add-on

cadwizzard commented 1 year ago

It looks like you have soft reset still enabled. That resolved it for some...... but not everyone (including me), I see you also have a dead node notification.

On the plus side, I have way more confidence in backup restores than I would have had previously :)

sptr112 commented 1 year ago

Tried disabling soft reset; it didn't solve the problem.

mattster98 commented 1 year ago

the dead node might be the reason

I'm beginning to think that's the main issue.. but why was it OK before and not now? I've rolled back to 0.1.90 since it is still working in general. Tried newer versions with soft reset disabled with no difference.

Additionally, I've got two nodes that are unreachable (not even sure what they are/were) but I cannot remove them.. so I'm a bit stuck if that's part of the solution. They both result in a Timeout waiting for an ACK from the controller. It seems like it tries to ping the node I'm telling it is dead/unreachable but it tries to ping it anyway and that leads to the errors. One of the nodes is status: "unknown", the other is "alive" but clearly not.. I'm not sure how it's coming to that conclusion.

Here's node 21 (the unknown one):

2023-10-05T15:56:43.549Z CNTRLR » [Node 021] pinging the node...
2023-10-05T15:56:53.561Z CNTRLR No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T15:56:54.667Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2023-10-05T15:56:56.775Z CNTRLR [Node 021] ping failed: Timeout while waiting for an ACK from the controller (ZW0200)
2023-10-05T15:56:57.784Z CNTRLR Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T15:56:58.889Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
Z-Wave error ZWaveError: Timeout while waiting for an ACK from the controller (ZW0200)
    at Driver.sendMessage (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:4889:23)
    at ZWaveController.removeFailedNodeInternal (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5372:36)
    at ZWaveController.removeFailedNode (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5339:3)
    at Function.handle (/usr/src/node_modules/@zwave-js/server/dist/lib/controller/message_handler.js:72:17)
    at Client.receiveMessage (/usr/src/node_modules/@zwave-js/server/dist/lib/server.js:119:62) {
  code: 200,
  context: 'ACK',
  transactionSource: ' at Driver.sendMessage (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:4889:23)\n' +
    ' at ZWaveController.removeFailedNodeInternal (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5372:36)\n' +
    ' at ZWaveController.removeFailedNode (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5339:3)\n' +
    ' at Function.handle (/usr/src/node_modules/@zwave-js/server/dist/lib/controller/message_handler.js:72:17)\n' +
    ' at Client.receiveMessage (/usr/src/node_modules/@zwave-js/server/dist/lib/server.js:119:62)'
}
2023-10-05T15:58:55.285Z DRIVER unexpected response, discarding...
2023-10-05T15:58:59.407Z DRIVER no handlers registered!

Here's node 33, the allegedly alive node.. I tried pinging and interview and it makes no progress:

2023-10-05T16:00:42.676Z CNTRLR » [Node 033] pinging the node...
2023-10-05T16:00:52.691Z CNTRLR No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T16:00:53.797Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2023-10-05T16:00:55.904Z CNTRLR [Node 033] ping failed: Timeout while waiting for an ACK from the controller (ZW0200)
2023-10-05T16:01:25.966Z CNTRLR » [Node 038] Meter CC values may be stale, refreshing...
2023-10-05T16:01:25.968Z CNTRLR » [Node 038] querying meter value (type = Electric, scale = kWh)...
2023-10-05T16:01:26.980Z CNTRLR Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T16:01:28.087Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2023-10-05T16:01:31.207Z CNTRLR Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T16:01:32.313Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2023-10-05T16:01:32.605Z CNTRLR [Node 033] Beginning interview - last completed stage: None
2023-10-05T16:01:32.605Z CNTRLR [Node 033] new node, doing a full interview...
2023-10-05T16:01:32.607Z CNTRLR » [Node 033] querying protocol info...
2023-10-05T16:01:35.434Z CNTRLR Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T16:01:36.544Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2023-10-05T16:01:38.651Z CNTRLR [Node 038] failed to refresh values for Meter CC: Timeout while waiting for an ACK from the controller (ZW0200)
2023-10-05T16:01:39.661Z CNTRLR Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T16:01:40.767Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2023-10-05T16:01:42.873Z CNTRLR [Node 033] Error during node interview: Timeout while waiting for an ACK from the controller (ZW0200)
2023-10-05T16:02:54.412Z DRIVER unexpected response, discarding...
2023-10-05T16:02:58.531Z DRIVER no handlers registered!

..then tried to remove dead node:

2023-10-05T16:08:05.861Z CNTRLR » [Node 033] pinging the node...
2023-10-05T16:08:15.877Z CNTRLR No response from controller after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T16:08:16.983Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
2023-10-05T16:08:19.091Z CNTRLR [Node 033] ping failed: Timeout while waiting for an ACK from the controller (ZW0200)
2023-10-05T16:08:20.102Z CNTRLR Failed to execute controller command after 1/3 attempts. Scheduling next try in 100 ms.
2023-10-05T16:08:21.209Z CNTRLR Failed to execute controller command after 2/3 attempts. Scheduling next try in 1100 ms.
Z-Wave error ZWaveError: Timeout while waiting for an ACK from the controller (ZW0200)
    at Driver.sendMessage (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:4889:23)
    at ZWaveController.removeFailedNodeInternal (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5372:36)
    at runNextTicks (node:internal/process/task_queues:60:5)
    at processTimers (node:internal/timers:509:9)
    at ZWaveController.removeFailedNode (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5339:3)
    at Function.handle (/usr/src/node_modules/@zwave-js/server/dist/lib/controller/message_handler.js:72:17)
    at Client.receiveMessage (/usr/src/node_modules/@zwave-js/server/dist/lib/server.js:119:62) {
  code: 200,
  context: 'ACK',
  transactionSource: ' at Driver.sendMessage (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:4889:23)\n' +
    ' at ZWaveController.removeFailedNodeInternal (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5372:36)\n' +
    ' at runNextTicks (node:internal/process/task_queues:60:5)\n' +
    ' at processTimers (node:internal/timers:509:9)\n' +
    ' at ZWaveController.removeFailedNode (/usr/src/node_modules/zwave-js/src/lib/controller/Controller.ts:5339:3)\n' +
    ' at Function.handle (/usr/src/node_modules/@zwave-js/server/dist/lib/controller/message_handler.js:72:17)\n' +
    ' at Client.receiveMessage (/usr/src/node_modules/@zwave-js/server/dist/lib/server.js:119:62)'
}

mattster98 commented 1 year ago

Well, fudge. I've been trying to find updated firmware for my Homeseer SmartStick+ G2 and found a windows tool for that that I was hoping would also let me manually delete the bad nodes from the controller. Unfortunately it has a "reset" button I tried with no tooltip or confirmation that I assumed was a soft reset. It was not. I'll just be over here rebuilding my network now. Don't mind me. 😢

cadwizzard commented 1 year ago

Maybe you manually created an NVM backup at some point you can restore?