zwave-js / node-zwave-js

Z-Wave driver written entirely in JavaScript/TypeScript
https://zwave-js.github.io/node-zwave-js/
MIT License
750 stars 599 forks source link

Nodes recently becoming unavailable repeatedly #6354

Closed Maes152 closed 1 year ago

Maes152 commented 1 year ago

Is your problem within Home Assistant (Core or Z-Wave JS Integration)?

NO, my problem is NOT within Home Assistant or the ZWave JS integration

Is your problem within Z-Wave JS UI (formerly ZwaveJS2MQTT)?

NO, my problem is NOT within Z-Wave JS UI

Checklist

Describe the bug

For a while now, although I am not sure since what zwavejs version update, I am finding some of my nodes unavailable. This presumably happens after interacting with them. I'm not completely sure since 99% of the interactions go through automations and I am having trouble replicating this behavior manually. For this reason I'm not able to provide accurate logs at this time. I've enabled the logging to file and will upload these logs the moment I find another node unavailable.

I have been using this exact setup for a couple of years and have never had any issues with it. It seems to me that the zwavejs driver is perhaps a little bit more "aggressive" in marking a node as unavailable in case a packet isn't received on time. The only way for me to get the node back online is to re-interview it twice.

I realize this is not a lot of information to go on but I was wondering whether this seems like familiar behavior to anyone and if yes what the recommended course of action would be?

Device information

Manufacturer: Aeotec Model name: Z-Stick Gen5 ZW090 / FW1.2 Node ID in your network: Multiple

How are you using node-zwave-js?

Which branches or versions?

version: node-zwave-js branch: 12.0.0 / HA Addon ZwaveJS2MQTT zwave-js-ui branch: 9.0.1 / HA Addon ZwaveJS2MQTT

Did you change anything?

no

If yes, what did you change?

No response

Did this work before?

Yes (please describe)

If yes, where did it work?

No response

Attach Driver Logfile

will provide later

zwave-js-bot commented 1 year ago

👋 Hey @Maes152!

Thanks for opening an issue! It doesn't look like you provided a logfile though. While not strictly necessary for every issue, having a driver log is required to diagnose most issues.

Please consider uploading a logfile that captures your problem.

niemyjski commented 1 year ago

Same. Same Controller as well... seems like nodes are constantly marked dead and then come right back

The node is alive.
2023-10-02T10:34:32.687Z CNTRLR   [Node 039] The node is ready to be used
2023-10-02T10:34:32.704Z CNTRLR « [Node 039] ping successful
2023-10-02T10:34:35.166Z CNTRLR   [Node 002] The node is alive.
2023-10-02T10:34:35.167Z CNTRLR   [Node 002] The node is ready to be used
2023-10-02T10:34:35.172Z CNTRLR « [Node 002] ping successful
2023-10-02T10:34:44.069Z CNTRLR   [Node 068] The node is alive.
2023-10-02T10:34:44.070Z CNTRLR   [Node 068] The node is ready to be used
2023-10-02T10:34:44.074Z CNTRLR « [Node 068] ping successful
2023-10-02T10:34:44.494Z CNTRLR   [Node 054] The node is alive.
2023-10-02T10:34:44.495Z CNTRLR « [Node 054] ping successful
2023-10-02T10:34:44.497Z CNTRLR   [Node 054] Interviewing Version...
2023-10-02T10:34:44.497Z CNTRLR » [Node 054]   querying the CC version for Version...
2023-10-02T10:35:14.922Z DRIVER     no handlers registered!
2023-10-02T10:35:18.251Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-02T10:35:21.550Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-02T10:36:19.922Z CNTRLR   The controller is unresponsive
2023-10-02T10:36:19.922Z DRIVER   Attempting to recover unresponsive controller...
2023-10-02T10:36:19.925Z CNTRLR   Performing soft reset...
2023-10-02T10:36:20.186Z CNTRLR   Waiting for the controller to reconnect...
2023-10-02T10:36:21.688Z CNTRLR   Re-opening serial port...
2023-10-02T10:36:22.692Z CNTRLR   Waiting for the Serial API to start...
2023-10-02T10:36:23.009Z CNTRLR   Serial API started
2023-10-02T10:36:23.009Z CNTRLR   The controller is no longer unresponsive
2023-10-02T10:36:31.274Z DRIVER     no handlers registered!
niemyjski commented 1 year ago

@Maes152 can you provide your logs. @AlCalzone this started after upgrade to 12

Maes152 commented 1 year ago

Unfortunately it hasn't happened since I turned on logging.... will provide the moment it does!

AlCalzone commented 1 year ago

Also make sure the logs are on loglevel debug. @niemyjski same for you.

niemyjski commented 1 year ago

@AlCalzone I just updated to latest release still same issues...

2023-10-04T13:52:14.908Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T13:52:14.909Z CNTRLR   [Node 078] [Meter] value[66049]: metadata updated                 [Endpoint 0]
2023-10-04T13:52:14.909Z CNTRLR   [Node 078] [~] [Meter] value[66049]: 0 => 0                       [Endpoint 0]
2023-10-04T13:52:14.910Z DRIVER « [Node 078] [REQ] [ApplicationCommand]
                                  └─[MeterCCReport]
                                      type:       Electric
                                      scale:      W
                                      rate type:  Consumed
                                      value:      0
                                      time delta: 0 seconds
2023-10-04T13:52:22.951Z SERIAL » 0x011100134d0a9880dccfb3f294cbfcf10542e5                            (19 bytes)
2023-10-04T13:52:22.952Z DRIVER » [Node 077] [REQ] [SendData]
                                  │ transmit options: 0x05
                                  │ callback id:      66
                                  └─[SecurityCCNonceReport]
                                      nonce: 0xdccfb3f294cbfcf1
2023-10-04T13:52:22.952Z DRIVER   Controller is still timing out. Restarting the driver...
Error in driver ZWaveError: Controller is still timing out. Restarting the driver... (ZW0100)
    at Driver.destroyWithMessage (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:2769:17)
    at fail (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:3484:14)
    at Driver.handleUnresponsiveController (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:3493:4)
    at Driver.handleFailedTransaction (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:5521:13)
    at Driver.drainTransactionQueue (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:4613:10) {
  code: 100,
  context: undefined,
  transactionSource: undefined
}
Shutting down
Closing server...
2023-10-04T13:52:22.959Z DRIVER   destroying driver instance...
2023-10-04T13:52:22.960Z CNTRLR   [Node 089] Assigning SUC return route failed: Timeout while waiting for a call
                                  back from the controller (ZW0200)
Client disconnected
Code 1000: 
Server closed
2023-10-04T13:52:23.037Z DRIVER   driver instance destroyed
[13:52:23] WARNING: Halt add-on
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped

I see this in my hard wired devices..

image

I just triggered a light in home assistant but nothing happens then I see this after a while in the logs. Not much in the debug logs... it just stops responding.. Like it's frozen.

23-10-04T13:55:21.171Z SERIAL » 0x011100134d0a98804a02cb9138f7691c05414e                            (19 bytes)
2023-10-04T13:55:21.171Z DRIVER » [Node 077] [REQ] [SendData]
                                  │ transmit options: 0x05
                                  │ callback id:      65
                                  └─[SecurityCCNonceReport]
                                      nonce: 0x4a02cb9138f7691c
2023-10-04T13:55:21.172Z CNTRLR   The controller is unresponsive
2023-10-04T13:55:21.172Z DRIVER   Attempting to recover unresponsive controller...
2023-10-04T13:55:21.203Z CNTRLR   Performing soft reset...
2023-10-04T13:55:21.204Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T13:55:21.205Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2023-10-04T13:55:21.205Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T13:55:21.206Z DRIVER « [RES] [SendData]
                                    was sent: true
2023-10-04T13:55:21.319Z SERIAL « 0x011800134100000e03ca7f7f7f7f0101031b5917000201000027              (26 bytes)
2023-10-04T13:55:21.319Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T13:55:21.320Z DRIVER « [REQ] [SendData]
                                    callback id:            65
                                    transmit status:        OK, took 140 ms
                                    repeater node IDs:      27, 89, 23
                                    routing attempts:       1
                                    protocol & route speed: Z-Wave, 40 kbit/s
                                    routing scheme:         LWR
                                    ACK RSSI:               -54 dBm
                                    ACK RSSI on repeaters:  N/A, N/A, N/A
                                    ACK channel no.:        1
                                    TX channel no.:         1
2023-10-04T13:55:21.322Z SERIAL » 0x01030008f4                                                         (5 bytes)
2023-10-04T13:55:21.323Z DRIVER » [REQ] [SoftReset]
2023-10-04T13:55:21.323Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T13:55:21.324Z CNTRLR   Waiting for the controller to reconnect...
2023-10-04T13:55:21.325Z DRIVER   all queues idle
2023-10-04T13:55:22.825Z CNTRLR   Re-opening serial port...
2023-10-04T13:55:23.827Z CNTRLR   Waiting for the Serial API to start...
2023-10-04T13:55:24.151Z SERIAL « 0x010a000a03000102010121de                                          (12 bytes)
2023-10-04T13:55:24.152Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T13:55:24.152Z DRIVER « [REQ] [SerialAPIStarted]
                                    wake up reason:        WatchdogReset
                                    watchdog enabled:      false
                                    generic device class:  0x02
                                    specific device class: 0x01
                                    always listening:      false
                                    supports Long Range:   false
2023-10-04T13:55:24.153Z CNTRLR   Serial API started
2023-10-04T13:55:24.153Z CNTRLR   The controller is no longer unresponsive
2023-10-04T13:55:24.153Z DRIVER   all queues busy
2023-10-04T13:55:24.154Z SERIAL » 0x010500512d40c6                                                     (7 bytes)
2023-10-04T13:55:24.155Z DRIVER » [Node 045] [REQ] [AssignSUCReturnRoute]
                                    payload: 0x2d40
2023-10-04T13:55:24.156Z SERIAL « 0x01320004004d2a9881fcf81884edfcae1f5dbbb72036e89987387b3fc8f9b33d2 (52 bytes)
                                  f9a27f82f4353a34a331200bc748804dfd4002f
2023-10-04T13:55:24.158Z DRIVER   Dropping message with invalid payload
2023-10-04T13:55:24.159Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCCommandEncapsulation] [INVALID]
                                      error: Nonce 0x4a expired, cannot decode security encapsulated command.
niemyjski commented 1 year ago

I also see a lot of this:

2023-10-04T13:57:23.751Z CNTRLR   [Node 089] The node is alive.
2023-10-04T13:57:23.751Z CNTRLR   [Node 089] The node is ready to be used
2023-10-04T13:57:23.751Z CNTRLR « [Node 089] ping successful
2023-10-04T13:57:23.752Z CNTRLR » [Node 089] Assigning SUC return route...
2023-10-04T13:57:23.752Z CNTRLR » [Node 089] Deleting SUC return route...
2023-10-04T13:57:23.753Z SERIAL » 0x01080013300100252bdb                                              (10 bytes)
2023-10-04T13:57:23.755Z DRIVER » [Node 048] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      43
                                  └─[NoOperationCC]
2023-10-04T13:57:23.757Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T13:57:23.763Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2023-10-04T13:57:23.763Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T13:57:23.763Z DRIVER « [RES] [SendData]
                                    was sent: true
2023-10-04T13:57:23.816Z SERIAL « 0x011800132b00000501ca7f7f7f7f0101040d000000020100001b              (26 bytes)
2023-10-04T13:57:23.816Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T13:57:23.816Z DRIVER « [REQ] [SendData]
                                    callback id:            43
                                    transmit status:        OK, took 50 ms
                                    repeater node IDs:      13
                                    routing attempts:       1
                                    protocol & route speed: Z-Wave, 40 kbit/s
                                    routing scheme:         NLWR
                                    ACK RSSI:               -54 dBm
                                    ACK RSSI on repeaters:  N/A
                                    ACK channel no.:        1
                                    TX channel no.:         1
2023-10-04T13:57:23.817Z CNTRLR   [Node 048] The node is alive.
2023-10-04T13:57:23.817Z CNTRLR   [Node 048] The node is ready to be used
2023-10-04T13:57:23.818Z CNTRLR « [Node 048] ping successful
2023-10-04T13:57:23.818Z CNTRLR » [Node 048] Assigning SUC return route...
2023-10-04T13:57:23.818Z CNTRLR » [Node 048] Deleting SUC return route...
2023-10-04T13:57:23.819Z SERIAL » 0x010800131f0100252cf3                                              (10 bytes)
2023-10-04T13:57:23.820Z DRIVER » [Node 031] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      44
                                  └─[NoOperationCC]
2023-10-04T13:57:23.821Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T13:57:23.827Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2023-10-04T13:57:23.827Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T13:57:23.828Z DRIVER « [RES] [SendData]
                                    was sent: true
2023-10-04T13:57:23.847Z SERIAL « 0x011800132c00000200d47f7f7f7f00000300000000030100000f              (26 bytes)

Just saw this too... I'm running

Z-Wave JS Current version: 0.1.94 (Changelog)

2023-10-04T13:57:55.140Z DRIVER « [RES] [DeleteSUCReturnRoute]
                                    was executed: true
Z-Wave JS no longer supports enabling error reporting. If you are using an application that integrates with Z-Wave JS and you receive this error, you may need to update the application.

The second time I triggered the same light (node 32) as mentioned previously and saw this...

2023-10-04T13:59:28.829Z CNTRLR   The controller is unresponsive
2023-10-04T13:59:28.830Z DRIVER   Attempting to recover unresponsive controller...
2023-10-04T13:59:28.857Z CNTRLR   Performing soft reset...
2023-10-04T13:59:28.859Z SERIAL » 0x01030008f4                                                         (5 bytes)
2023-10-04T13:59:28.860Z DRIVER » [REQ] [SoftReset]
2023-10-04T13:59:28.860Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T13:59:28.862Z CNTRLR   Waiting for the controller to reconnect...
2023-10-04T13:59:28.862Z DRIVER   all queues idle
2023-10-04T13:59:30.362Z CNTRLR   Re-opening serial port...
2023-10-04T13:59:31.364Z CNTRLR   Waiting for the Serial API to start...
2023-10-04T13:59:31.690Z SERIAL « 0x010a000a03000102010121de                                          (12 bytes)
2023-10-04T13:59:31.691Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T13:59:31.692Z DRIVER « [REQ] [SerialAPIStarted]
                                    wake up reason:        WatchdogReset
                                    watchdog enabled:      false
                                    generic device class:  0x02
                                    specific device class: 0x01
                                    always listening:      false
                                    supports Long Range:   false
2023-10-04T13:59:31.692Z CNTRLR   Serial API started
2023-10-04T13:59:31.692Z CNTRLR   The controller is no longer unresponsive
2023-10-04T13:59:31.693Z DRIVER   all queues busy
2023-10-04T13:59:31.694Z SERIAL » 0x010500513c3ea9                                                     (7 bytes)
2023-10-04T13:59:31.694Z DRIVER » [Node 060] [REQ] [AssignSUCReturnRoute]
                                    payload: 0x3c3e
2023-10-04T13:59:31.695Z SERIAL « [ACK]                           
niemyjski commented 1 year ago

@AlCalzone It kind of feels like the controller is dead locked / stuck on processing causing the queue to fill up and nodes to become unavailable probably due to health checks? Again I'm running

ZW090 by AEON Labs Firmware: 1.2

(Gen 5)

AlCalzone commented 1 year ago

Do you have longer logs? These fragments are really hard to parse.

niemyjski commented 1 year ago

@AlCalzone that's all I'm seeing is fragments. Do you want me to go to the silly log level?

Like I refreshed logs and saw no activity for a while (nothing was triggering any state changes). Then triggered 3 lights to turn off and this is all I saw in the logs (I've included extra with timestamps):

2023-10-04T15:04:43.552Z CNTRLR   [Node 042] [~] [Notification] Home Security[Motion sensor status] [Endpoint 0]
                                  : 8 => 0
2023-10-04T15:05:14.593Z SERIAL « 0x01120004004e0a3202a1640001e4600000d2000f                          (20 bytes)
2023-10-04T15:05:14.594Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:05:14.594Z CNTRLR   [Node 078] [Meter] value[66561]: metadata updated                 [Endpoint 0]
2023-10-04T15:05:14.595Z CNTRLR   [Node 078] [~] [Meter] value[66561]: 123.917 => 124               [Endpoint 0]
2023-10-04T15:05:14.595Z DRIVER « [Node 078] [REQ] [ApplicationCommand]
                                  └─[MeterCCReport]
                                      type:       Electric
                                      scale:      V
                                      rate type:  Consumed
                                      value:      124
                                      time delta: 0 seconds
2023-10-04T15:05:14.680Z SERIAL « 0x010a0004004d029840ca00ac                                          (12 bytes)
2023-10-04T15:05:14.681Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:05:14
2023-10-04T15:05:14.681Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:05:14.751Z SERIAL « 0x01120004004e0a3202a16c000000000000d20082                          (20 bytes)
2023-10-04T15:05:14.752Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:05:14.752Z CNTRLR   [Node 078] [Meter] value[66817]: metadata updated                 [Endpoint 0]
2023-10-04T15:05:14.753Z CNTRLR   [Node 078] [~] [Meter] value[66817]: 0 => 0                       [Endpoint 0]
2023-10-04T15:05:14.753Z DRIVER « [Node 078] [REQ] [ApplicationCommand]
                                  └─[MeterCCReport]
                                      type:       Electric
                                      scale:      A
                                      rate type:  Consumed
                                      value:      0
                                      time delta: 0 seconds
2023-10-04T15:05:14.893Z SERIAL « 0x01160004004e0e320221640015442e003c0015442ed20036                  (24 bytes)
2023-10-04T15:05:14.894Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:05:14.895Z CNTRLR   [Node 078] [Meter] value[65537]: metadata updated                 [Endpoint 0]
2023-10-04T15:05:14.896Z CNTRLR   [Node 078] [~] [Meter] value[65537]: 1393.71 => 1393.71           [Endpoint 0]
2023-10-04T15:05:14.897Z DRIVER « [Node 078] [REQ] [ApplicationCommand]
                                  └─[MeterCCReport]
                                      type:        Electric
                                      scale:       kWh
                                      rate type:   Consumed
                                      value:       1393.71
                                      time delta:  60 seconds
                                      prev. value: 1393.71
2023-10-04T15:05:14.992Z SERIAL « 0x01120004004e0a32022174000000000000d2001a                          (20 bytes)
2023-10-04T15:05:14.993Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:05:14.994Z CNTRLR   [Node 078] [Meter] value[66049]: metadata updated                 [Endpoint 0]
2023-10-04T15:05:14.994Z CNTRLR   [Node 078] [~] [Meter] value[66049]: 0 => 0                       [Endpoint 0]
2023-10-04T15:05:14.995Z DRIVER « [Node 078] [REQ] [ApplicationCommand]
                                  └─[MeterCCReport]
                                      type:       Electric
                                      scale:      W
                                      rate type:  Consumed
                                      value:      0
                                      time delta: 0 seconds
2023-10-04T15:05:17.805Z SERIAL « 0x010a0004004d029840cd00ab                                          (12 bytes)
2023-10-04T15:05:17.806Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:05:17.806Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:05:17.806Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-04T15:05:20.908Z SERIAL « 0x010a0004004d029840ca00ac                                          (12 bytes)
2023-10-04T15:05:20.910Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:05:20.911Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:05:20.912Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-04T15:05:48.538Z SERIAL » 0x011100134d0a9880a20fcb5bd88e9da80540b9                            (19 bytes)
2023-10-04T15:05:48.538Z DRIVER » [Node 077] [REQ] [SendData]
                                  │ transmit options: 0x05
                                  │ callback id:      64
                                  └─[SecurityCCNonceReport]
                                      nonce: 0xa20fcb5bd88e9da8
2023-10-04T15:05:48.539Z DRIVER   Controller is still timing out. Restarting the driver...
Error in driver ZWaveError: Controller is still timing out. Restarting the driver... (ZW0100)
    at Driver.destroyWithMessage (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:2769:17)
    at fail (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:3484:14)
    at Driver.handleUnresponsiveController (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:3493:4)
    at Driver.handleFailedTransaction (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:5521:13)
    at Driver.drainTransactionQueue (/usr/src/node_modules/zwave-js/src/lib/driver/Driver.ts:4613:10) {
  code: 100,
  context: undefined,
  transactionSource: undefined
}
Shutting down
Closing server...
2023-10-04T15:05:48.542Z DRIVER   destroying driver instance...
2023-10-04T15:05:48.542Z CNTRLR   [Node 089] Assigning SUC return route failed: Timeout while waiting for a call
                                  back from the controller (ZW0200)
Client disconnected
Code 1000: 
Server closed
2023-10-04T15:05:48.604Z DRIVER   driver instance destroyed
[15:05:48] WARNING: Halt add-on
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped

When it restarted I see this:

2023-10-04T15:08:17.578Z SERIAL « 0x010a0004004d029840d400b2                                          (12 bytes)
2023-10-04T15:08:17.579Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:17.579Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:08:17.580Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-04T15:08:20.754Z SERIAL « 0x010a0004004d029840d400b2                                          (12 bytes)
2023-10-04T15:08:20.755Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:20.755Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:08:20.755Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-04T15:08:23.955Z SERIAL « 0x010a0004004d029840d400b2                                          (12 bytes)
2023-10-04T15:08:23.957Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:23.958Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:08:23.959Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-04T15:08:39.119Z SERIAL » 0x011100134d0a988031e6d04a0061b02d054056                            (19 bytes)
2023-10-04T15:08:39.122Z DRIVER » [Node 077] [REQ] [SendData]
                                  │ transmit options: 0x05
                                  │ callback id:      64
                                  └─[SecurityCCNonceReport]
                                      nonce: 0x31e6d04a0061b02d
2023-10-04T15:08:39.124Z CNTRLR   The controller is unresponsive
2023-10-04T15:08:39.127Z DRIVER   Attempting to recover unresponsive controller...
2023-10-04T15:08:39.143Z CNTRLR   Performing soft reset...
2023-10-04T15:08:39.149Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T15:08:39.151Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2023-10-04T15:08:39.152Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:39.153Z DRIVER « [RES] [SendData]
                                    was sent: true
2023-10-04T15:08:39.285Z SERIAL « 0x011800134000001001d47f7f7f7f01010308000000010100007a              (26 bytes)
2023-10-04T15:08:39.286Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:39.286Z DRIVER « [REQ] [SendData]
                                    callback id:            64
                                    transmit status:        OK, took 160 ms
                                    repeater node IDs:      8
                                    routing attempts:       1
                                    protocol & route speed: Z-Wave, 9.6 kbit/s
                                    routing scheme:         LWR
                                    ACK RSSI:               -44 dBm
                                    ACK RSSI on repeaters:  N/A
                                    ACK channel no.:        1
                                    TX channel no.:         1
2023-10-04T15:08:39.287Z SERIAL » 0x01030008f4                                                         (5 bytes)
2023-10-04T15:08:39.288Z DRIVER » [REQ] [SoftReset]
2023-10-04T15:08:39.288Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T15:08:39.289Z CNTRLR   Waiting for the controller to reconnect...
2023-10-04T15:08:39.290Z DRIVER   all queues idle
2023-10-04T15:08:40.791Z CNTRLR   Re-opening serial port...
2023-10-04T15:08:41.792Z CNTRLR   Waiting for the Serial API to start...
2023-10-04T15:08:42.117Z SERIAL « 0x010a000a03000102010121de                                          (12 bytes)
2023-10-04T15:08:42.118Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:42.118Z DRIVER « [REQ] [SerialAPIStarted]
                                    wake up reason:        WatchdogReset
                                    watchdog enabled:      false
                                    generic device class:  0x02
                                    specific device class: 0x01
                                    always listening:      false
                                    supports Long Range:   false
2023-10-04T15:08:42.118Z CNTRLR   Serial API started
2023-10-04T15:08:42.118Z CNTRLR   The controller is no longer unresponsive
2023-10-04T15:08:42.119Z DRIVER   all queues busy
2023-10-04T15:08:42.119Z SERIAL » 0x01050051323fa6                                                     (7 bytes)
2023-10-04T15:08:42.119Z DRIVER » [Node 050] [REQ] [AssignSUCReturnRoute]
                                    payload: 0x323f
2023-10-04T15:08:42.120Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T15:08:50.381Z SERIAL « 0x0104015101aa                                                       (6 bytes)
2023-10-04T15:08:50.381Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:50.381Z DRIVER « [RES] [AssignSUCReturnRoute]
                                    was executed: true
2023-10-04T15:08:50.382Z SERIAL « 0x010500003f04c1                                                     (7 bytes)
2023-10-04T15:08:50.383Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:50.383Z DRIVER « [REQ] [undefined]
                                    payload: 0x3f04
2023-10-04T15:08:50.383Z DRIVER   handling request undefined (0)
2023-10-04T15:08:50.383Z DRIVER     no handlers registered!
2023-10-04T15:08:56.615Z SERIAL « 0x010a0004004d029840d400b2                                          (12 bytes)
2023-10-04T15:08:56.615Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:56.615Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:08:59.755Z SERIAL « 0x010a0004004d029840d400b2                                          (12 bytes)
2023-10-04T15:08:59.758Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:08:59.759Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:08:59.760Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-04T15:09:02.954Z SERIAL « 0x010a0004004d029840d400b2                                          (12 bytes)
2023-10-04T15:09:02.955Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T15:09:02.955Z DRIVER « [Node 077] [REQ] [ApplicationCommand]
                                  └─[SecurityCCNonceGet]
2023-10-04T15:09:02.955Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport

Is there a place where I can get the entire log file that doesn't tail the last x records?

niemyjski commented 1 year ago
2023-10-04T19:53:28.882Z SERIAL « 0x01050055380097                                                     (7 bytes)
2023-10-04T19:53:28.882Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:28.883Z DRIVER « [REQ] [DeleteSUCReturnRoute]
                                    callback id:     56
                                    transmit status: OK
2023-10-04T19:53:28.884Z SERIAL » 0x010500553c39aa                                                     (7 bytes)
2023-10-04T19:53:28.884Z DRIVER » [Node 060] [REQ] [DeleteSUCReturnRoute]
                                    payload: 0x3c39
2023-10-04T19:53:28.885Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T19:53:28.893Z SERIAL « 0x0104015501ae                                                       (6 bytes)
2023-10-04T19:53:28.893Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:28.893Z DRIVER « [RES] [DeleteSUCReturnRoute]
                                    was executed: true
2023-10-04T19:53:29.476Z SERIAL « 0x01050055390096                                                     (7 bytes)
2023-10-04T19:53:29.477Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:29.477Z DRIVER « [REQ] [DeleteSUCReturnRoute]
                                    callback id:     57
                                    transmit status: OK
2023-10-04T19:53:29.479Z SERIAL » 0x010500552e3abb                                                     (7 bytes)
2023-10-04T19:53:29.479Z DRIVER » [Node 046] [REQ] [DeleteSUCReturnRoute]
                                    payload: 0x2e3a
2023-10-04T19:53:29.482Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T19:53:29.488Z SERIAL « 0x0104015501ae                                                       (6 bytes)
2023-10-04T19:53:29.488Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:29.488Z DRIVER « [RES] [DeleteSUCReturnRoute]
                                    was executed: true
2023-10-04T19:53:29.913Z SERIAL « 0x010500553a0095                                                     (7 bytes)
2023-10-04T19:53:29.914Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:29.914Z DRIVER « [REQ] [DeleteSUCReturnRoute]
                                    callback id:     58
                                    transmit status: OK
2023-10-04T19:53:29.916Z SERIAL » 0x01050055593bcd                                                     (7 bytes)
2023-10-04T19:53:29.916Z DRIVER » [Node 089] [REQ] [DeleteSUCReturnRoute]
                                    payload: 0x593b
2023-10-04T19:53:29.917Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T19:53:29.923Z SERIAL « 0x0104015501ae                                                       (6 bytes)
2023-10-04T19:53:29.923Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:29.923Z DRIVER « [RES] [DeleteSUCReturnRoute]
                                    was executed: true
2023-10-04T19:53:30.218Z SERIAL « 0x010500553b0094                                                     (7 bytes)
2023-10-04T19:53:30.218Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:30.218Z DRIVER « [REQ] [DeleteSUCReturnRoute]
                                    callback id:     59
                                    transmit status: OK
2023-10-04T19:53:30.220Z SERIAL » 0x010a00134c03250100253c94                                          (12 bytes)
2023-10-04T19:53:30.220Z DRIVER » [Node 076] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      60
                                  └─[BinarySwitchCCSet]
                                      target value: false
2023-10-04T19:53:30.221Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T19:53:30.228Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2023-10-04T19:53:30.228Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:30.228Z DRIVER « [RES] [SendData]
                                    was sent: true
2023-10-04T19:53:30.332Z SERIAL « 0x011800133c00000a03ca7f7f7f7f01010312210d000201000035              (26 bytes)
2023-10-04T19:53:30.332Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:30.332Z DRIVER « [REQ] [SendData]
                                    callback id:            60
                                    transmit status:        OK, took 100 ms
                                    repeater node IDs:      18, 33, 13
                                    routing attempts:       1
                                    protocol & route speed: Z-Wave, 40 kbit/s
                                    routing scheme:         LWR
                                    ACK RSSI:               -54 dBm
                                    ACK RSSI on repeaters:  N/A, N/A, N/A
                                    ACK channel no.:        1
                                    TX channel no.:         1
2023-10-04T19:53:30.334Z CNTRLR   [Node 076] [~] [Binary Switch] currentValue: false => false       [Endpoint 0]
2023-10-04T19:53:30.336Z SERIAL » 0x010500552f3dbd                                                     (7 bytes)
2023-10-04T19:53:30.336Z DRIVER » [Node 047] [REQ] [DeleteSUCReturnRoute]
                                    payload: 0x2f3d
2023-10-04T19:53:30.337Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T19:53:38.427Z SERIAL « 0x0104015501ae                                                       (6 bytes)
2023-10-04T19:53:38.427Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:38.428Z DRIVER « [RES] [DeleteSUCReturnRoute]
                                    was executed: true
2023-10-04T19:53:38.547Z SERIAL « 0x011800553d010335007f7f7f7f7f0001060000000002030000c1              (26 bytes)
2023-10-04T19:53:38.548Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:38.548Z DRIVER « [REQ] [DeleteSUCReturnRoute]
                                    callback id:     61
                                    transmit status: NoAck
2023-10-04T19:53:38.550Z SERIAL » 0x01050051353ea0                                                     (7 bytes)
2023-10-04T19:53:38.551Z DRIVER » [Node 053] [REQ] [AssignSUCReturnRoute]
                                    payload: 0x353e
2023-10-04T19:53:38.552Z SERIAL « [ACK]                                                                   (0x06)
2023-10-04T19:53:46.941Z SERIAL « 0x0104015101aa                                                       (6 bytes)
2023-10-04T19:53:46.942Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:46.942Z DRIVER « [RES] [AssignSUCReturnRoute]
                                    was executed: true
2023-10-04T19:53:46.944Z SERIAL « 0x010500553e0495                                                     (7 bytes)
2023-10-04T19:53:46.944Z SERIAL » [ACK]                                                                   (0x06)
2023-10-04T19:53:46.944Z DRIVER « [REQ] [DeleteSUCReturnRoute]
                                    callback id:     62
                                    transmit status: NoRoute
2023-10-04T19:53:46.945Z DRIVER   handling request DeleteSUCReturnRoute (85)
2023-10-04T19:53:46.946Z DRIVER     no handlers registered!
AlCalzone commented 1 year ago

Ok that last one is a firmware bug in the stick that has gotten unnoticed until very recently --> https://github.com/zwave-js/node-zwave-js/issues/6363 Seems to be related to your other log snippets though, so maybe that's it already.

niemyjski commented 1 year ago

Thanks! I sure hope so :D Spouse Acceptance Factor took a massive hit this past week.

mdeneen commented 1 year ago

I wanted to mention that this also affects me. Let me know if there's any logs I can provide to help -- the device in question is kind of odd, but it did work with version 11.14.2.

Sigma Designs (Former Zensys) HUSBZB-1

AlCalzone commented 1 year ago

That's good info, so it seems to be a general issue with the 500 series controller SDK, not limited to the Aeotec controller.

AlCalzone commented 1 year ago

fixed in https://github.com/zwave-js/node-zwave-js/pull/6370

mdeneen commented 1 year ago

fixed in #6370

I can confirm that this is working with my 500 series controller.

niemyjski commented 1 year ago

Awesome thanks! I'll be able to confirm once https://github.com/home-assistant/addons/pull/3245 is merged. It's worth noting that the usb stick went offline yesterday and I had to unplug it to get it to show back up as a usb device.

niemyjski commented 1 year ago

I just upgraded network is going through a lot of discovery it appears.

Seeing this so far upon first boot, is this normal?:

2023-10-06T10:30:50.850Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:30:51.474Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:30:52.352Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:30:52.679Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:30:52.851Z DRIVER   Dropping message with invalid payload
2023-10-06T10:30:52.902Z DRIVER   Dropping message with invalid payload
2023-10-06T10:30:52.957Z DRIVER   Dropping message with invalid payload
2023-10-06T10:30:55.837Z CNTRLR   [Node 002] The node is alive.
2023-10-06T10:30:55.837Z CNTRLR   [Node 002] The node is ready to be used
2023-10-06T10:30:55.840Z CNTRLR « [Node 002] ping successful
2023-10-06T10:30:58.292Z CNTRLR   [Node 037] The node is alive.
2023-10-06T10:30:58.292Z CNTRLR   [Node 037] The node is ready to be used
2023-10-06T10:30:58.294Z CNTRLR « [Node 037] ping successful
2023-10-06T10:30:58.836Z DRIVER   Dropping message with invalid payload
2023-10-06T10:30:58.908Z DRIVER   Dropping message with invalid payload
2023-10-06T10:30:58.974Z DRIVER   Dropping message with invalid payload
2023-10-06T10:30:59.168Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:08.347Z CNTRLR   [Node 068] The node is alive.
2023-10-06T10:31:08.347Z CNTRLR   [Node 068] The node is ready to be used
2023-10-06T10:31:08.348Z CNTRLR « [Node 068] ping successful
2023-10-06T10:31:09.968Z CNTRLR   [Node 073] The node is alive.
2023-10-06T10:31:09.970Z CNTRLR   [Node 073] The node is ready to be used
2023-10-06T10:31:09.972Z CNTRLR « [Node 073] ping successful
2023-10-06T10:31:10.234Z CNTRLR   [Node 054] The node is alive.
2023-10-06T10:31:10.234Z CNTRLR « [Node 054] ping successful
2023-10-06T10:31:10.235Z CNTRLR » [Node 054] querying node info...
New client
2023-10-06T10:31:52.215Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:52.449Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:55.933Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.074Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.125Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.365Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.382Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:57.393Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.415Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.432Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:57.449Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.464Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:57.488Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.574Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.678Z CNTRLR   [Node 068] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:31:57.883Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:57.966Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:58.032Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:58.265Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:58.456Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:58.577Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:59.244Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:59.310Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:59.377Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:59.915Z DRIVER   Dropping message with invalid payload
2023-10-06T10:31:59.985Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.065Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.105Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.119Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.136Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.171Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.203Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.259Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.332Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.347Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:00.547Z DRIVER   Dropping message with invalid payload
2023-10-06T10:32:10.203Z CNTRLR   [Node 060] Assigning SUC return route failed: Invalid callback received
2023-10-06T10:32:18.611Z CNTRLR   [Node 050] Route resolution failed
2023-10-06T10:32:18.614Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                                  n 100 ms.
2023-10-06T10:32:27.089Z CNTRLR   [Node 089] Route resolution failed
2023-10-06T10:32:27.837Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:32:27.950Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:32:28.052Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:32:38.424Z CNTRLR   [Node 045] Route resolution failed
2023-10-06T10:32:46.591Z CNTRLR   [Node 051] Route resolution failed
2023-10-06T10:32:46.595Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                                  n 100 ms.
2023-10-06T10:32:46.696Z CNTRLR   [Node 077] in the process of replying to a NonceGet, won't send another NonceR
                                  eport
2023-10-06T10:32:55.048Z CNTRLR   [Node 053] Route resolution failed
2023-10-06T10:32:55.052Z CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                                  n 100 ms.
2023-10-06T10:33:03.500Z CNTRLR   [Node 046] Route resolution failed
niemyjski commented 1 year ago

@AlCalzone seems to be working great so far! Thanks!

CheBuzz commented 1 year ago

@AlCalzone I am still experiencing the problem on 12.0.4. Here's my controller information:

2023-10-13T01:53:24.566Z DRIVER   version 12.0.4
2023-10-13T01:53:24.567Z DRIVER   
2023-10-13T01:53:25.659Z CONFIG   version 12.0.4
2023-10-13T01:53:27.758Z CNTRLR   querying Serial API capabilities...
2023-10-13T01:53:28.291Z CNTRLR   received API capabilities:
                                    firmware version:    3.7
                                    manufacturer ID:     0x86
                                    product type:        0x02
                                    product ID:          0x01
                                    supported functions: 
                                    · GetSerialApiInitData (0x02)
                                    · SetApplicationNodeInformation (0x03)
                                    · ApplicationCommand (0x04)
                                    · GetControllerCapabilities (0x05)
                                    · SetSerialApiTimeouts (0x06)
                                    · GetSerialApiCapabilities (0x07)
                                    · SoftReset (0x08)
                                    · SetRFReceiveMode (0x10)
                                    · FUNC_ID_ZW_SEND_NODE_INFORMATION (0x12)
                                    · SendData (0x13)
                                    · SendDataMulticast (0x14)
                                    · GetControllerVersion (0x15)
                                    · SendDataAbort (0x16)
                                    · FUNC_ID_ZW_R_F_POWER_LEVEL_SET (0x17)
                                    · UNKNOWN_FUNC_SEND_DATA_META (0x18)
                                    · FUNC_ID_ZW_GET_RANDOM (0x1c)
                                    · GetControllerId (0x20)
                                    · UNKNOWN_FUNC_MEMORY_GET_BYTE (0x21)
                                    · UNKNOWN_FUNC_MEMORY_PUT_BYTE (0x22)
                                    · UNKNOWN_FUNC_MEMORY_GET_BUFFER (0x23)
                                    · UNKNOWN_FUNC_MEMORY_PUT_BUFFER (0x24)
                                    · GetNodeProtocolInfo (0x41)
                                    · HardReset (0x42)
                                    · FUNC_ID_ZW_REPLICATION_COMMAND_COMPLETE (0x44)
                                    · FUNC_ID_ZW_REPLICATION_SEND_DATA (0x45)
                                    · AssignReturnRoute (0x46)
                                    · DeleteReturnRoute (0x47)
                                    · RequestNodeNeighborUpdate (0x48)
                                    · ApplicationUpdateRequest (0x49)
                                    · AddNodeToNetwork (0x4a)
                                    · RemoveNodeFromNetwork (0x4b)
                                    · FUNC_ID_ZW_CONTROLLER_CHANGE (0x4d)
                                    · FUNC_ID_ZW_SET_LEARN_MODE (0x50)
                                    · AssignSUCReturnRoute (0x51)
                                    · FUNC_ID_ZW_ENABLE_SUC (0x52)
                                    · FUNC_ID_ZW_REQUEST_NETWORK_UPDATE (0x53)
                                    · SetSUCNodeId (0x54)
                                    · DeleteSUCReturnRoute (0x55)
                                    · GetSUCNodeId (0x56)
                                    · UNKNOWN_FUNC_SEND_SUC_ID (0x57)
                                    · FUNC_ID_ZW_REQUEST_NODE_NEIGHBOR_UPDATE_OPTIONS (0x5a)
                                    · RequestNodeInfo (0x60)
                                    · RemoveFailedNode (0x61)
                                    · IsFailedNode (0x62)
                                    · ReplaceFailedNode (0x63)
                                    · GetRoutingInfo (0x80)
                                    · UNKNOWN_FUNC_LOCK_ROUTE_RESPONSE (0x90)
2023-10-13T01:53:28.296Z CNTRLR   querying version info...
2023-10-13T01:53:28.328Z CNTRLR   received version info:
                                    controller type: Static Controller
                                    library version: Z-Wave 2.78
2023-10-13T01:53:28.589Z CNTRLR   supported Z-Wave features: 
2023-10-13T01:53:28.589Z CNTRLR   querying controller capabilities...
2023-10-13T01:53:28.611Z CNTRLR   received controller capabilities:
                                    controller role:      primary
                                    is the SUC:           true
                                    started this network: true
                                    SIS is present:       true
                                    was real primary:     true
2023-10-13T01:53:28.611Z CNTRLR   finding SUC...
2023-10-13T01:53:28.631Z CNTRLR   This is the SUC
2023-10-13T01:53:28.632Z CNTRLR   querying additional controller information...
2023-10-13T01:53:28.696Z CNTRLR   received additional controller information:
                                    Z-Wave API version:         5 (legacy)
                                    Z-Wave chip type:           ZW0301
                                    node type                   Controller
                                    controller role:            primary
                                    controller is the SIS:      true
                                    controller supports timers: false

And here is the relevant line in debug logs where it's failing:

New client
2023-10-13T01:56:47.642Z SERIAL » 0x01030016ea                                                         (5 bytes)
2023-10-13T01:56:47.654Z DRIVER » [REQ] [SendDataAbort]
2023-10-13T01:56:47.662Z SERIAL « [ACK]                                                                   (0x06)
2023-10-13T01:56:47.874Z SERIAL « 0x010500130801e0                                                     (7 bytes)
2023-10-13T01:56:47.878Z SERIAL » [ACK]                                                                   (0x06)
2023-10-13T01:56:47.882Z DRIVER « [REQ] [SendData]
                                    callback id:     8
                                    transmit status: NoAck
2023-10-13T01:56:47.886Z DRIVER   handling request SendData (19)
2023-10-13T01:56:47.888Z DRIVER     no handlers registered!
2023-10-13T01:56:48.163Z CNTRLR   The controller is unresponsive
2023-10-13T01:56:48.169Z DRIVER   Attempting to recover unresponsive controller...
2023-10-13T01:56:48.275Z CNTRLR   Performing soft reset...
2023-10-13T01:56:48.283Z SERIAL » 0x01030008f4                                                         (5 bytes)
2023-10-13T01:56:48.283Z DRIVER » [REQ] [SoftReset]
2023-10-13T01:56:48.286Z SERIAL « [ACK]                                                                   (0x06)

...some time later...

2023-10-13T01:57:59.909Z SERIAL » 0x01030016ea                                                         (5 bytes)
2023-10-13T01:57:59.912Z DRIVER » [REQ] [SendDataAbort]
2023-10-13T01:57:59.919Z SERIAL « [ACK]                                                                   (0x06)
2023-10-13T01:58:00.419Z DRIVER   Controller is still timing out. Restarting the driver...
AlCalzone commented 1 year ago

Any chance you can capture what's happening before the first SendDataAbort? Your case seems to be slightly different, as the controller does actually reply with the missing callback after the abort - other examples where the controller was non-responsive didn't have this.

CheBuzz commented 1 year ago

Sorry, I tried to capture everything that looked relevant. Here's the full log: https://pastebin.com/iCt5fQ3E

AlCalzone commented 1 year ago

Ok this looks like the other issues I've seen and Z-Wave JS's behavior is intended, but it is not correct I'm afraid. I'm tracking this further in https://github.com/zwave-js/node-zwave-js/issues/6402

Maes152 commented 11 months ago

Apologies @AlCalzone for commenting here again after such a long time but one of my nodes is now repeatedly showing this behavior again. I'm running the latest version: zwave-js-ui: 9.3.2 zwave-js: 12.3.0 home id: 4080330103

Just want to reiterate that this node has been running fine for years and is quite close to my controller as well as many other nodes. The node will often go to unavailable and will revive almost instantaneously when I ping the node. This time I came prepared and have logging available below.

Entire log available here

2023-11-14T21:46:16.030Z DRIVER « [REQ] [SendData]
                                    callback id:            60
                                    transmit status:        NoAck, took 8490 ms
                                    routing attempts:       11
                                    protocol & route speed: Z-Wave, 40 kbit/s
                                    routing scheme:         Explorer Frame
                                    TX channel no.:         1
2023-11-14T21:46:16.042Z SERIAL » 0x010a00130203250100253cda                                          (12 bytes)
2023-11-14T21:46:16.043Z DRIVER » [Node 002] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      60
                                  └─[BinarySwitchCCSet]
                                      target value: false
2023-11-14T21:46:16.046Z SERIAL « [ACK]                                                                   (0x06)
2023-11-14T21:46:16.051Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2023-11-14T21:46:16.052Z SERIAL » [ACK]                                                                   (0x06)
2023-11-14T21:46:16.054Z DRIVER « [RES] [SendData]
                                    was sent: true`

`2023-11-14T21:46:21.989Z SERIAL « 0x011800133c010251007f7f7f7f7f0001070000000002070000e6              (26 bytes)
2023-11-14T21:46:21.991Z SERIAL » [ACK]                                                                   (0x06)
2023-11-14T21:46:21.993Z DRIVER « [REQ] [SendData]
                                    callback id:            60
                                    transmit status:        NoAck, took 5930 ms
                                    routing attempts:       7
                                    protocol & route speed: Z-Wave, 40 kbit/s
                                    routing scheme:         Explorer Frame
                                    TX channel no.:         1
2023-11-14T21:46:21.999Z SERIAL » 0x010a00130203250100253cda                                          (12 bytes)
2023-11-14T21:46:22.001Z DRIVER » [Node 002] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      60
                                  └─[BinarySwitchCCSet]
                                      target value: false
2023-11-14T21:46:22.003Z SERIAL « [ACK]                                                                   (0x06)
2023-11-14T21:46:27.429Z SERIAL « 0x0104011301e8                                                       (6 bytes)
2023-11-14T21:46:27.429Z SERIAL » [ACK]                                                                   (0x06)
2023-11-14T21:46:27.430Z DRIVER « [RES] [SendData]
                                    was sent: true
2023-11-14T21:46:27.444Z SERIAL « 0x010b0004000b03260300bb0066                                        (13 bytes)
2023-11-14T21:46:27.446Z SERIAL » [ACK]                                                                   (0x06)
2023-11-14T21:46:27.447Z CNTRLR   [Node 011] [~] [Multilevel Switch] currentValue: 0 => 0           [Endpoint 0]
2023-11-14T21:46:27.449Z DRIVER « [Node 011] [REQ] [ApplicationCommand]`

`2023-11-14T21:46:30.145Z CNTRLR   [Node 002] The node did not respond after 3 attempts, it is presumed dead
2023-11-14T21:46:30.155Z CNTRLR   [Node 002] The node is now dead.
2023-11-14T21:46:30.166Z SERIAL » 0x010e00131907600d0002250100253daf                                  (16 bytes)
2023-11-14T21:46:30.167Z DRIVER » [Node 025] [REQ] [SendData]
                                  │ transmit options: 0x25
                                  │ callback id:      61
                                  └─[MultiChannelCCCommandEncapsulation]
                                    │ source:      0
                                    │ destination: 2
                                    └─[BinarySwitchCCSet]
                                        target value: false
AlCalzone commented 11 months ago

@Maes152 have you read and followed https://zwave-js.github.io/node-zwave-js/#/troubleshooting/connectivity-issues?id=general-troubleshooting and https://zwave-js.github.io/node-zwave-js/#/troubleshooting/network-health yet?

There are several signs of bad connectivity in that log:

All in all, that data hints at a barely working mesh.

Maes152 commented 11 months ago

@AlCalzone Thank you for reviewing the logs and providing me with this feedback. I am of course anxious to do whatever is necessary to improve the health of the mesh.

A while back you have already advised me to use an extension cord because I was having an issue with 2 (new) nodes. This was solved after using the extension cord. Other than that I am not sure what to do or say. Node 2 (as the number implies) is my oldest node (ZW096). I've had it since I started using Zwave in 2016. During all that time the node and the controller have not moved location physically, but of course many new nodes have been added. I've performed multiple node and network rebuild routes. The behavior doesn't seem to change. While I obviously trust your analysis and judgement all I can see is that the node becoming unavailable is a very recent development. Has ZwaveJS become more strict in designating a node as unavailable? That would explain why this is becoming an issue now, even though my mesh may have been bad for years.

I've thoroughly read your documentation and honestly do not believe there are many (relevant) things I am doing wrong.

  1. My zwave controller is using a very long extension cord and is stuck to a wall, very high up (about 0.8m from the ceiling).
  2. The zwave controller is a ZW090 Aeotec Gen5. I've been thinking for a while I should upgrade to a newer version, would that solve any of my issues?
  3. I have a single sensor node that is using S0 security -- this is something I will fix.
  4. Node 2 is in the living room, literally surrounded by 9 other nodes none of which are on battery.

What, in your expert opinion, could I do next to remedy my situation? Rebuild the entire network from scratch? Replace the controller?

Thanks in advance!

PS: Not sure if helpful but here is a picture of the topology. The 2 red nodes are temporarily out-of-use: image

AlCalzone commented 11 months ago

Looking at that graph, I'm inclined to say the position of your controller may still not be ideal. Most of the nodes are not connected with 100 kbit/s. If changing the location does not help, you may want to look into setting priority routes. Those are routes that are always attempted first when sending a command to a device. It makes the most sense when forcing a direct connection at high speed. You can do that on the network map, select a node where you are relatively sure it should have a direct connection, select no repeaters and 100 kbit/s speed, then save the priority route and ping the device to confirm. If that helps, it may indirectly help your further-away devices too.

ZW090 Aeotec Gen5. I've been thinking for a while I should upgrade to a newer version, would that solve any of my issues?

Honestly, I'd currently not upgrade. The 700 series is a shitshow, especially in Europe, and upgrading to 800 series requires you to unpair and re-pair all your devices, at least for now.

niemyjski commented 11 months ago

Also, how does one set priority routes via ha?

AlCalzone commented 11 months ago

By using the Z-Wave JS UI addon ;)