OpenZWave / qt-openzwave

QT5 Wrapper for OpenZWave
GNU Lesser General Public License v3.0
105 stars 30 forks source link

`AddNode` failing with timeout #160

Closed michaelolson closed 3 years ago

michaelolson commented 3 years ago

I'm running an Aeotec Z-Stick with the latest qt-openzwave-allinone Docker image.

I can no longer add any nodes to my network. I get two types of likely related errors. In both cases, this is after selecting Add Node and then selecting that I do want to include the device with encryption.

In the first type of error, the ozw-admin, run via VNC over the web client or remotely connecting VNC, hangs at Waiting for the Controller to Enter AddNodeMode and never returns. The logs from that situation look like this:

Event List:

21:03:32 ZWave Controller Command Executed: Ctrl_Cmd_AddNode State: Ctrl_State_Starting Error: Ctrl_Error_None
21:03:42 ZWave Network Notification Received: Notification_Code_MsgTimeout
21:03:42 ZWave Controller Command Executed: Ctrl_Cmd_AddNode State: Ctrl_State_Error Error: Ctrl_Error_Failed

OZW Logs (no seconds, number after time is the node):

21:03:??   0 Queuing (Controller) Add Device
21:03:??   0 Add Device
21:03:?? 255 Queuing (Command) ControllerCommand_AddDevice: 0x01, 0x05, 0x00, 0x4a, 0xc1, 0x8c, 0xfd
21:03:??   0 Notification: AddDevice ControllerCommand - Starting
21:03:?? 255 Sending (Command) message (Callback ID=0x8c, Expected Reply=0x4a) - ControllerCommand_AddDevice: 0x01, 0x05, 0x00, 0x4a, 0xc1, 0x8c, 0xfd
21:03:?? 255 Encrypted Flag is 0
21:03:?? 255 ERROR: Dropping command, expected response not received after 1 attempt(s). Command: "ControllerCommand_AddDevice: 0x01, 0x05, 0x00, 0x4a, 0xc1, 0x8c, 0xfd"
21:03:?? 255 Removing current message
21:03:?? 255 Notification: Notification - TimeOut
21:03:??   0 Notification: ControllerCommand - Error - Failed

In the second type of error, the ozw-admin initiates the AddNode command and almost immediately returns saying that the AddNode command is done, though no nodes have been added and none are expected to be in a state for network entry. The logs from that situation look like this:

Event List:

21:21:12 ZWave Controller Command Executed: Ctrl_Cmd_AddNode State: Ctrl_State_Starting Error: Ctrl_Error_None
21:21:21 ZWave Controller Command Executed: Ctrl_Cmd_AddNode State: Ctrl_State_Failed Error: Ctrl_Error_None
21:21:22 ZWave Controller Command Executed: Ctrl_Cmd_AddNode State: Ctrl_State_Completed Error: Ctrl_Error_None

OZW Logs:

21:21:??   0 Queuing (Controller) Add Device
21:21:??   0 Add Device
21:21:?? 255 Queuing (Command) ControllerCommand_AddDevice: 0x01, 0x05, 0x00, 0x4a, 0xc1, 0x7a, 0x0b
21:21:??   0 Notification: AddDevice ControllerCommand - Starting
21:21:?? 255 Sending (Command) message (Callback ID=0x7a, Expected Reply=0x4a) - 21:21:?? 255 ControllerCommand_AddDevice: 0x01, 0x05, 0x00, 0x4a, 0xc1, 0x7a, 0x0b
21:21:?? 255 Encrypted Flag is 0
21:21:?? 255   Received: 0x01, 0x07, 0x00, 0x4a, 0x7a, 0x07, 0x00, 0x00, 0xcf
21:21:?? 255 FUNC_ID_ZW_ADD_NODE_TO_NETWORK:
21:21:?? 255 ADD_NODE_STATUS_FAILED
21:21:?? 255 Removing current message
21:21:?? 255 Queuing (Command) Add Node Stop: 0x01, 0x04, 0x00, 0x4a, 0x05, 0xb4
21:21:??   0 Notification: AddDevice ControllerCommand - Failed
21:21:?? 255 Sending (Command) message (Callback ID=0x00, Expected Reply=0x4a) - Add 21:21:?? 255 Node Stop: 0x01, 0x04, 0x00, 0x4a, 0x05, 0xb4
21:21:?? 255 Encrypted Flag is 0
21:21:?? 255   Received: 0x01, 0x07, 0x00, 0x4a, 0x7a, 0x06, 0x00, 0x00, 0xce
21:21:?? 255 FUNC_ID_ZW_ADD_NODE_TO_NETWORK:
21:21:?? 255   Expected reply was received
21:21:?? 255   Message transaction complete
21:21:?? 255 Removing current message
21:21:??   0 Notification: AddDevice ControllerCommand - Completed  

In both cases, it doesn't seem that I am in a situation where I can add any new nodes to my network. Everything else appears to work fine. I can receive updates, I can change the state of devices in the network, I just can't add new nodes to the network. I have tried removing and re-adding the Docker image, I've tried rebooting the host Raspberry Pi device, I've tried disconnecting the Aeotec Z-Stick (which, admittedly, has a battery and I'm not sure how to fully turn off). Nothing seems to alter the types of errors I observe.

Is there any way to determine what's going on here and how I can make forward progress?

michaelolson commented 3 years ago

Does anyone have any ideas here? I'm completely blocked from adding new nodes to the network by the above issue.

kpine commented 3 years ago

You could try using the Windows PC Controller software and see if it works. That could help determine if it's a problem with openzwave or the controller hardware.

Are you using a Raspberry Pi 3 or 4?

michaelolson commented 3 years ago

I get the same results using the built in controller over VNC or the controller software on Windows. I tried both prior to filing, and the error messages are copied from the Windows controller software where it's far easier to copy/paste the error messages.

This is on a Raspberry Pi 3. I can't get the Aeotec Z-Stick to behave correctly on a 4, which seems to be a known issue.

kpine commented 3 years ago

I mean the Silicon Labs PC Controller software, not ozw-admin. The idea would be to eliminate OZW as the problem. It's available here, you have to create an account and agree to their terms to download it.

Yes, the Pi 4 and older Aeotec sticks is a problem, which is why I asked. Pi 3 is fine.

michaelolson commented 3 years ago

Ahh, I misunderstood. Thanks for the clarification. I'll try that out and report back.

michaelolson commented 3 years ago

I'm not sure what's different about the Silicon Labs PC Controller software, but I was able to use it to execute commands against the network. I removed some stuck failed nodes, refreshed the remaining nodes, put the controller back in the Raspberry Pi, restarted the container, and things started working again.