openhab / openhab1-addons

Add-ons for openHAB 1.x
Eclipse Public License 2.0
3.43k stars 1.7k forks source link

Z-Wave Binding - Communication Breakdown - Protocol error (OOF) #3288

Closed DivineThreepwood closed 8 years ago

DivineThreepwood commented 8 years ago

Hey,

I have a strange issue with two Openhab + Zwave setups where the zwave networks crashes after around 1 - 42 hours. These setups are independent but showing the same behavior. The zwave log prints some protocol errors and afterwards the communication breaks down and the log continues printing:

2015-10-12 15:57:29.047 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:315]- Network Monitor: Queue length is 2 - deferring network monitor functions.
2015-10-12 15:57:30.669 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:

Hardware and Versions

Z-Wave Controller: Z-Wave Plus USB Stick (ZME_UZB) Openhab Version: 1.6.2 Habmin Version: 1.7.0 Z-Wave Binding Version: 1.8.0

Log 1 (Setup 1)

2015-10-12 15:55:27.816 DEBUG o.o.b.z.i.p.s.DeleteReturnRouteMessageClass[:44]- NODE 22: Got DeleteReturnRoute response.
2015-10-12 15:55:27.816 ERROR o.o.b.z.i.p.s.DeleteReturnRouteMessageClass[:48]- NODE 22: DeleteReturnRoute command failed.
2015-10-12 15:55:27.816 DEBUG o.o.b.z.i.p.ZWaveController[:595]- Notifying event listeners: ZWaveNetworkEvent
2015-10-12 15:55:27.816 DEBUG o.o.b.z.i.ZWaveActiveBinding[:433]- ZwaveIncomingEvent
2015-10-12 15:55:30.643 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1278]- NODE 255: Too many retries. Discarding message: Message: class = DeleteReturnRoute (0x47), type = Request (0x00), payload = 16
2015-10-12 15:55:30.643 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1171]- Took message from queue for sending. Queue length = 0
2015-10-12 15:55:30.644 DEBUG o.o.b.z.i.p.SerialMessage[:233]- Assembled message buffer = 01 06 00 46 16 01 4D E5
2015-10-12 15:55:30.644 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1228]- NODE 255: Sending REQUEST Message = 01 06 00 46 16 01 4D E5
2015-10-12 15:55:33.040 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1441]- Receive Message = 01 04 01 46 00 BC
2015-10-12 15:55:33.043 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1365]- Receive queue ADD: Length=1
2015-10-12 15:55:33.043 DEBUG o.o.b.z.i.p.ZWaveController[:1123]- Receive queue TAKE: Length=0
2015-10-12 15:55:33.090 DEBUG o.o.b.z.i.p.SerialMessage[:233]- Assembled message buffer = 01 04 01 46 00 BC
2015-10-12 15:55:33.091 DEBUG o.o.b.z.i.p.ZWaveController[:1124]- Process Message = 01 04 01 46 00 BC
2015-10-12 15:55:33.091 DEBUG o.o.b.z.i.p.ZWaveController[:190]- Message: class = AssignReturnRoute (0x46), type = Response (0x01), payload = 00
2015-10-12 15:55:33.091 DEBUG o.o.b.z.i.p.s.AssignReturnRouteMessageClass[:43]- NODE 22: Got AssignReturnRoute response.
2015-10-12 15:55:33.092 ERROR o.o.b.z.i.p.s.AssignReturnRouteMessageClass[:48]- NODE 22: AssignReturnRoute command failed.
2015-10-12 15:55:33.092 DEBUG o.o.b.z.i.p.ZWaveController[:595]- Notifying event listeners: ZWaveNetworkEvent
2015-10-12 15:55:33.092 DEBUG o.o.b.z.i.ZWaveActiveBinding[:433]- ZwaveIncomingEvent
2015-10-12 15:55:33.092 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:645]- NODE 22: Network heal received FAILURE event
2015-10-12 15:55:33.093 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:432]- NODE 22: Node is asleep. Defer heal until it's awake
2015-10-12 15:55:33.696 WARN  o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1468]- Protocol error (OOF). Got 0x05. Sending NAK.
2015-10-12 15:55:33.700 WARN  o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1468]- Protocol error (OOF). Got 0x00. Sending NAK.
2015-10-12 15:55:33.704 WARN  o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1468]- Protocol error (OOF). Got 0x47. Sending NAK.
2015-10-12 15:55:33.707 WARN  o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1468]- Protocol error (OOF). Got 0x02. Sending NAK.
2015-10-12 15:55:35.647 ERROR o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1269]- NODE 255: Timeout while sending message. Requeueing - 2 attempts left!
2015-10-12 15:55:35.647 DEBUG o.o.b.z.i.p.ZWaveController[:580]- Enqueueing message. Queue length = 1
2015-10-12 15:55:40.648 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:55:45.649 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:55:50.650 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:55:55.651 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:55:56.153 DEBUG o.o.b.z.i.p.i.ZWaveNodeStageAdvancer[:1072]- NODE 47: Stage DETAILS. Initialisation retry timer triggered. Increased to 1800000
2015-10-12 15:55:56.154 DEBUG o.o.b.z.i.p.i.ZWaveNodeStageAdvancer[:267]- NODE 47: Node advancer - DETAILS: queue length(0), free to send(false)
2015-10-12 15:55:56.154 DEBUG o.o.b.z.i.p.i.ZWaveNodeStageAdvancer[:1104]- NODE 47: Initialisation retry timer started 1800000
2015-10-12 15:55:56.155 ERROR o.o.b.z.i.p.i.ZWaveNodeStageAdvancer[:314]- NODE 47: Node advancer: Retries exceeded at DETAILS
2015-10-12 15:55:56.155 DEBUG o.o.b.z.i.p.i.ZWaveNodeStageAdvancer[:327]- NODE 47: Retry timout: Can't advance
2015-10-12 15:56:00.652 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:56:05.653 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:56:10.654 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:56:15.655 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:57:15.666 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:57:20.667 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:57:24.045 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:341]- NODE 43: Sending periodic PING.
2015-10-12 15:57:24.045 DEBUG o.o.b.z.i.p.c.ZWaveNoOperationCommandClass[:72]- NODE 43: Creating new message for command No Operation
2015-10-12 15:57:24.046 DEBUG o.o.b.z.i.p.SerialMessage[:109]- NODE 43: Creating empty message of class = SendData (0x13), type = Request (0x00)
2015-10-12 15:57:24.046 DEBUG o.o.b.z.i.p.ZWaveController[:898]- Callback ID = 79
2015-10-12 15:57:24.046 DEBUG o.o.b.z.i.p.ZWaveController[:580]- Enqueueing message. Queue length = 2
2015-10-12 15:57:25.668 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:57:29.047 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:315]- Network Monitor: Queue length is 2 - deferring network monitor functions.
2015-10-12 15:57:30.669 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:57:34.048 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:315]- Network Monitor: Queue length is 2 - deferring network monitor functions.
2015-10-12 15:57:35.670 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-12 15:57:39.049 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:315]- Network Monitor: Queue length is 2 - deferring network monitor functions.

Log (Setup 2)

2015-10-10 07:29:27.785 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1228]- NODE 255: Sending REQUEST Message = 01 04 00 47 09 B5
2015-10-10 07:29:27.795 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1441]- Receive Message = 01 04 01 47 01 BC
2015-10-10 07:29:27.798 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1365]- Receive queue ADD: Length=1
2015-10-10 07:29:27.799 DEBUG o.o.b.z.i.p.ZWaveController[:1123]- Receive queue TAKE: Length=0
2015-10-10 07:29:27.799 DEBUG o.o.b.z.i.p.SerialMessage[:233]- Assembled message buffer = 01 04 01 47 01 BC
2015-10-10 07:29:27.800 DEBUG o.o.b.z.i.p.ZWaveController[:1124]- Process Message = 01 04 01 47 01 BC
2015-10-10 07:29:27.800 DEBUG o.o.b.z.i.p.ZWaveController[:190]- Message: class = DeleteReturnRoute (0x47), type = Response (0x01), payload = 01
2015-10-10 07:29:27.800 DEBUG o.o.b.z.i.p.s.DeleteReturnRouteMessageClass[:44]- NODE 9: Got DeleteReturnRoute response.
2015-10-10 07:29:27.800 DEBUG o.o.b.z.i.p.s.DeleteReturnRouteMessageClass[:46]- NODE 9: DeleteReturnRoute command in progress.
2015-10-10 07:29:32.788 ERROR o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1269]- NODE 255: Timeout while sending message. Requeueing - 0 attempts left!
2015-10-10 07:29:32.788 DEBUG o.o.b.z.i.p.ZWaveController[:580]- Enqueueing message. Queue length = 2
2015-10-10 07:29:32.793 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1171]- Took message from queue for sending. Queue length = 1
2015-10-10 07:29:32.794 DEBUG o.o.b.z.i.p.SerialMessage[:233]- Assembled message buffer = 01 04 00 47 09 B5
2015-10-10 07:29:32.794 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1228]- NODE 255: Sending REQUEST Message = 01 04 00 47 09 B5
2015-10-10 07:29:32.800 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1441]- Receive Message = 01 04 01 47 00 BD
2015-10-10 07:29:32.802 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1365]- Receive queue ADD: Length=1
2015-10-10 07:29:32.803 DEBUG o.o.b.z.i.p.ZWaveController[:1123]- Receive queue TAKE: Length=0
2015-10-10 07:29:32.804 DEBUG o.o.b.z.i.p.SerialMessage[:233]- Assembled message buffer = 01 04 01 47 00 BD
2015-10-10 07:29:32.804 DEBUG o.o.b.z.i.p.ZWaveController[:1124]- Process Message = 01 04 01 47 00 BD
2015-10-10 07:29:32.804 DEBUG o.o.b.z.i.p.ZWaveController[:190]- Message: class = DeleteReturnRoute (0x47), type = Response (0x01), payload = 00
2015-10-10 07:29:32.804 DEBUG o.o.b.z.i.p.s.DeleteReturnRouteMessageClass[:44]- NODE 9: Got DeleteReturnRoute response.
2015-10-10 07:29:32.805 ERROR o.o.b.z.i.p.s.DeleteReturnRouteMessageClass[:48]- NODE 9: DeleteReturnRoute command failed.
2015-10-10 07:29:32.805 DEBUG o.o.b.z.i.p.ZWaveController[:595]- Notifying event listeners: ZWaveNetworkEvent
2015-10-10 07:29:32.805 DEBUG o.o.b.z.i.ZWaveActiveBinding[:433]- ZwaveIncomingEvent
2015-10-10 07:29:33.290 WARN  o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1468]- Protocol error (OOF). Got 0x05. Sending NAK.
2015-10-10 07:29:33.294 WARN  o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1468]- Protocol error (OOF). Got 0x00. Sending NAK.
2015-10-10 07:29:33.298 WARN  o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1468]- Protocol error (OOF). Got 0x47. Sending NAK.
2015-10-10 07:29:33.302 WARN  o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:1468]- Protocol error (OOF). Got 0x02. Sending NAK.
2015-10-10 07:29:37.797 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1278]- NODE 255: Too many retries. Discarding message: Message: class = DeleteReturnRoute (0x47), type = Request (0x00), payload = 09
2015-10-10 07:29:42.798 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:29:47.799 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:29:52.800 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:29:57.801 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:02.802 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:07.803 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:12.804 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:17.804 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:22.805 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:27.806 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:32.807 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:37.808 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:42.809 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:47.810 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:52.811 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:30:57.812 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:31:02.813 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:31:07.814 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:31:10.494 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:341]- NODE 24: Sending periodic PING.
2015-10-10 07:31:10.494 DEBUG o.o.b.z.i.p.c.ZWaveNoOperationCommandClass[:72]- NODE 24: Creating new message for command No Operation
2015-10-10 07:31:10.494 DEBUG o.o.b.z.i.p.SerialMessage[:109]- NODE 24: Creating empty message of class = SendData (0x13), type = Request (0x00)
2015-10-10 07:31:10.495 DEBUG o.o.b.z.i.p.ZWaveController[:898]- Callback ID = 114
2015-10-10 07:31:10.495 DEBUG o.o.b.z.i.p.ZWaveController[:580]- Enqueueing message. Queue length = 2
2015-10-10 07:31:12.815 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:31:15.496 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:315]- Network Monitor: Queue length is 2 - deferring network monitor functions.
2015-10-10 07:31:17.816 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:31:20.496 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:315]- Network Monitor: Queue length is 2 - deferring network monitor functions.
2015-10-10 07:31:22.816 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:31:25.497 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:315]- Network Monitor: Queue length is 2 - deferring network monitor functions.
2015-10-10 07:31:27.817 WARN  o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:1163]- Receive queue TIMEOUT:
2015-10-10 07:31:30.498 DEBUG o.o.b.z.i.ZWaveNetworkMonitor[:315]- Network Monitor: Queue length is 2 - deferring network monitor functions.

Got the same behavior with zwave binding 1.7.0. Openhab and the other bindings performing normal without getting any errors. Openhab still logs the state changes if lamps are switch on or off via the android app. But the zwave binding log and the hardware shows no reaction. And even after the failure the zwave node seems to be able to reach the controller, at least my fibaro smoke detectors triggers no "can not reach main controller" alert.

Has some a idea whats happening?

Cheers, DivineThreepwood

DivineThreepwood commented 8 years ago

Hey Chris,

any Idea what's going wrong?

Cheers, Divine

cdjackson commented 8 years ago

Not really. This sort of error is a very low level comms error between the binding and the stick. I’ve normally only seen it when there’s some strange timing going on - maybe you’re using a slow computer, or a VM or something? (you don’t mention the hardware or environment). It normally happens if there are bytes being dropped occasionally between the stick and the binding...

Unfortunately I don’t have any good ideas on how to fix it though.

DivineThreepwood commented 8 years ago

Hey that's a good hint!

I'am running OpenHAB in a vm with kvm. May there is an issue with the usb pass-through configuration. I will have a look at this end of the week.

Thanks!

jwveldhuis commented 8 years ago

Had the same error today, in my case it was caused by a mixup of USB devices at restart. /dev/ttyUSB0 used to be the AEON stick, but at restart it became /dev/ttyUSB1.

/dev/ttyUSB0 was connected to a USB to serial device (connected to my Smart Energy meter), so it did some non-Zwave serial communication causing these very strange errors.

It could be the case you encountered the same.

DivineThreepwood commented 8 years ago

So it really seems to be a virtual machine issue. After moving my distribution to the host machine, the error does not occur again.

@jwveldhuis Thanks for your hint but on the one hand the zwave stick is the only usb device which is connected to server and on the other openhab is linked to /dev/zwave which is aliased to the stick to avoid your issue (just checkout https://github.com/openhab/openhab/wiki/symlinks).

Chris do you have some more experiences in running openhab in a virtual maschine with a zwave stick pass-through? Or do you know some related issues?

Cheers! Marian

jwveldhuis commented 8 years ago

@DivineThreepwood yes you are right, an incorrect USB device cant be the issue here as it happens over time. I did not properly read the problem description, sorry about that. Symlinking was the solution for me indeed, thanks for pointing that out.

cdjackson commented 8 years ago

Sorry - I didn't see the question there for me... No - I don't have any experience with running this in a VM - sorry.

I suggest to close this and continue the discussion if necessary on the forum.