whizzosoftware / WZWave

A native Java library for interfacing with Z-Wave PC controllers
Eclipse Public License 1.0
33 stars 22 forks source link

PAN16: Smart energy plug-in switch #18

Closed anton-johansson closed 6 years ago

anton-johansson commented 6 years ago

Product specifications:

This device seems to be working out of the box (some of the features at least), however it's giving some warnings that should be resolved. I'm happy to investigate and resolve it myself, this issue is for tracking the progress of those fixes.

Log output:

Stable Library
=========================================
Native lib Version = RXTX-2.2pre2
Java lib Version   = RXTX-2.1-7
WARNING:  RXTX Version mismatch
    Jar version = RXTX-2.1-7
    native lib Version = RXTX-2.2pre2
#onZWaveControllerInfo
Z-Wave 3.95
-4594
1
#onZWaveNodeAdded
ZWaveNode{nodeId=1, isListeningNode=true, nodeState=Started, isSleeping=false, available=null, commandClasses=NoOperationCommandClass{version=1}}
2017-12-26 21:45:21,028 WARN  com.whizzosoftware.wzwave.frame.transaction.SendDataTransaction:77 - Received unexpected frame for STATE_REQUEST_SENT: ZW_APPLICATION_COMMAND_HANDLER[0x02]: 0x32 0x02 0x21 0x34 0x00 0x00 0x01 0x2E 0x00 0x00 0x00 0x00 0x00 0x00 
2017-12-26 21:45:21,115 ERROR com.whizzosoftware.wzwave.product.ProductRegistry:156 - You are using a product that is not in WZWave's product registry. If the product is working properly with WZWave, please submit the following to the project so it can be added to the registry: Resolved Philio Technology Corporation Unknown from Manufacturer: 316, Product Type: 1, Product Id:41
2017-12-26 21:45:21,149 ERROR com.whizzosoftware.wzwave.channel.ZWaveChannelInboundHandler:75 - Received unknown data frame: SendData(0x06)[],11
2017-12-26 21:45:21,161 ERROR com.whizzosoftware.wzwave.product.ProductRegistry:156 - You are using a product that is not in WZWave's product registry. If the product is working properly with WZWave, please submit the following to the project so it can be added to the registry: Resolved Philio Technology Corporation Unknown from Manufacturer: 316, Product Type: 1, Product Id:41
#onZWaveNodeAdded
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=30.2, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=30.2, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=30.1, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=33.6, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=25.1, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=29.0, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=28.8, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=29.8, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=24.2, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=27.1, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=23.9, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=23.4, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=24.1, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}
#onZWaveNodeUpdated
ZWaveNode{nodeId=2, isListeningNode=true, nodeState=Started, isSleeping=false, available=true, commandClasses=NoOperationCommandClass{version=1}BasicCommandClass{version=1, value=null}AlarmCommandClass{version=1, type=0, level=0}ManufacturerSpecificCommandClass{version=1, productInfo=Philio Technology Corporation Unknown}MeterCommandClass{values={Watts=MeterReadingValue{type=Electric, currentValue=28.5, previousValue=null, delta=null}}}BinarySwitchCommandClass{version=1, isOn=true}VersionCommandClass{version=1, library='3', protocol='4.38', application='1. 3'}}

As you can see, the switch sends updates about the current Watts. It also sends the current on/off-status.

Worth noting: I'm using the Aeon Z-Stick Gen5.

By the way, could we have a label for "new device" or "unsupported device" so we can track these easier?

whizzosoftware commented 6 years ago

Thanks for submitting this. I have created a label called "new device" as requested. Let me know if you have any issues updating the product registry.

anton-johansson commented 6 years ago

@whizzosoftware: Do you have any pointers on where to start looking for fixing the errors and warnings that are showing up? What I would like to do is reproduce the flow in a unit test or integration test, so it is easier to fix, however, this is all very new to me.

Here is a Gist with logs from four different runs, all with different outputs:

  1. The first run has a few Received unexpected frame for STATE_REQUEST_SENT: ZW_APPLICATION_COMMAND_HANDLER[0x02]: 0x32 0x02 0x21 0x34 0x00 0x00 0x02 0x39 0x00 0x00 0x00 0x00 0x00 0x00 , followed by a Unable to find node: null. It also seems to receive an unknown CAN: Received unknown data frame: com.whizzosoftware.wzwave.frame.CAN@52b777a1.
  2. The second run seems to have the same issue as above, Received unexpected frame for STATE_REQUEST_SENT: ZW_APPLICATION_COMMAND_HANDLER[0x02]: 0x32 0x02 0x21 0x34 0x00 0x00 0x01 0xC1 0x00 0x00 0x00 0x00 0x00 0x00. It also has a resource-leak warning that should be checked at some point, but I assume that's unrelated to this.
  3. The third run does not have any errors or warnings.
  4. The fourth seems to be the same as the first, however, here we also see an unknown ACK received in STATE_CALLBACK_RECEIVED.

The series of 14 bytes seems to be the same except the 8th byte.

Edit: Looking at the MeterCommandClassTest-class, these bytes look very similar to those energy usage reports. In fact, it 100% looks like energy meter reports.

whizzosoftware commented 6 years ago

@anton-johansson: I think the first step here is to figure out the problems involving mishandled CAN frames - I believe a problem there will cause other errors. I have created #23 to track that. If you could gather some log examples that demonstrate the CAN problem and attach to issue 23, that would be very helpful.

anton-johansson commented 6 years ago

Cool, I'll dig further and update (when I have time, of course)! Thanks for the quick responses. :)

anton-johansson commented 6 years ago

There is a lot of bogus things happening to the transaction just before the CAN is received. The transaction ID is 881a5d23-098e-4c47-a751-72a043adf832. We recieve a SendData(0x04)[],3, which seem to timeout. It is a clear 3 second delay before we receive a log saying Detected transaction timeout. It looks like it tries to send it again, which also seem to timeout (this time after a little more than 2 seconds), reaching the maximum number of re-sends. AbstractDataFrame#getTimeout() returns 2 seconds as a timeout, so this could be a hint.

If I understand the code correctly, a SendData(0x04) means nodeId 4 (please correct me if I'm wrong). I never had any nodeId 4 in my network, only 1 (the PC controller) and 2 (the switch). That would explain timeouts.

I will dig further when I get home to my Z-Wave network and can do some more tests.

whizzosoftware commented 6 years ago

Sorry, I didn't notice the link to the logs the first time. You're right, the CAN seems to come later. Considering CAN a separate issue, here is how I interpret the first log you sent.

For reference, here are the frame format assumptions that WZWave is making: https://whizzosoftware.atlassian.net/wiki/spaces/WZWAV/pages/170622980/Serial+Frame+Format

Our first transaction of interest is a3e37641-7cfd-46fa-88b8-f6db93bfe4b7 (the sending of BASIC_GET to node 2). Here is the exchange (S is data sent from WZWave and R is data received by WZWave):

S: 0x01 0x09 0x00 0x13 0x02 0x02 0x20 0x02 0x05 0x03 0xC1 (SendData BASIC_GET to node 2)
R: 0x06 (ACK)
R: 0x01 0x04 0x01 (incomplete frame)
R: 0x13 0x01 0xE8 (completed SendData response frame)
S: 0x06 (ACK)
R: 0x01 0x07 0x00 (incomplete frame)
R: 0x13 0x03 0x00 0x00 0x03 0xEB (SendData callback from Node 3??)
S: 0x06 (ACK)
R: 0x01 0x09 0x00 (incomplete frame)
R: 0x04 0x00 0x02 0x03 0x20 0x03 0xFF 0x2F (Received application command frame from node 2)
S: 0x06 (ACK)

This is a successfully completed SendData transaction.

What is interesting here is the SendData callback appears to come from node 3. It almost seems like the 0x03 represents a callback ID rather than a node number. It doesn't cause a problem for the transaction but it is interesting.

Okay, the next transaction is the 881a5d23-098e-4c47-a751-72a043adf832 you mentioned:

S: 0x01 0x09 0x00 0x13 0x02 0x02 0x71 0x04 0x05 0x04 0x91 (SendData SENSOR_ALARM_GET to node 0x02)
R: 0x06 (ACK)
R: 0x01 0x04 0x01 0x13 (incomplete frame)
R: 0x01 0xE8 (completed SendData response frame)
S: 0x06 (ACK)
R: 0x01 0x07 0x00 0x13 0x04 0x00 0x00 0x03 0xEC (SendData callback from node 4??)
S: 0x06 (ACK)

In this case, the transaction timed out because no application command frame was received (see the description in SendDataTransaction.java for the expected sequence). Again we see an example of the SendData response frame having the wrong node ID which (increasingly less) coincidentally again corresponds to the callback ID. Anyway, since WZWave will reattempt a failed command, it tries to send the same command again:

S: 0x01 0x09 0x00 0x13 0x02 0x02 0x71 0x04 0x05 0x04 0x91 (SendData SENSOR_ALARM_GET to node 0x02)
R: 0x06 (ACK)
R: 0x01 0x04 0x01 0x13 0x01 0xE8 (completed SendData response)
S: 0x06 (ACK)
R: 0x01 0x07 0x00 (incomplete frame)
R: 0x13 0x04 0x00 0x00 0x02 0xED (SendData callback from node 4??)
S: 0x06 (ACK)
R: 0x01 0x14 0x00 0x04 0x00 0x02 0x0E 0x32 (incomplete frame)

What's interesting here is that WZWave, after waiting just under two seconds, receives a partial frame but the transaction timeout is hit before the rest is received.

So, since WZWave timed out the transaction, it moves onto the next command in the queue which is SWITCH_BINARY_GET:

S: 0x01 0x09 0x00 0x13 0x02 0x02 0x25 0x02 0x05 0x05 0xC2 (SendData SWITCH_BINARY_GET to node 0x02)
R: 0x02 0x21 0x34 0x00 0x00 0x00 0xDE 0x00 0x00 0x00 0x00 0x00 0x00 0x18
S: 0x06

Here we see that WZWave receives the remainder of the incomplete application command frame from the previous transaction (up to the 0xDE) then some 0x00 bytes and then a CAN (0x18). The CAN is most likely due to having sent the SWITCH_BINARY_GET command frame while the controller was still trying to send the remainder of the application command frame. The improper CAN handling means that the SWITCH_BINARY_GET is not sent again.

So I think there are three issues here:

  1. The transaction timeout should be extended each time new data is received (#24).
  2. Transactions are not properly cleaned up when they timeout (#25).
  3. The CAN is not properly handled (#23).
anton-johansson commented 6 years ago

Awesome, this helps a lot. Both the serial frame format page and your interpretations of the bytes being sent will be very educational.

In this case, the transaction timed out because no application command frame was received (see the description in SendDataTransaction.java for the expected sequence).

Okay, so what is actually causing this? Why is no application command frame being sent? Are the devices just that slow(1)? If so, if we just accept the fact that the transaction times out, how do we make sure that additional bytes aren't being read later, messing up the flow? I assume this is what you want fixed within #25?

(1): I am only getting the errors on startup, so it might be understandable to get delays here. If it's just responses to getting binary switches or similar reports, then I assume that is 100% okay to leave them timed out?

The CAN is most likely due to having sent the SWITCH_BINARY_GET command frame while the controller was still trying to send the remainder of the application command frame.

I assume this issue wouldn't really be an issue if #25 would be solved, am I correct? Sure, you could add some sort of fix for the misplaced CAN, but if we make sure bytes are received in the correct order, then it would be a good thing to still get a warning about a misplaced CAN.

whizzosoftware commented 6 years ago

I will try to answer your questions below to the best of my ability. Just keep in mind that it represents my current understanding of the serial protocol and some of my assumptions may very well be wrong :-)

Okay, so what is actually causing this? Why is no application command frame being sent? Are the devices just that slow(1)?

I could only guess here - could be RF interference, slow devices, serial port latency, etc. Furthermore there is definitely something different about the Z-Stick Gen5 because I rarely saw this sort of thing with the Series 2. I also rarely saw receiving of incomplete data frames with the Series 2 and it seems pretty common with the Gen5 as your logs are showing.

If so, if we just accept the fact that the transaction times out, how do we make sure that additional bytes aren't being read later, messing up the flow? I assume this is what you want fixed within #25?

Unfortunately, there is no way to avoid the additional bytes being read. I think the correct behavior here would be to properly close out the transaction at timeout, recognize that the additional bytes do not represent a complete data frame and discard everything (in this example) up to the CAN. I wonder if those extra 0x00 bytes and the CAN represent the controller trying to flush its buffers and start fresh after it recognized there was a problem.

If it's just responses to getting binary switches or similar reports, then I assume that is 100% okay to leave them timed out?

The responses are still needed to determine the current state of the devices. So it's a problem of striking the right balance between making a best effort to get the state vs. wasting time retrying for an answer that will never come (and properly recognizing situations where the answer will never come - such as sleeping nodes).

I assume this issue wouldn't really be an issue if #25 would be solved, am I correct?

It is less likely that we would see a CAN if the other problems were resolved. However, I think it's still possible and not handling the CAN properly is still a problem. In this case, it should have resulted in the SendData frame being sent again because the controller explicitly told us that it wasn't processed. In general, there is always the possibility that that the serial exchanges can get out of sync due to latency. WZWave should try its best to not drop commands when that happens.

anton-johansson commented 6 years ago

I could only guess here - could be RF interference, slow devices, serial port latency, etc. Furthermore there is definitely something different about the Z-Stick Gen5 because I rarely saw this sort of thing with the Series 2. I also rarely saw receiving of incomplete data frames with the Series 2 and it seems pretty common with the Gen5 as your logs are showing.

Allright. I guess we'll (me, in this case) have to live with this and try to make WZWave handle it.

Unfortunately, there is no way to avoid the additional bytes being read. I think the correct behavior here would be to properly close out the transaction at timeout, recognize that the additional bytes do not represent a complete data frame and discard everything (in this example) up to the CAN. I wonder if those extra 0x00 bytes and the CAN represent the controller trying to flush its buffers and start fresh after it recognized there was a problem.

Should this be treated as a separate issue, or is this what you mean in #25? I assume we will have to look at the previously sent bytes to see if the two parts belong together?

For the last two questions, I understand. Thanks for the quick replies!

Just to verify, you have also seen similar issues with your own Gen5? Just want to make sure there is no issue with my device. I could try resetting it. I've also looked for firmware updates, but it seems there are none available.

whizzosoftware commented 6 years ago

Should this be treated as a separate issue, or is this what you mean in #25? I assume we will have to look at the previously sent bytes to see if the two parts belong together?

Yes, I will put more detail into the issues I created. I created them quickly so I wouldn't forget about them. The frame handler already keeps around previous bytes from incomplete frames and stitches them together. The issue referred to in #25 is that the transaction fails and is aborted, the application frame comes in and WZWave still completes the transaction (even though it has already moved on by sending the next command).

Just to verify, you have also seen similar issues with your own Gen5? Just want to make sure there is no issue with my device. I could try resetting it. I've also looked for firmware updates, but it seems there are none available.

I definitely saw incomplete frame byte reads and CAN errors of some sort but I didn't have time to investigate before I moved. Unfortunately, I can't confirm any details right now - my Gen5 is a U.S. model and I'm currently living in Europe so I can't test with any actual devices presently.

anton-johansson commented 6 years ago

Yes, I will put more detail into the issues I created. I created them quickly so I wouldn't forget about them. The frame handler already keeps around previous bytes from incomplete frames and stitches them together. The issue referred to in #25 is that the transaction fails and is aborted, the application frame comes in and WZWave still completes the transaction (even though it has already moved on by sending the next command).

Allright, cool! I'll keep trying to mess around and see if I can come up with anything useful, but I'll take me a while to get familiar with all this.

I definitely saw incomplete frame byte reads and CAN errors of some sort but I didn't have time to investigate before I moved. Unfortunately, I can't confirm any details right now - my Gen5 is a U.S. model and I'm currently living in Europe so I can't test with any actual devices presently.

That's fair enough, just wanted to make sure it's not my device only. :)