Closed robinsmidsrod closed 3 years ago
This error is something on hass side, please open this kind of issues on hass repo
@robertsLando You couldn't possibly have managed to read the entire issue text this fast. I've explained further down in the text that this doesn't require any changes to Home Assistant to troubleshoot. Did the websocket API for thermostats change in zwavejs2mqtt 1.2.x? Also, the issue with the sensors and compat 0x31 CC shouldn't be related to Home Assistant.
@robinsmidsrod Sorry, I got like 4/5 issues with the exact same title (and more are coming for sure).
The latest 1.2.0/1.2.1 versions use zwavejs 6.4.0. In zwavejs 6.3.0 some device class names changed, and this broke climate discovery (which is handled on the hass side by the zwavejs integration). From reading the title I was sure it was related to that.
@robertsLando The reason I'm using HA here to show the issue, is that I don't have the websocket debug client installed because I'm running from Docker, and I just spent 4-5 hours debugging this issue to try and figure out exactly which component is at fault here. And from what I can tell, the sensor failure is node-zwave-js territory, the missing climate entity is most likely related to HA/zwavejs2mqtt websocket communication.
@robertsLando Thanks for reopening. :) Is there a relevant issue in the Home Assistant repo tracking this issue with the breakage of climate discovery?
@robinsmidsrod In order to debug this, I need both the z2m and the zwavejs logs. I will tag @AlCalzone for debugging on the zwavejs side, since from reading the issue it seems there could be a problem with the device config or something else.
There are plenty of issues here: https://github.com/home-assistant/core/issues?q=is%3Aissue+is%3Aopen+label%3A%22integration%3A+zwave_js%22
All regarding issues with climates
BTW that message in the log makes no sense: if you are using the websocket client with the hass zwavejs integration, you can disable the MQTT gateway (and, as a consequence, the MQTT discovery).
@robertsLando I think the sensor issue might be related to the "unofficial" version 4.2 firmware floating around, which was required on OZW 1.6 to get the thermostat working with Home Assistant. I'm using the official version 4.0 firmware on mine.
What do you need me to do while getting those logs? Is it enough to just restart zwavejs2mqtt 1.2.1 and let it fully load the mesh? Or do I need to run re-interview or refresh on the node in question?
I have had MQTT gateway disabled since the start, but the message mentioned still shows up.
v1.1.1: "zwave-js": "^6.2.0"
v1.2.1: "zwave-js": "^6.4.0"
In v6.3.0, the Device Class labels were changed to match the Z-Wave+ specifications. For some reason, HA uses the labels instead of fixed numeric keys to detect which class a device has and to offer the capabilities, which was broken by the label change. I'm relatively sure you're also experiencing the symptoms of this change.
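The difference between matching on labels and matching on numeric keys can be shown with a minimal sketch. The record layout, the function names, and the two label strings below are illustrative assumptions, not copied from zwave-js or Home Assistant; only the idea (the label changed, the key did not) comes from the comment above.

```python
# Hypothetical device-class records before and after a label rename.
# The numeric keys are identical; only the human-readable label differs.
BEFORE_RENAME = {
    "generic": {"key": 0x08, "label": "Thermostat"},
    "specific": {"key": 0x06, "label": "General Thermostat V2"},
}
AFTER_RENAME = {
    "generic": {"key": 0x08, "label": "Thermostat"},
    "specific": {"key": 0x06, "label": "Thermostat General V2"},
}

# A consumer that hard-codes the label it saw at development time.
KNOWN_LABELS = {"General Thermostat V2"}
# A consumer that hard-codes the (generic, specific) numeric keys instead.
KNOWN_KEYS = {(0x08, 0x06)}


def is_thermostat_by_label(device_class):
    """Fragile: breaks as soon as the label text is reworded."""
    return device_class["specific"]["label"] in KNOWN_LABELS


def is_thermostat_by_key(device_class):
    """Stable: the numeric keys do not depend on the wording."""
    keys = (device_class["generic"]["key"], device_class["specific"]["key"])
    return keys in KNOWN_KEYS


for name, record in (("before", BEFORE_RENAME), ("after", AFTER_RENAME)):
    print(name, is_thermostat_by_label(record), is_thermostat_by_key(record))
```

With these assumed records, the label check stops matching after the rename while the key check keeps matching, which is the kind of failure that would make a climate entity disappear without any change to the device itself.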
@AlCalzone I've updated the post with the correct zwavejs version number, as you mentioned.
I have had MQTT gateway disabled since the start, but the message mentioned still shows up.
I think you have disabled it but the discovery is still enabled. Could you please try enabling the MQTT gateway in settings, disabling the MQTT discovery switch, then disabling the MQTT gateway again and pressing save?
@robertsLando I'm in the process of generating the logs now (debug level for both). As I asked earlier, do you want me to do anything in particular while generating them, or just let the current mesh fully load?
Just let it load - we should see most of the relevant info there.
@AlCalzone @robertsLando Here are the logs you requested. The thermostat I've tried to do a lot of stuff with is node 10, while node 13 I've tried to keep unmodified since 1.1.1. Since node 13 showed up as dead I power-cycled it (which also impacted node 12) to get it back online. Hopefully this shouldn't distort the logs too much for you to see what's going on.
There's a lot of nonce spamming going on in your log with several of your devices, for example:
// Outgoing request
14:10:48.427 CNTRLR » [Node 010] querying current value of setpoint Heating...
14:11:46.278 DRIVER » [Node 010] [REQ] [SendData]
│ transmit options: 0x25
│ callback id: 123
└─[SecurityCCNonceGet]
14:11:46.870 DRIVER « [Node 010] [REQ] [ApplicationCommand]
└─[SecurityCCNonceReport]
nonce: 0x6d19196cb490b29c
14:11:46.873 DRIVER » [Node 010] [REQ] [SendData]
│ transmit options: 0x25
│ callback id: 124
└─[SecurityCCCommandEncapsulation]
│ nonce id: 109
└─[ThermostatSetpointCCGet]
setpoint type: Heating
// Node requests nonce for the response
14:11:50.726 DRIVER « [Node 010] [REQ] [ApplicationCommand]
└─[SecurityCCNonceGet]
14:11:50.730 DRIVER » [Node 010] [REQ] [SendData]
│ transmit options: 0x05
│ callback id: 125
└─[SecurityCCNonceReport]
nonce: 0x680cea33493e2230
14:11:50.739 DRIVER « [RES] [SendData]
was sent: true
14:11:50.763 DRIVER « [REQ] [SendData]
callback id: 125
transmit status: OK
// And again 200ms later after it has acknowledged the receipt
14:11:50.944 DRIVER « [Node 010] [REQ] [ApplicationCommand]
└─[SecurityCCNonceGet]
14:11:50.948 SERIAL » 0x011100130a0a9880a37277499ded67e2057e84 (19 bytes)
14:11:50.948 DRIVER » [Node 010] [REQ] [SendData]
│ transmit options: 0x05
│ callback id: 126
└─[SecurityCCNonceReport]
nonce: 0xa37277499ded67e2
14:11:50.956 DRIVER « [RES] [SendData]
was sent: true
14:11:51.203 DRIVER « [REQ] [SendData]
callback id: 126
transmit status: OK
// Which it also received ^
The problem here is that the specifications mandate that only one nonce may be valid at any given time and it may only be used once. Therefore as soon as the second nonce is requested, the first is invalidated. However, the device responds with an encrypted message that uses the first (now invalid) nonce. And this happens multiple times:
14:11:52.787 DRIVER Dropping message with invalid data (Reason: Nonce 0x68 expired, cannot decode
security encapsulated command.):
...
14:11:53.291 DRIVER Dropping message with invalid data (Reason: Nonce 0x68 expired, cannot decode
security encapsulated command.):
This causes the values not to be received and stored, and thus they don't show up in Home Assistant. However, this behavior has not changed since 6.2.0, so it is likely a coincidence that you are noticing it now.
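The "one valid nonce at a time, used once" rule can be sketched as a tiny store. This is a simplified illustration under stated assumptions, not the zwave-js implementation: the class and method names are hypothetical, and the nonce ID is taken to be the first byte of the nonce, which is what the log suggests (nonce 0x680cea33493e2230 is later reported as "Nonce 0x68").

```python
class NonceStore:
    """Keeps at most one outstanding nonce per node (assumed policy)."""

    def __init__(self):
        self._active = {}  # node_id -> nonce bytes

    def issue(self, node_id, nonce):
        # Handing out a new nonce replaces, and thereby invalidates,
        # whatever nonce was previously outstanding for this node.
        self._active[node_id] = nonce
        return nonce[0]  # nonce ID: first byte of the nonce (assumption)

    def consume(self, node_id, nonce_id):
        # Returns the nonce if it is the active one, otherwise None.
        nonce = self._active.get(node_id)
        if nonce is None or nonce[0] != nonce_id:
            return None  # expired or unknown: the message has to be dropped
        del self._active[node_id]  # a nonce may only be used once
        return nonce


store = NonceStore()
first_id = store.issue(10, bytes.fromhex("680cea33493e2230"))
second_id = store.issue(10, bytes.fromhex("a37277499ded67e2"))

# The node encrypts with the first nonce, which was replaced by the second.
print(hex(first_id), store.consume(10, first_id))
# The second nonce is still valid, but only for a single message.
print(hex(second_id), store.consume(10, second_id) is not None)
print(hex(second_id), store.consume(10, second_id))
```

Replaying the two nonce reports from the log in this order makes the lookup for 0x68 fail, which mirrors the "Nonce 0x68 expired" drops shown above.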
@AlCalzone Yeah, I noticed that too. I'm assuming the only way to solve this is to exclude and include again the offending nodes, correct? Does this mean that the zwave mesh is overloaded, or that the device is not properly supporting SecurityCC? I was hoping to avoid another exclude/include, as it changes the device identifier in Home Assistant, requiring me to redo any device automations I have.
@AlCalzone Just FYI, almost all of these nodes were included (securely) way back when I was running OZW 1.4. I have just transferred them over with the same network key through OZW 1.6 into zwavejs. Nodes 15 and 16 were included while running zwavejs.
Basically, the devices are not behaving correctly. They should be waiting until they've sent the encrypted message before they request a new one. OZW does not care about invalidating nonces, which is why most of the affected devices work correctly there. This is a no-go, as it allows replay attacks etc.
I would start by waiting until things have settled down, then re-interview the affected devices one by one, hoping they don't bug out.
On a side note: People with such misbehaving devices should really start bugging the manufacturers to fix their stuff. Thermofloor/Heatit in particular... So many bugs with only a handful of devices...
@AlCalzone I'll try to bug Thermofloor/Heatit with what you describe above, and see what they say. I guess there is one alternative solution, and that is to turn off security completely. For a thermostat, that is generally something I'm quite hesitant to do, as hacking of a thermostat device can potentially trigger a fire hazard.
@AlCalzone I've sent a message to Heatit about the nonce security issue, and mentioned that it exists on both the Z-TRM3 (thermostat) and ZDim (dimmer), requesting a new firmware for both of the unit types. Let's see how they respond. I've included a reference to this conversation in the email. Crossing my fingers.
@AlCalzone I've just received a response from Heatit, and they claim that they are not in violation of the specification and that they pass the most stringent S2 security requirements.
Do you have a pointer to the exact location in the Z-Wave specification where ZWaveJS's use of nonce is described? I've responded to them that there is, obviously, misunderstanding in how to interpret that section of the specification, since devices and controllers implement it differently. It would greatly help my conversation with Heatit.
I've also encouraged them to contact you directly. If you want to reach out, you can reach them at post@heatit.com, and my case number, if you want to reference it, is 74485. Right now, they're in need of more documentation regarding your claim that they violate the specification, so that they can take it up with ZWave Alliance/SiLabs.
they claim that they are not in violation of the specification and that they pass the most stringent S2 security requirements.
I've never talked about S2. This is about the S0 implementation, which requires requesting a nonce for every message that should be transmitted. SDS10865-11, page 6, chapter 5.2.1 "Initialization Vector - Nonce":
The Initialization Vector, or nonce (i.e., a number used once) is a publicly known value. In the security layer, nonces have two properties:
- They are fresh, i.e., the attacker can not predict them before they were generated, and they remain valid only a short time. Thus, they can be used to check whether a message associated with the nonce is also fresh.
- They are used only once, i.e., the attacker does not gain anything from storing a tuple (message, nonce) in the hope that the same nonce is used again.
Judging from the retransmission attempt with the same nonce in your log above, the second property is not followed (notice the two messages using the same nonce):
14:11:52.787 DRIVER Dropping message with invalid data (Reason: Nonce 0x68 expired, cannot decode
security encapsulated command.):
...
14:11:53.291 DRIVER Dropping message with invalid data (Reason: Nonce 0x68 expired, cannot decode
security encapsulated command.):
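A quick way to spot this pattern in a longer driver log is to count how often each expired nonce ID is reported. The sketch below assumes the log lines keep the wording shown above; the function name and the regular expression are my own and are not part of zwave-js.

```python
import re
from collections import Counter

# Matches the "Dropping message" lines quoted above and captures the nonce ID.
DROP_RE = re.compile(
    r"Dropping message with invalid data \(Reason: Nonce (0x[0-9a-fA-F]{2}) expired"
)


def count_expired_nonces(log_text):
    """Return how many dropped messages referenced each expired nonce ID."""
    return Counter(match.group(1).lower() for match in DROP_RE.finditer(log_text))


SAMPLE_LOG = """\
14:11:52.787 DRIVER Dropping message with invalid data (Reason: Nonce 0x68 expired, cannot decode
security encapsulated command.):
14:11:53.291 DRIVER Dropping message with invalid data (Reason: Nonce 0x68 expired, cannot decode
security encapsulated command.):
"""

for nonce_id, count in count_expired_nonces(SAMPLE_LOG).items():
    flag = "reused" if count > 1 else "single"
    print(nonce_id, count, flag)
```

Any nonce ID that shows up more than once was used for more than one encrypted message, which is what the second property rules out.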
there is, obviously, misunderstanding in how to interpret that section of the specification
I remember from my discussions with my contact at Silabs that only a single nonce may be active at a given time. Thus, requesting a new nonce currently invalidates all previous ones for this node. This is also how I understood the specs when I implemented S0, but I cannot find the requirement right now.
I'm going to double-check that this is really the correct behavior.
I've talked with my contact again and have verification that zwave-js is doing the right thing here. If this double nonce request is meant to encrypt multiple messages, the SECURITY_MESSAGE_ENCAPSULATION_NONCE_GET command should be used instead.
@robinsmidsrod I've sent them a mail, let's see what happens.
I'm experiencing exactly the same issue. I have 3 Heatit Z-TRM3 thermostats, and after upgrading zwavejs2mqtt from v1.1.0 to v1.2.3 yesterday (Home Assistant was at the same version, 2021.2.3, before and after the upgrade) the thermostat entity for all 3 devices is in Unavailable status in Home Assistant and I can't control the thermostats anymore. All three thermostats worked as expected in zwavejs2mqtt v1.1.0. All 3 devices are added as non-secured nodes.
Things I have tried to fix the problem, without luck:
After removing and adding the device the only entity I can find in HA is "Air temperature"
@ismarslomic node-zwave-js 6.3+ is incompatible with the current HA release. Downgrade zwavejs2mqtt or wait for the next beta (probably tomorrow).
Tnx for quick reply! I will wait for tomorrow then, not a big issue! Just wanted to help out with confirming the issue. Please let me know if you need support on digging more into details and logs from my side!
Side question: the ZWave To Mqtt UI displays the product code for my HeatIt Z-TRM3 devices as TF016. Is this correct?
I think that is a different device - at least the configuration parameters are completely different. See https://devices.zwave-js.io/ If you're sure, please open a separate issue, so we can track that.
I need to demount my devices from the wall to verify. It shows that they are labeled with "TF 021". I sent an email to HeatIt and they informed that "TF 021" and "TF 016" are the same product, just differently labeled. So it means that this issue is also relevant for TF016 and not only Z-TRM3.
And TF 021 is the same as Z-TRM3?
I don't think they are identical. Both are thermostats, but different versions/models. HeatIt sent me the Installers Manual for my product and it is not labeled Z-TRM3 but Firmware 1.8. I'm quite confused myself, because it is not possible to find any products on their web site labeled with TF 021 or TF 016. However, I think that mapping to TF 016 in zwavejs should work fine.
Oh wow, that is confusing. Just let us know if your device isn't detected correctly and/or has wrong config parameters, so we can update the files to match.
@robinsmidsrod I've had contact with the HeatIt support. They can't reproduce with the PC controller. Unfortunately I don't have the device. Maybe you could try to reproduce with that and generate a log? Requesting the heating setpoint would be a start.
@AlCalzone Not sure I know which software you're referring to when you say PC controller. Can you provide a link?
@AlCalzone If you're referring to this one, https://aeotec.freshdesk.com/support/solutions/articles/6000226205-z-wave-command-class-configuration-tool-download- , should I download version 5 or is it enough with the older v4 at the bottom of the page?
@robinsmidsrod It is part of Simplicity Studio, which you'll find under https://www.silabs.com/developers/z-wave -> downloads. I've also tried to contact you on the HA discord, are you still active there?
@AlCalzone Not so active on Discord, but I found your communication there. Will follow up there and report back here when something of significance shows up.
I can confirm that this issue has been resolved in zwavejs2mqtt:1.4.0 and ha 2021.3.0b1
@robertsLando Regarding the first temperature sensor always being used for the climate entity in HA, instead of the one specified in [10-112-0-2] Sensor mode: is that a Home Assistant issue, or is there something zwavejs or zwavejs2mqtt can do about it? (See the expected behavior paragraph in the bug description for details)
Hass issue if you are using zwavejs integration. BTW you could use MQTT to discover that climate and manually fix it
@robertsLando I'm using the zwavejs HA integration, so I'll create an issue on the HA side referencing this one. Thanks for the clarification.
This issue has not seen any recent activity and was marked as "cannot fix ❌". Closing for housekeeping purposes... 🧹
Feel free to reopen if the issue persists.
Reopening so we have something to track the investigation.
@AlCalzone I saw this text in the 6.6.1 release notes: "Fixed the length validation in sequenced Security S0 Message Encapsulation commands" Should it have any impact on this issue?
No - that was purely about the payload length, not invalid nonces.
Hi,
Any news on this? I've tried to resolve this matter with ThermoFloor for the 2020 version of the Heat-IT ZDim devices. They're flooding the network with nonces (I'll call them nonesenses :stuck_out_tongue_winking_eye:) and Meter CC reports. The Meter CC reports are sent often, even when the values haven't changed, and particularly when the devices are turned off. I also have one 2019 version that is silent as the grave whenever it's off or its values haven't changed.
Today they wrote to me that, in cooperation with the lead developer of Z-Wave JS, @AlCalzone, they had found that the problem with the «nonesenses» cannot be reproduced and that both the devices and the driver work correctly. :thinking:
They were in fact not able to reproduce this issue with the PC controller - I've seen the logs. But one of their support staff agreed to send me a Z-TRM3 so I can investigate with zwave-js. Will let you know when I know more. Good to know that the ZDim could also be affected.
That's great that they will. I guess they use the same logic across their devices. Could it be an idea to actually ask Silabs how they deal with it in PC controller? If they actually invalidate nonces as the specs say they should.
I don't think they are identical. Both are thermostats, but different versions/models. HeatIt sent me Installers Manual for my product and it is not labeled Z-TRM3 but Firmware 1.8. I'm quite confused myself, because it is not possible to find any products on their web site labeled with TF 021 nor TF 016. However, I think that mapping to TF 016 in zwavejs should work fine.
Hi, I have one unmounted Z-TRM3. It says TF 058 on the back of it. Ver. 2020-A
EDIT: It's also written in the quickguide that was in the package. 5430599 TF058
Version
Build/Run method
zwavejs2mqtt version: 1.2.1
zwavejs version: 6.4.0
Home Assistant Core (in Docker) 2021.2.3
Thermofloor Z-TRM3 thermostat firmware version 4.0.33 (config/devices/0x019b/z-trm3.json)
Describe the bug
My Z-TRM3 thermostat (climate entity) is no longer visible in Home Assistant. It was visible and working with version 1.1.1. The device that should have the climate entity is available, but all of the temperature sensors are shown as unavailable.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The climate entity should be detected and it should have the correct current setpoint and current temperature of the specified temperature sensor (based on the [10-112-0-2] Sensor mode).
This last thing about the correct temperature sensor being used has never worked, even in previous versions. It would always use the sensor [10-49-2-Air temperature] (which is the first in the list).
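The bracketed identifiers used here can be read as node, command class, endpoint, and property, in that order. The sketch below parses them under that assumption; the `ValueId` type and `parse_value_id` function are hypothetical helpers for illustration, not an API of zwavejs2mqtt.

```python
from typing import NamedTuple


class ValueId(NamedTuple):
    node_id: int
    command_class: int
    endpoint: int
    prop: str


def parse_value_id(text):
    """Parse identifiers such as "[10-49-2-Air temperature]"."""
    # Split on the first three dashes only, so the property may contain dashes.
    node, command_class, endpoint, prop = text.strip("[]").split("-", 3)
    return ValueId(int(node), int(command_class), int(endpoint), prop)


sensor = parse_value_id("[10-49-2-Air temperature]")
mode = parse_value_id("[10-112-0-2]")

# Command class 49 is 0x31, the CC that the compat entries mentioned in
# this issue refer to; 112 is 0x70.
print(sensor, hex(sensor.command_class))
print(mode, hex(mode.command_class))
```

Read this way, the sensor that is currently used sits on endpoint 2 of node 10, while the Sensor mode setting is parameter 2 on endpoint 0 of the same node.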
Additional context
When I start up zwavejs2mqtt 1.2.1 I get this message twice (I have two Z-TRM3 thermostats):
Unable to discover climate device, there is no valid temperature valueId
This seems to be related to https://github.com/zwave-js/node-zwave-js/issues/1184 which added compat values for CC 0x31 to the definition of z-trm3.json.
If I removed all of those compat additions to z-trm3.json from that issue, then all of the temperature sensors show up in Home Assistant, and the error message mentioned above is no longer shown, but the climate entity is still not detected.
I even went as far as fully re-interviewing the node, removing the device and all the entities (manually) in Home Assistant to ensure it was fully detected again. I also stopped HA and zwavejs2mqtt, let zwavejs2mqtt start up fully before I restarted HA, but still no dice.
My only solution to get the climate entity back to working was to downgrade zwavejs2mqtt to version 1.1.1.
I didn't see anything in the release notes between 1.1.1 and 1.2.1 that indicates that work was done on anything thermostat-related, so this might be related to something else causing action at a distance.
One last note of relevance: if I keep the compat changes (0x31 CC) made to z-trm3.json mentioned above, but only downgrade to 1.1.1, the climate entity shows up in HA, but the current temperature value is unavailable. This seems to indicate that there might be two (or potentially three, if we count the wrong temperature sensor being used) inter-related issues at play here.