Closed: justindthomas closed this 1 year ago
Do you have a zniffer available? Or a 500-stick lying around you're willing to permanently convert to one?
I don't have a zniffer, but I do have a Zooz ZST10 that I've never opened. I'd be fine with converting that.
It would greatly help in diagnosing this. You need windows available and the process is a bit involved.
Sure, just let me know what you need me to do. I avoid Windows when I can, but I can use one of our gaming PCs to get it done.
@kpine Did you ever make a zniffer conversion guide?
The instructions are generally here: https://www.silabs.com/documents/public/user-guides/INS10249-Z-Wave-Zniffer-User-Guide.pdf. I did find the fw with the help of SiLabs. It is now located on GitHub in a stupid place: https://github.com/SiliconLabs/gecko_sdk/releases Under demo_applications.zip, under the z-wave protocol folder.
No, I don't have a guide, but the Silicon Labs guide is here. It assumes the target controller uses the same chip as the UZB3, not sure if that's true with the Zooz or not. I can say my Aeotec Z-Stick was a terrible Zniffer, night and day difference between it and the UZB3.
I don't see any 500-series zniffer firmware in the demo_applications.zip file. Do you know which file it's supposed to be? All I see are 700-series fw which only work with their dev board.
It’s under the zwave protocol folder, or at least it was two weeks ago. I'm actually having trouble finding it now but I know it was there, I went back to the email they sent me.
Strangely, `lsusb` reports the Zooz stick under the Aeotec description:
idVendor 0x0658 Sigma Designs, Inc.
idProduct 0x0200 Aeotec Z-Stick Gen5 (ZW090) - UZB
It does for my reference UZB3 from SiLabs too
I'm actually having trouble finding it now but I know it was there, I went back to the email they sent me.
What was the filename they told you?
sniffer_ZW050x_USBVCP.hex
I successfully converted a UZB3 at the time so clearly I found it somewhere there.
Follow this guide: https://3.5inchfloppy.com/using-the-acc-uzb-u-sta-z-wave-controller-with-zniffer/
The link to download Zniffer is valid.
They just updated the guide this week, so perhaps they moved the firmware back so it is installed with the zniffer software as it used to be:
Exciting!
So that we know, where did you find the firmware?
I used the guide that @kpine posted and downloaded the files as directed there. It's included as specified there in the relative link in the downloaded file from https://www.silabs.com/development-tools/wireless/z-wave/embedded-development-kit.
Now it is time to flash the Zniffer firmware. From the Flash Code Memory tab, select the hex file located at: \Zniffer_v4_57\Z-Wave_Firmware\sniffer_ZW050x_USBVCP.hex
The only hiccup I had was that SiLabs really wanted me to download the SDK first before letting me download the programmer or zniffer, so I had to capitulate to that.
Ok. They must have fixed it after I asked about it. In recent releases the firmware wasn’t distributed at all and that folder didn’t exist.
That ZEN32 node (85) was marked dead again during a heal process about an hour ago.
Definitely a lot going on right there. It looks like that might have been around the time that I was fiddling with the sensor in my son's room (a ZSE18) trying to get it to wake up to heal. That device's battery is now showing near 0, which is odd (it wasn't anywhere near there earlier), so that could play into it.
I can almost buy the argument that the network is just busy and that congestion is causing problems. But the fact of the matter is that I only ever saw nodes just up and die when I was using the 700 device before I switched back to the 500. Then I saw none for the duration of using the migrated 500 device (a few weeks/couple of months) through countless heal operations, inclusions, exclusions, re-inclusions to move from unauth to S2, etc. I was quite busy - never a dead node.
Then switching back to the upgraded 700 device, I see dead nodes again. It's definitely better than it was - I've only seen it happen with this one ZEN32 and I was seeing a lot more problems before. But it seems like the problem is not fully solved.
If you're able to reproduce it somehow, having a driver log and a Zniffer log of the node going "dead" would be interesting.
Sounds good. I'll work on figuring out a way to do that. Maybe just letting the zniffer run for 8 hours at a time or something like that.
That’s really the only way. I don’t have mine running, but I believe it can export a capture file rather than just a screenshot.
I assume I can export a capture file like tcpdump or Wireshark. I'll work on becoming more familiar with it.
I wonder if anyone has worked on building a CLI tool like tcpdump to use the hardware or if that would be a licensing problem. It would sure be handy.
Some people were working on one before the spec was released publicly. It seems that once the zniffer software was available no one bothered. Zwave.me has a custom zniffer setup in their new devices at least.
We’ve flirted with if there was a way to put the stick in promiscuous mode (like a zniffer but your network only, doesn’t require new fw)
It's been a little bit since I've seen a node die unexpectedly. I'm going to go ahead and close this and I'll reopen with trace data if it recurs.
Reopening to look at this log:
@AlCalzone this is the one I mentioned on the META ticket. There is indeed a little bit more context in that conversation that isn't prefaced by [Node 055] and was missed in that previous grep.
2022-02-03T19:47:43.427Z CNTRLR [Node 087] The node is alive.
2022-02-03T19:47:43.518Z CNTRLR [Node 087] The node is ready to be used
2022-02-03T19:47:43.518Z CNTRLR All nodes are ready to be used
2022-02-03T19:47:43.534Z CNTRLR « [Node 087] ping successful
2022-02-03T19:47:43.538Z SERIAL » 0x01050051373aa6 (7 bytes)
2022-02-03T19:47:43.539Z DRIVER » [Node 055] [REQ] [AssignSUCReturnRoute]
2022-02-03T19:47:43.546Z SERIAL « [ACK] (0x06)
2022-02-03T19:47:43.566Z SERIAL « 0x0104015101aa (6 bytes)
2022-02-03T19:47:43.568Z SERIAL » [ACK] (0x06)
2022-02-03T19:47:43.569Z DRIVER « [RES] [AssignSUCReturnRoute]
2022-02-03T19:47:43.575Z SERIAL « 0x012100513a0400000000000000000000000000000000000000000000000000000 (35 bytes)
2022-02-03T19:47:43.576Z SERIAL » [ACK] (0x06)
2022-02-03T19:47:43.577Z DRIVER « [REQ] [AssignSUCReturnRoute]
2022-02-03T19:47:43.582Z CNTRLR [Node 055] The node did not respond, it is presumed dead
2022-02-03T19:47:43.584Z CNTRLR [Node 055] The node is now dead.
2022-02-03T19:47:46.337Z DRIVER Usage statistics sent - next transmission scheduled in 23 hours.
This was on startup, so the system was busy pinging all of the nodes. Node 55 was identified as alive in that process, and then dead immediately after.
Unfortunately didn't have my zniffer on, but I'm setting that up now.
EDIT: I ran a network heal with the zniffer running to see if anything died. Nothing did, but that node (55) refused to heal. All the others worked fine. I tried healing it individually and it still failed.
So I re-interviewed it and that seems to have helped. It healed on this last attempt - maybe the server just had bad info stored. I'll keep the zniffer running and report back.
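As an aside, the SERIAL lines in the excerpt above are Z-Wave Serial API frames: an SOF byte (0x01), a length byte, the payload, and a trailing checksum that is the XOR of every byte between the SOF and the checksum, seeded with 0xFF. A quick sketch for sanity-checking frames copied out of a driver log (the two frames below are taken from the log above):

```python
def serial_checksum(frame: bytes) -> int:
    """XOR of the length and payload bytes (everything between the
    SOF byte and the trailing checksum byte), seeded with 0xFF."""
    chk = 0xFF
    for b in frame[1:-1]:  # skip SOF (0x01) and the checksum itself
        chk ^= b
    return chk

def verify(frame_hex: str) -> bool:
    """True if the last byte of the frame matches the computed checksum."""
    frame = bytes.fromhex(frame_hex)
    return serial_checksum(frame) == frame[-1]

# Frames copied from the driver log above:
print(verify("01050051373aa6"))  # AssignSUCReturnRoute request -> True
print(verify("0104015101aa"))    # the matching RES frame       -> True
```

Both frames from the log check out, so the serial link itself looks clean; the failure is happening on the RF side.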
@justindthomas This still isn't a complete log. Many logs are multiline, including the DRIVER and SERIAL ones, so this is missing information. Please no grep, just copy the relevant timeframe or post the entire thing.
Ah, yes. I overlooked that. Here you go.
That LZW31 (node 55) was in the wrong place in the zwavejs2mqtt list this morning and I noticed that it had a "-" in the "Zwave+" column. So I interviewed it again and it's registering correctly. But now it's refusing to heal.
Here's the current log with that heal attempt.
Node 85 (a ZEN32) just died. Here's the updated log.
I did leave the zniffer running and thought I'd caught it, but apparently the system pauses the capture when the screen locks (I don't think it goes to sleep, because I have a Hyper-V VM running on that host that doesn't stop responding). Anyway, I need to fix that.
So... first things first:
The "dead" in the first log is actually the SUC return route that fails to be assigned. I guess that shouldn't count as "the node is dead" really. Raised https://github.com/zwave-js/node-zwave-js/issues/4190
2022-02-03T19:47:43.577Z DRIVER « [REQ] [AssignSUCReturnRoute]
callback id: 58
transmit status: NoRoute
I interviewed it again and it's registering correctly. But now it's refusing to heal.
Looks like it won't update its neighbors. Any chance you got a Zniffer trace of that? I've raised https://github.com/zwave-js/node-zwave-js/issues/4192 for this, maybe we can at least update routes in that case.
Node 85 (a ZEN32) just died. Here's the updated log.
Oh no!
2022-02-04T02:32:44.834Z DRIVER « [REQ] [SendDataBridge]
callback id: 98
transmit status: Fail, took 0 ms
routing attempts: 0
protocol & route speed: Unknown (0x00)
This looks like the 700 series issue all over, but this time it actually looks like it is correct. When this happens (12x in the last log), one or two of your nodes (40, 44, 48, 68, ...) are flooding the network. Although that could just be a symptom - those Zniffer logs would come in really handy this time.
Anyways, I've raised https://github.com/zwave-js/node-zwave-js/issues/4191 to NOT treat this as a dead node, but rather a jammed controller.
Thanks. I've restarted my zniffer and adjusted my power settings so hopefully it will be running when that occurs again.
For your reference, those nodes are Inovelli LZW31, LZW30-SN, and Zooz ZSE18 devices. 68 is a Zooz ZSE18 - the flooding may have been when I was waking it up to heal it (it's kind of a pain in the neck to wake up and I have to try numerous times - lots of pushing its single button).
I re-read your question again and realize I totally do have the zniffer log for that network heal where node 55 refused to heal. I started it at the beginning of the heal and stopped at the end. Here it is.
That should line up with the data in zwavejs_2022-02-04 starting at 00:13:15 (I'm UTC-8 and the zniffer system is running on local time).
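Since the driver log is timestamped in UTC and the zniffer machine runs on local time (UTC-8, per the note above), a small helper makes lining the two up less error-prone. The offset here is just the one stated above; adjust if the zniffer host observes DST:

```python
from datetime import datetime, timedelta, timezone

# UTC-8, per the note above; change if the zniffer host's offset differs.
ZNIFFER_TZ = timezone(timedelta(hours=-8))

def to_zniffer_local(utc_str: str) -> str:
    """Convert a driver-log UTC timestamp to the zniffer machine's local time."""
    t = datetime.strptime(utc_str, "%Y-%m-%dT%H:%M:%S.%fZ")
    t = t.replace(tzinfo=timezone.utc)
    return t.astimezone(ZNIFFER_TZ).strftime("%Y-%m-%d %H:%M:%S")

print(to_zniffer_local("2022-02-04T00:13:15.000Z"))  # -> 2022-02-03 16:13:15
```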
I notice there are quite a few CRC_ERRORs in the zniffer log when the controller tries to get 55's neighbors:
here's 49 for comparison:
The CRC_ERRORs appear frequently, so I'm not sure if that's just the Zniffer picking up some weird signal. It's just that it seems to happen every time the controller tries to directly communicate with 55, so there must be something to it...
I have a couple of outdoor ZSE41 and ZSE43 sensors that I suspect might be causing those errors. They were paired to my 500 stick after I migrated back from the 700 and I haven't reset and reincluded them to the 700 stick since upgrading its firmware and putting it back in service.
So they have the right home ID and everything, the stick just doesn't know what to do with them yet. One is physically beyond node 55 and I see now how that might be a problem. I'll reset those today and see if it reduces the CRC errors.
Got those ZSE devices sorted out and back in line, but that didn't seem to stem the CRC errors.
It looks like the RSSI values for the CRC error signals are significantly stronger than average. Does that indicate that the source device is quite close to the zniffer?
I went a little crazy running around, killing power to anything that looked like it might be involved in those CRC errors (i.e., watching for when they popped up and then turning off any nodes with traffic within a few milliseconds of the error). But the errors don't stop - I suspect they may be an artifact of the Zniffer itself.
It looks like the RSSI values for the CRC error signals are significantly stronger than average
I think it's just zeroed out as "no value".
I suspect they may be an artifact of the Zniffer itself.
Might be, can you try a different placement? Maybe on an extension cord?
This issue has not seen any recent activity and was marked as "stale 💤". Closing for housekeeping purposes... 🧹
Feel free to reopen if the issue persists.
Is your problem within Home Assistant (Core or Z-Wave JS Integration)?
NO, my problem is NOT within Home Assistant or the ZWave JS integration
Is your problem within ZWaveJS2MQTT?
NO, my problem is NOT within ZWaveJS2MQTT
Checklist
[X] I have checked the troubleshooting section and my problem is not described there.
[X] I have read the changelog and my problem was not mentioned there.
Describe the bug
I upgraded my Aeotec Z-Stick 7 yesterday. This morning, I noticed one of my automations did not trigger a light switch as it should have (a ZEN32). The logs indicate that a LZW31 device had a sudden burst of traffic and then the ZEN32 device was marked dead.
The death of node 85 (ZEN32) with 10 minutes for context: dead_node_85.log
The full activity of node 42 (LZW31) in that log file: node_42.log
Per the recommendation of @blhoward2, I ran a network heal. That was mostly successful with 68/69 nodes healing successfully. One did not heal - that was a different LZW31 (node 55): node_55.log
Another LZW31 was marked dead. node_31.log
I have 9 LZW31 devices (out of 69 total Zwave nodes). All are running firmware 1.57.
Device information
Manufacturer: Inovelli Model name: LZW31 Node ID in your network: 31, 42, 55
How are you using `node-zwave-js`?

- `zwavejs2mqtt` Docker image (latest)
- `zwavejs2mqtt` Docker image (dev)
- `zwavejs2mqtt` Docker manually built (please specify branches)
- `ioBroker.zwave2` adapter (please specify version)
- HomeAssistant `zwave_js` integration (please specify version)
- `pkg`
- `node-red-contrib-zwave-js` (please specify version, double click node to find out)

Which branches or versions?

- `node-zwave-js` branch: master (8.11.1)
- `zwavejs2mqtt` branch: master (6.4.1)

Did you change anything?
no
If yes, what did you change?
No response
Did this work before?
Yes (please describe)
If yes, where did it work?
No problems on the Aeotec Z-Stick 5+ for the last couple of months.
Attach Driver Logfile
zwavejs_current.log