Open tube0013 opened 1 year ago
@onedr0p please use this new issue.
Can you confirm z2m is starting? The most recent log snippet would appear to be from a started z2m instance.
Yes, z2m is started but none of my devices are responding to actions in z2m
Need more information then.
Z2M starting rules out a connection issue with the coordinator.
Let me know what info you need.
Device: tubeszb-cc2652-poe Z Firmware: CC1352P2_CC2652P_launchpad_coordinator_20221226 ESP Firmware: https://github.com/tube0013/tube_gateways/blob/main/models/retired/tubeszb-cc2652-poe/firmware/esphome/tubeszb-cc2652-poe.bin z2m version: 1.30.1
Please run an energy scan to check for interference.
https://github.com/zigpy/zigpy-cli
The previous way to do this has sold docs on setting up on any platform.
https://github.com/zigpy/zigpy-znp/blob/dev/TOOLS.md
To run the scan room z2m, and it will be a command line like this
''' zigpy radio znp /dev/cu.usbserial-1420 energy-scan '''
I have my zigbee network on ZIGBEE channel 11 and wifi on WIFI channel 11, which according to this website is on the opposite sides of the spectrum.
Which makes since according this output.
zigpy radio znp socket://192.168.1.90:6638 energy-scan
Channel energy (mean of 5 / 5):
------------------------------------------------
+ Lower energy is better
+ Active Zigbee networks on a channel may still cause congestion
+ TX on 26 in North America may be with lower power due to regulations
+ Zigbee channels 15, 20, 25 fall between WiFi channels 1, 6, 11
+ Some Zigbee devices only join networks on channels 15, 20, and 25
+ Current channel is enclosed in [square brackets]
------------------------------------------------
- [11 ] 5.65% #####
- 12 14.35% ##############
- 13 19.84% ###################
- 14 17.49% #################
- 15 43.61% ###########################################
- 16 14.20% ##############
- 17 14.67% ##############
- 18 22.67% ######################
- 19 8.55% ########
- 20 19.53% ###################
- 21 32.94% ################################
- 22 57.57% #########################################################
- 23 51.84% ###################################################
- 24 30.90% ##############################
- 25 0.00%
- 26* 46.12% ##############################################
It is worth noting that nothing else changed besides me updating the firmware on the device to try and avoid the problem of the tubezb controller locking up (loosing network) as reported in the other issue.
✖ /opt/homebrew/bin/python3.11 ./cc2538-bsl.py -p socket://192.168.1.90:6638 -evw ../CC1352P2_CC2652P_launchpad_coordinator_20221226.hex
Opening port socket://192.168.1.90:6638, baud 500000
Reading data from ../CC1352P2_CC2652P_launchpad_coordinator_20221226.hex
Your firmware looks like an Intel Hex file
Connecting to target...
CC1350 PG2.0 (7x7mm): 352KB Flash, 20KB SRAM, CCFG.BL_CONFIG at 0x00057FD8
Primary IEEE Address: 00:12:4B:00:21:B4:97:5C
Performing mass erase
Erasing all main bank flash sectors
Erase done
Writing 360448 bytes starting at address 0x00000000
Write 104 bytes at 0x00057F980
Write done
Verifying by comparing CRC32 calculations.
Verified (match: 0xa9dc145d)
fair enough thanks.
Did you have a network back up before you did the fw update? (it shold be auto create by z2m) if it wasn't there or valid, when z2m restarted after the fw update it would create a whole new network, which could explain why nothing is working.
Hm, not sure I am following. I have backups for at least 14 days, are you saying I should try and restore one?
Firmware updates erase the network settings off the coordinator.
Z2M after a successful start creates a backup within the config folder. if this was not there when z2m did start following the update it is possible it will have over wrote the previous network settings.
14 days of back ups I'm assuming you mean full HA backups. you could look in a recent one specifically for the *backup.json and compare the current setting so the ones in the backup.
you can get the current settings by running a back up with the zigpy-cli
It's odd that the behavior of devices appearing online and updating the last seen? But nothing gets updated on the devices state like open/close etc..
No.
something else must be going on. But Z2M would not show those devices as online if it wasn't able to talk to them. Is Z2M staying connected to the coordinator? ie not doing a lot of restarts?
There have been no restarts on z2m since 50 minutes ago when I started it.
Are you able to pair a device?
I have not tried and I will have to wait until tomorrow to attempt it.
In the meantime I tried downgrading to CC1352P2_CC2652P_launchpad_coordinator_20220219
and the behavior is still the same.
I am sorry to say but these issues appear to be all related
https://github.com/tube0013/tube_gateways/issues/128 https://github.com/tube0013/tube_gateways/issues/133
you can try rolling back the fw, last esphome release binary is still on github.
I don't believe your issue is related to #128 at unless z2m starts disconnecting from the coordinator.
Which bin on this page should I use to revert back (there's 2)?
I don't believe your issue is related to https://github.com/tube0013/tube_gateways/issues/128 at unless z2m starts disconnecting from the coordinator.
I was having this issue before I touched updating any firmware...
https://github.com/tube0013/tube_gateways/issues/128#issuecomment-1423307398
tube-zb-gw-cc2652p2-poe_olimex_round_latest.bin
My experience has been better with the newest versions, but this is worth a shot.
I'm happy to troubleshoot, I have a 1st gen poe device in a test network. and it's working great there.
Happy to continue to troubleshoot. Also fell fee to reach out to me on email store @ tubeszb.com for replacement options, in cases like these I do offer good discounts.
Frankly I am not sure what is happening anymore, I downgraded both firmwares, restored a backup and nothing seems to work :/
I'd try to pair a device. if z2m is starting, it's able to talk to the coordinator. it won't start otherwise.
I tried to update the firmware one more time and z2m won't even start...
Using '/data' as data directory
Zigbee2MQTT:debug 2023-02-28 09:26:54: Loaded state from file /data/state.json
Zigbee2MQTT:info 2023-02-28 09:26:54: Logging to console only'
Zigbee2MQTT:info 2023-02-28 09:26:54: Starting Zigbee2MQTT version 1.30.1 (commit #eb878d3)
Zigbee2MQTT:info 2023-02-28 09:26:54: Starting zigbee-herdsman (0.14.89)
Zigbee2MQTT:debug 2023-02-28 09:26:54: Using zigbee-herdsman with settings: '{"adapter":{"concurrent":null,"delay":null,"disableLED":false},"backupPath":"/data/coordinator_backup.json","databaseBackupPath":"/data/database.db.backup","databasePath":"/data/database.db","network":{"channelList":[11],"extendedPanID":[221,221,221,221,221,221,221,221],"networkKey":"HIDDEN","panID":62840},"serialPort":{"path":"tcp://192.168.1.90:6638"}}'
Zigbee2MQTT:debug 2023-02-28 09:29:22: Saving state to file /data/state.json
Zigbee2MQTT:error 2023-02-28 09:29:22: Not connected to MQTT server!
Zigbee2MQTT:error 2023-02-28 09:29:22: Cannot send message: topic: 'zigbee2mqtt/bridge/state', payload: '{"state":"offline"}
Zigbee2MQTT:info 2023-02-28 09:29:22: Disconnecting from MQTT server
TypeError: Cannot read properties of undefined (reading 'end')
at MQTT.disconnect (/app/lib/mqtt.ts:129:21)
at Controller.stop (/app/lib/controller.ts:198:9)
at stop (/app/index.js:111:5)
at process.handleQuit (/app/index.js:117:9)
can you try and run an eneryg-scan again, but just let it run for a while.
this opens the connection to the coordinator and starts the network and is in constant reading/writing to the coordinator telling it to scan and reading the results. see if it runs for a while or if it disconnects/crashes.
❯ zigpy radio znp socket://192.168.1.90:6638 energy-scan
2023-02-28 10:06:04.552 Devins-MacBook-Pro.local asyncio WARNING socket.send() raised exception.
2023-02-28 10:06:05.553 Devins-MacBook-Pro.local asyncio WARNING socket.send() raised exception.
2023-02-28 10:06:06.555 Devins-MacBook-Pro.local asyncio WARNING socket.send() raised exception.
2023-02-28 10:06:07.556 Devins-MacBook-Pro.local asyncio WARNING socket.send() raised exception.
2023-02-28 10:06:08.558 Devins-MacBook-Pro.local asyncio WARNING socket.send() raised exception.
2023-02-28 10:06:09.093 Devins-MacBook-Pro.local zigpy.application ERROR Couldn't start application
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_znp/api.py", line 665, in _skip_bootloader
result = await responses.get()
^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/queues.py", line 158, in get
await getter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/zigpy/application.py", line 145, in startup
await self.connect()
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_znp/zigbee/application.py", line 107, in connect
await znp.connect()
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_znp/api.py", line 707, in connect
self.capabilities = (await self._skip_bootloader()).Capabilities
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_znp/api.py", line 664, in _skip_bootloader
async with async_timeout.timeout(CONNECT_PROBE_TIMEOUT):
File "/opt/homebrew/lib/python3.11/site-packages/async_timeout/__init__.py", line 129, in __aexit__
self._do_exit(exc_type)
File "/opt/homebrew/lib/python3.11/site-packages/async_timeout/__init__.py", line 212, in _do_exit
raise asyncio.TimeoutError
TimeoutError
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_znp/api.py", line 665, in _skip_bootloader
result = await responses.get()
^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/queues.py", line 158, in get
await getter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/bin/zigpy", line 8, in <module>
sys.exit(cli())
^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/click/decorators.py", line 38, in new_func
return f(get_current_context().obj, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_cli/cli.py", line 20, in inner
return loop.run_until_complete(cmd(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_cli/radio.py", line 178, in energy_scan
await app.startup()
File "/opt/homebrew/lib/python3.11/site-packages/zigpy/application.py", line 145, in startup
await self.connect()
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_znp/zigbee/application.py", line 107, in connect
await znp.connect()
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_znp/api.py", line 707, in connect
self.capabilities = (await self._skip_bootloader()).Capabilities
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/zigpy_znp/api.py", line 664, in _skip_bootloader
async with async_timeout.timeout(CONNECT_PROBE_TIMEOUT):
File "/opt/homebrew/lib/python3.11/site-packages/async_timeout/__init__.py", line 129, in __aexit__
self._do_exit(exc_type)
File "/opt/homebrew/lib/python3.11/site-packages/async_timeout/__init__.py", line 212, in _do_exit
raise asyncio.TimeoutError
TimeoutError
this is with the rolled back ESPhome fw?
is the coordinator accessible over the web when this happens?
This is with the latest esphome FW, and yes I can access it via the web.
I shut down z2m and restarted the controller and now I can run the zigpy energy-scan command.
oh yeah that would explain that, only one thing can connect at a time.
as a test I'd let that run as long as you can reasonably let it. I've run them straight for multiple days to test things out but I'm just trying to get a baseline for if there are connection issues, or if this is more of a zigbee/z2m issue.
I thought zigpy radio znp socket://192.168.1.90:6638 energy-scan
is just reporting stats on channels or is it actively doing something else?
so it's running a scan over and over again. the 1/5 - 5/5 are an average of the last 5 scans. so it's a telling the coordinator to scan and also receiving the results of the scan. and it does this over and over again.
So tests the connection with the coordinator and also the results test the RF of the radio, this is why I run this as a QC test so I can spot issues with the 2 main functions of the device. https://twitter.com/TubesZb/status/1619195503358668801?s=20
I had to stop it, but it ran for nearly 3 hours
that to me shows that the device components are working as they should and this may be more of a z2m issue. I am not ruling out possible disconnections with longer tests or z2m connections, but the 3 hour test is pretty good. With questionable hardware fw, I've had them stop after only a few minutes.
Sticking with the zigpy-cli tools, you can run a remote energy scan on any router on your mesh by specifying --nwk 0x1234 at the end of the command where 0x1234 is the nwk address of the router. this would test communication to a device on your mesh network.
I am pretty sure at this point I am going to have to nuke my zigbee network and re-pair my devices, it's not too much effort as I have ~20 devices but annoying all the same. It is a shame because your device lasted great for a year.
I've moved to another zigbee controller, you can close this as there isn't anything left to debug. The 2021 poe device must be faulty or something.
@tube0013 I ran a similar setup to onedr0p with the same coordinator (he recommended it and I got it).
It looks like mine started failing this week. I tried to update ESPHome and the firmware, it runs for a little bit but then locks up completely.
None of the actions are functioning, you can still push an OTA Update and it comes back to life for a bit.
Well I did update both esphome firmware and the controller firmware and now all my devices are not working. I have tried restarting the PoE controller and zigbee2mqtt a few times :/
_Originally posted by @onedr0p in https://github.com/tube0013/tube_gateways/issues/128#issuecomment-1447378067_