Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

1.2.1 zigbee shepherd doesn't start #1235

Closed h4nc closed 5 years ago

h4nc commented 5 years ago

With the newest version my zigbee2mqtt doesn't work any more.

I get this in the logs:

zigbee2mqtt:info 3/11/2019, 7:53:43 AM Starting zigbee-shepherd
  zigbee2mqtt:error 3/11/2019, 7:53:50 AM Error while starting zigbee-shepherd!
  zigbee2mqtt:error 3/11/2019, 7:53:50 AM Press the reset button on the stick (the one closest to the USB) and start again
  zigbee2mqtt:error 3/11/2019, 7:53:50 AM Failed to start
    {"message":"request timeout","stack":"Error: request timeout\n    at CcZnp.<anonymous> (/zigbee2mqtt-1.2.1/node_modules/cc-znp/lib/ccznp.js:261:22)\n    at Object.onceWrapper (events.js:273:13)\n    at CcZnp.emit (events.js:182:13)\n    at Timeout.<anonymous> (/zigbee2mqtt-1.2.1/node_modules/cc-znp/lib/ccznp.js:240:18)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)"}
  zigbee2mqtt:error 3/11/2019, 7:53:50 AM Exiting...

I already pressed the reset button several times. The release notes say that you need Node.js 10. I'm using hassio and I don't know which version of Node I have, but I'm running the latest release of HassOS (2.10).
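For anyone comparing how long the start-up takes before the timeout fires, the timestamps in a log excerpt like the one above can be subtracted directly. This is only a sketch: the helper names `parse_log_time` and `seconds_between` are made up for illustration, and it assumes the log lines keep the `zigbee2mqtt:<level> <date>, <time> <AM|PM> <message>` layout shown in this issue.

```python
from datetime import datetime

# Timestamp layout as it appears in the log above, e.g. "3/11/2019, 7:53:43 AM".
LOG_TIME_FORMAT = "%m/%d/%Y, %I:%M:%S %p"


def parse_log_time(line: str) -> datetime:
    """Extract the timestamp from one zigbee2mqtt log line (hypothetical helper)."""
    parts = line.split()
    # parts[0] is "zigbee2mqtt:<level>", parts[1:4] are date, time and AM/PM.
    return datetime.strptime(" ".join(parts[1:4]), LOG_TIME_FORMAT)


def seconds_between(start_line: str, end_line: str) -> float:
    """Elapsed seconds between two log lines (hypothetical helper)."""
    return (parse_log_time(end_line) - parse_log_time(start_line)).total_seconds()


start = "zigbee2mqtt:info 3/11/2019, 7:53:43 AM Starting zigbee-shepherd"
error = "  zigbee2mqtt:error 3/11/2019, 7:53:50 AM Error while starting zigbee-shepherd!"
print(seconds_between(start, error))  # 7.0
```

For the two lines quoted from the log above, the start-up attempt ran for 7 seconds before the error was reported.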

h4nc commented 5 years ago

@Koenkk I don't know what the issue was here (still interested if you have a clue).

Things I did in the meanwhile:

Again, this brought up the issue where none of the devices showed a connection in the Zigbee map. I was able to solve it this way:

I did both of those options, so I can't say whether both are necessary or only one is enough.

EDIT: I want to add that the only device that doesn't show a connection in the map now was the one that I triggered (button on/off) before I did the two steps above. I will have to press the pair button of the device while permit join is true to make it show its connection again.

Koenkk commented 5 years ago

Somehow the stick crashed. The reason for this varies; one could be the use of the reporting feature (this is already being worked on in a separate issue).

h4nc commented 5 years ago

I did not use the reporting feature yet.

So after a crash like this, the only thing you can do is reflash?

It stopped working when I updated to 1.2.1

Koenkk commented 5 years ago

Unfortunately, this indeed seems to be the case sometimes. If you can reproduce the crash we can try to fix it.

h4nc commented 5 years ago

I don't think I can reproduce it, but I will keep my eyes open in case it happens again.

This was the first time it happened to me. It doesn't bother me a lot, but it will bother people who buy ready-to-use sticks online. Those people do not have a CC debugger.

I read somewhere that it is already possible to flash without the debugger.

Is this also feasible for normal users who don't want to use a debugger? Or is using the debugger still the easiest method? What would you recommend to people who have a crash and no debugger?

Maybe we can work out an entry in the docs for those people.

Koenkk commented 5 years ago

It's possible (however, I've never done this). I don't know if this is still possible with a crashed CC2531; we should try it when we have a crashed CC2531.

h4nc commented 5 years ago

Do you know how it's done? Are there already people who use that method, or is it only theoretical right now?

So people with a crashed device should ask for help here, so that we can find out whether a debugger is needed in that case.

Koenkk commented 5 years ago

@kirovilya has done this AFAIK

kirovilya commented 5 years ago

@h4nc @Koenkk you can try to update the firmware without a CC debugger. Read this thread: https://github.com/Koenkk/zigbee2mqtt/issues/320. It needs more checks, though. I tried it a couple of times, but then I reflashed through the CC debugger anyway (not because of errors).

h4nc commented 5 years ago

@kirovilya there is a hex and a bin file here https://github.com/Koenkk/Z-Stack-firmware/tree/master/coordinator/CC2531/bin

So people that don't have a cc debugger will have to use the bin file from there, right? Then they download the software as described in the link and flash it directly onto the stick (which was flashed with a debugger once before).

But we don't know if this will also work with a crashed cc2531.

kirovilya commented 5 years ago

So people that don't have a cc debugger will have to use the bin file from there, right?

yes

But we don't know if this will also work with a crashed cc2531.

I don't know either...

h4nc commented 5 years ago

I guess we have to wait until someone has a crashed stick again and wants or has to (because no debugger) try this.

Another quick question: will it be possible in a future update to update the firmware directly in zigbee2mqtt, like just hitting an "update button" and being done? This would be the most user-friendly option.

Koenkk commented 5 years ago

Theoretically this should be possible if the SBL protocol is implemented in zigbee2mqtt.

h4nc commented 5 years ago

Is it a future goal to be able to do this directly in zigbee2mqtt?

kirovilya commented 5 years ago

One of the problems is getting the chip into bootloader mode. I think this functionality should not be part of Z2M, but should be described separately.

h4nc commented 5 years ago

So you mean something like a second add-on in hassio.

I know hassio is by far not the only way to use zigbee2mqtt, but I think a lot of users who are not able or don't want to flash with a debugger or something else (simple users) use something like hassio.

h4nc commented 5 years ago

@Koenkk there are already two more people with that issue in this thread:

https://github.com/danielwelch/hassio-zigbee2mqtt/issues/118#issuecomment-472931484

Koenkk commented 5 years ago

Not sure, they could also be using the wrong port (same issue as the OP of that thread has).

martinrosenauer commented 5 years ago

@Koenkk, started getting the request timeout at CcZnp today on startup as well all of a sudden.

Tried resetting the CC2531, and it seems to let z2m restart most of the time, but then I see Error: request timeout towards bulbs, for example. After restarting again, the CcZnp timeout shows up.

Tried update.sh to make sure I'm on the latest commit (#661f79c), but that doesn't change anything.

Tried reflashing with CC2531ZNP-Prod_20190223, but that doesn't change anything either.

I have been re-pairing a few Xiaomi devices (a button and a window sensor) today, but apart from that, no changes. During the restarts to troubleshoot, I'm seeing No converter available for 'LED1623G12' with cid 'haDiagnostic', type 'devChange' and data

'{"cid":"haDiagnostic","data":{"averageMacRetryPerApsMessageSent":2,"lastMessageLqi":112,"lastMessageRssi":-72}}'

in the log. Will try to remove and re-pair this specific Trådfri bulb.

kianusch commented 5 years ago

Same problem here - reflashing fixed the problem ...

martinrosenauer commented 5 years ago

@kianusch, how long ago did you reflash?

kianusch commented 5 years ago

@kianusch, how long ago did you reflash?

Initially about two weeks ago? (And then an hour ago to fix the problem).

The "only" new change was that yesterday I soldered an antenna jack connector to the CC2531. After that it worked fine for a few hours, and then suddenly ...

martinrosenauer commented 5 years ago

@kianusch, ok, fingers crossed that it runs stable now then =) I'm still struggling ..

Koenkk commented 5 years ago

@martinrosenauer Please try with the following firmware: https://drive.google.com/open?id=1-xzI6b8umZFpki-pfaKdLgcPrUUlswe5

Relative to the 20190223 firmware, it has an increased memory heap at the cost of the number of devices that can connect directly to the coordinator (15 -> 5).

martinrosenauer commented 5 years ago

@Koenkk , thanks a lot for trying to help out - I will give it a shot.

What I did was to turn off everything I could, leave it for a while and power up the entire thing again. So far it has been running for an hour without timeouts or errors, but I notice a few Xiaomi devices are not working. I'll re-pair those after flashing with the 20190315 firmware you sent.

I also added another CC2531 with the router firmware between the floors of the house. I believe that could also make a difference.

While looking at the Zigbee network map I see the ball-shaped topology, which I assume is perfectly fine? However, some of the routers (Ikea Trådfri bulbs) are marked as offline, but work. Is that behavior normal?

Koenkk commented 5 years ago

@martinrosenauer the online/offline state is not reliable (known issue).

martinrosenauer commented 5 years ago

@Koenkk, ok, now on 20190315, anything in particular I should try ? :)

Koenkk commented 5 years ago

@martinrosenauer there is not really much to try, just wait and check if it stays stable.

martinrosenauer commented 5 years ago

@Koenkk, ok, will keep you posted. Do you have any theory why these single devices stop working and have to be re-paired to get back in the loop?

martinrosenauer commented 5 years ago

@Koenkk, a bit of feedback on 20190315: a few Ikea Trådfri bulbs started responding with error code 233; re-pairing them and resetting the CC2531 seems to have helped. I lost contact with a lot of the Xiaomi window/motion sensors, and I am in the middle of re-pairing those as well.

A question: how should the ideal Zigbee network map look? The round shape with everything connected? (I currently see nodes that are not connected at all.)

kianusch commented 5 years ago

@martinrosenauer ... In my experience, it sometimes helps to:

Set "Permit join" to allow, then manually remove the "Router" device that has the most devices connected to it. This makes most of the other devices look for the best connection. The "removed" device will automatically rejoin the Zigbee network (you will have to set a friendly name again if you are using friendly names).

As for the network map, I don't think that the shape of the graphic has any relevance to the functionality ...

@Koenkk ... do Zigbee devices constantly seek the best possible route, or is the route more or less static as long as it works somehow?

martinrosenauer commented 5 years ago

@kianusch, that could be a useful approach, I'll give it a try.

Regarding the network map, I was not so concerned about the shape, but more curious whether non-connected devices were a problem. I have a mix of powered router devices (bulbs) and battery-driven sensors (Xiaomi window, vibration, motion, etc.), and some of them show as non-connected in the map.

kianusch commented 5 years ago

I believe battery-powered end devices usually go into sleep mode and have to be woken up at the right time for the coordinator to show them as connected in the map.

Koenkk commented 5 years ago

@kianusch it depends on the Zigbee implementation of the device. It is known that e.g. Xiaomi sensors are quite stubborn about searching for new paths.

jesperldk commented 5 years ago

Hi

I am trying to set up a zigbee2mqtt for the first time.

Right from the beginning I got the ccznp timeout. I have tried to reflash with CC2531ZNP-Prod_20190223 several times and I have also tried with CC2531ZNP-Prod_20190315. I have tried with both report true and false.

I do feel certain that I am using the right port, and I have tried both with /dev/ttyACM0 and with the /dev/serial/by-id that indeed points to ttyACM0.
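A quick way to double-check the port claim above is to resolve the by-id symlink and test whether the current user may open the device. The sketch below uses only the Python standard library; `describe_port` is a hypothetical helper, and the paths are the ones mentioned in this comment, so they may differ on other systems.

```python
import os


def describe_port(path: str) -> dict:
    """Report where a serial device path points and whether this user can open it."""
    real = os.path.realpath(path)  # follows symlinks such as /dev/serial/by-id/...
    return {
        "path": path,
        "resolves_to": real,
        "exists": os.path.exists(real),
        # False here usually means a permission problem (e.g. missing group membership).
        "read_write": os.access(real, os.R_OK | os.W_OK),
    }


if __name__ == "__main__":
    by_id_dir = "/dev/serial/by-id"
    candidates = ["/dev/ttyACM0"]
    if os.path.isdir(by_id_dir):
        candidates += [os.path.join(by_id_dir, name) for name in sorted(os.listdir(by_id_dir))]
    for candidate in candidates:
        print(describe_port(candidate))
```

If `exists` is true but `read_write` is false, the stick is enumerated but the user running zigbee2mqtt lacks access to it, which is a different problem from a stick that does not answer.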

It is a bit strange, as sometimes it takes something like 7 seconds before the timeout and sometimes it takes around 12 seconds. I am completely at a loss, and would appreciate any help to debug. For example, what does it mean that the green LED lights up when I plug it in, but turns off by itself after a minute or so? If I press the reset button (closest to the USB), it turns off right away. Is there some simple test to see if I can talk to the stick?

Not really sure I am having the same issue as the others in this ticket, if not I am sorry for the hijack - I do however get the same timeout :-(

Thanks! Jesper

Koenkk commented 5 years ago

@jesperldk

It looks like a communication problem:

jesperldk commented 5 years ago

I am on a Pi Zero W that was set up a few weeks ago and has been running some Bluetooth scripts. The Zero is ARMv6, so I could not use the guide in https://www.zigbee2mqtt.io/getting_started/running_zigbee2mqtt.html to install NodeJS, but successfully installed it using the guide in https://www.thepolyglotdeveloper.com/2018/03/install-nodejs-raspberry-pi-zero-w-nodesource/

I have tried what you suggest with no luck. Is there a simple way to check if I can talk to the stick?
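One low-level check, offered only as a sketch and not as an official zigbee2mqtt procedure: the ZNP firmware speaks TI's Monitor and Test serial protocol, in which every frame is a start byte `0xFE`, a length byte, two command bytes, the payload, and an XOR checksum over everything after the start byte. A SYS_PING request is the command pair `0x21 0x01` with an empty payload. The code below only builds and validates frames; actually writing the bytes to the port would need a serial library (for example pyserial, which is an assumption here), and whether a crashed stick answers at all is unknown. The function names are illustrative.

```python
from functools import reduce

SOF = 0xFE  # start-of-frame byte of the TI MT/ZNP serial protocol


def fcs(data: bytes) -> int:
    """Frame check sequence: XOR of length, CMD0, CMD1 and payload bytes."""
    return reduce(lambda acc, byte: acc ^ byte, data, 0)


def build_frame(cmd0: int, cmd1: int, payload: bytes = b"") -> bytes:
    """Wrap a command and payload into a complete serial frame."""
    body = bytes([len(payload), cmd0, cmd1]) + payload
    return bytes([SOF]) + body + bytes([fcs(body)])


def is_valid_frame(frame: bytes) -> bool:
    """Check start byte, declared length and checksum of a received frame."""
    if len(frame) < 5 or frame[0] != SOF:
        return False
    if len(frame) != frame[1] + 5:  # SOF + LEN + CMD0 + CMD1 + payload + FCS
        return False
    return fcs(frame[1:-1]) == frame[-1]


# SYS_PING: synchronous request (0x20) to the SYS subsystem (0x01), command 0x01.
SYS_PING_REQUEST = build_frame(0x21, 0x01)
print(SYS_PING_REQUEST.hex())  # fe00210120
```

A healthy stick would be expected to reply with a frame whose command bytes are `0x61 0x01` followed by a two-byte capabilities field; `is_valid_frame` can be used to sanity-check whatever comes back.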

Koenkk commented 5 years ago

@jesperldk can you check if it works on a different linux or mac system?

jesperldk commented 5 years ago

Yeah, but not until next week. Thanks for now, I'll be back when I have tried it on Monday or so...

jesperldk commented 5 years ago

@Koenkk sorry, just one question: I have another zigbee setup, TRAADFRI hub with around 45 devices - do I need to do anything to make sure the two nets do not interfere? I have around 15 Aqara devices that I plan to use with zigbee2mqtt. I have chosen another channel than the default, but am not sure if I should change anything else.

kianusch commented 5 years ago

@jesperldk ... you could try using docker and the zigbee2mqtt docker image on your raspberry zero - would make things much easier.

jesperldk commented 5 years ago

@kianusch good suggestion. I have no experience with docker, but am aware that this ought to change ;-) Am fearing a bit that I'll run into other issues with limited ARM6 support, but I will give it a go. Do however doubt it will help with my problem of talking to the stick.

kianusch commented 5 years ago

Don't worry. Simply install the Docker environment on your Zero via apt. I believe the latest version might have a problem on the Raspberry Pi Zero (containers won't start), so you may have to install an older version (check Google for that). After that, the documented way to install the zigbee2mqtt container (or other containers) is straightforward.

Koenkk commented 5 years ago

@jesperldk changing channel is enough.
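As background on why a different channel keeps the two networks apart: the 2.4 GHz Zigbee channels 11 to 26 defined by IEEE 802.15.4 each have their own center frequency, spaced 5 MHz apart. The small sketch below just evaluates that formula; the function name is illustrative.

```python
def zigbee_channel_mhz(channel: int) -> int:
    """Center frequency in MHz of a 2.4 GHz IEEE 802.15.4 (Zigbee) channel."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz Zigbee channels run from 11 to 26")
    return 2405 + 5 * (channel - 11)


for channel in (11, 15, 20, 25):
    print(channel, zigbee_channel_mhz(channel), "MHz")
```

Two coordinators on different channels therefore transmit on different center frequencies, which fits the advice above that changing the channel is enough.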

jesperldk commented 5 years ago

@Koenkk Got a little unexpected time today and tried the stick on an Intel machine running Debian Stretch. Followed the installation guide with no hiccups at all, except I had to add my user to the dialout group to get access to ttyACM0. The result is exactly like on my Pi Zero:

@influx:/opt/zigbee2mqtt$ DEBUG=zigbee-shepherd* npm start
> zigbee2mqtt@1.2.1 start /opt/zigbee2mqtt
> node index.js
  zigbee2mqtt:info 3/22/2019, 5:59:50 PM Logging to directory: '/opt/zigbee2mqtt/data/log/2019-03-22.17-59-50'
  zigbee2mqtt:debug 3/22/2019, 5:59:50 PM Using zigbee-shepherd with settings: '{"net":{"panId":6754,"extPanId":[221,221,221,221,221,221,221,221],"channelList":[11],"precfgkey":"HIDDEN"},"dbPath":"/opt/zigbee2mqtt/data/database.db","sp":{"baudRate":115200,"rtscts":true}}'
  zigbee2mqtt:debug 3/22/2019, 5:59:50 PM Loaded state from file /opt/zigbee2mqtt/data/state.json
  zigbee2mqtt:debug 3/22/2019, 5:59:50 PM Saving state to file /opt/zigbee2mqtt/data/state.json
  zigbee2mqtt:info 3/22/2019, 5:59:50 PM Starting zigbee2mqtt version 1.2.1 (commit #4048cb8)
  zigbee2mqtt:info 3/22/2019, 5:59:50 PM Starting zigbee-shepherd
  zigbee-shepherd:init zigbee-shepherd booting... +0ms
  zigbee-shepherd:request REQ --> SYS:osalNvRead +0ms
  zigbee-shepherd:request RSP <-- SYS:osalNvRead +2s
  zigbee-shepherd:init Coordinator initialize had an error: Error: request timeout
    at CcZnp.<anonymous> (/opt/zigbee2mqtt/node_modules/cc-znp/lib/ccznp.js:261:22)
    at Object.onceWrapper (events.js:277:13)
    at CcZnp.emit (events.js:189:13)
    at Timeout.<anonymous> (/opt/zigbee2mqtt/node_modules/cc-znp/lib/ccznp.js:240:18)
    at ontimeout (timers.js:436:11)
    at tryOnTimeout (timers.js:300:5)
    at listOnTimeout (timers.js:263:5)
    at Timer.processTimers (timers.js:223:10) +0ms
  zigbee2mqtt:info 3/22/2019, 5:59:56 PM Error while starting zigbee-shepherd, attempting to fix... (takes 60 seconds)

I also tried the unplug/replug+press-and-led-turns-off-trick with no luck.

Koenkk commented 5 years ago

@jesperldk the last thing that you could try is reflashing. If this doesn't work, my guess is that your CC2531 is broken.

jesperldk commented 5 years ago

@Koenkk Reflashing didn't help. I then took the CC2530 I had bought as a router, flashed it as coordinator and connected it with an old FTDI 1232 I had lying around. Worked at first try :-D So my CC2531 is considered dead and I have ordered a new one. Thanks a lot for all the help!! 🥇

Michaelnorge commented 5 years ago

Same here.

Using a CC2531 as a coordinator on an ioBroker with a Pi3. After a restart the log shows "Error while starting Error request timeout" and the light on the USB stick is turned off. I need to restart the Raspberry three times to get it working again.

That's strange...

kianusch commented 5 years ago

@Koenkk

Overnight my CC2531 firmware crashed (again). The stick is currently in the crashed state. Before I reflash it, is there anything I can do to help find the problem?

I had 13 bulbs (10 Tradfri, 3 Philips Hue) in my network, 2 end-devices and one CC2531 with router-firmware.

The coordinator was running with the latest "standard" firmware.

(I'm using my second CC2531 with MAX-Stability-FW right now).

Koenkk commented 5 years ago

@kianusch probably because it runs out of memory. The max stability firmware has more; please let me know if it also crashes with that one.