Closed tslivnik closed 3 years ago
Is there anything in the openHAB log when this happens? On TRACE level you should see the request openHAB sends on the command and also immediately afterwards the response from the REST API plug-in. This could give some insight into what's going on.
Edit: Which version of the openHAB binding are you using?
Restarting deconz makes no difference. Rebooting the system appears to cure the problem
You mean restarting deconz changes nothing, but restarting deconz + openHAB does something? Have you checked in Phoscon whether the device is visible there? Have you tried simply unplugging and replugging the USB device?
Restarting /usr/bin/deCONZ does not cure the problem, but rebooting the virtual machine seems to. I don't think restarting openHAB will make any difference, since controls also don't work if I access the Phoscon gateway via http and try to switch the smart plug on and off from there. I have not tried unplugging and replugging the USB device. I'd like to understand what the problem is and solve it. Rebooting is not a solution, and unplugging hardware is even worse, since the system needs to continue working and controlling heating etc. when no one has physical access to the hardware (e.g. when travelling).
There are no error messages or anything unusual at the relevant time in /var/log/openhab2/openhab.log .
I cannot find anything unusual in /var/log/openhab2/events.log*, but those logs are very large and I'm not sure what I'm looking for.
/var/log/openhab2/audit.log is empty.
I have deconz binding version 2.5.9 in OpenHAB.
Thanks.
No, it's just that I was not sure what "restarting" meant for you. In some situations a reboot restarts the ConBee and in others it does not (it depends on USB power, which is why I asked about unplugging and replugging). But if you just restart the virtual machine, you are right: that has nothing to do with the hardware.
So you say, as long as you can control the actuator from openHAB, you can also control it from the Phoscon Software and as soon as the control from openHAB stops working, you can't control from Phoscon either?
And then after rebooting the deconz container, openHAB reconnects and it starts working again? In that case I doubt that it has anything to do with openHAB.
So, the truth is, I only ever use openHAB to control the actuators. When this fails, I log into Phoscon, and then at that point, invariably I find that I also cannot control it from Phoscon. After I reboot the VM, it all seems to work again - openHAB can control the actuators, and if I try at that point, I can also do so from Phoscon. But then I do not use Phoscon any more, so I cannot claim that whenever I can control the actuators from openHAB, I can also do so from Phoscon - although I believe that most likely this is true.
Also, I should add, I have only had this problem occur twice. The first time, I was trying to debug the issue and rebooted the VM (actually, crashed it) accidentally, and found that everything was working fine after reboot. The second time is now. I don't want to reboot the VM while I am in this error state so I can try to identify what the problem is. I do not know how to reproduce the problem on purpose: it just happens after several days of running. I have tried restarting deconz, and the problem persists. I have not tried rebooting a second time. It is theoretically possible that the first time I was just lucky and reboot cured the issue, though if I had to bet, I'd say the reboot will cure the issue again, but then I won't be in a position to try to debug the issue while the problem is persisting. I'm happy to run tests or collect logs while the system is in this state if that will help us track down the problem.
I would be grateful for any guidance as to what information to extract from the logs and provide to you, or what tests to run. I have not rebooted the system since it started exhibiting this behaviour, which means my home automation has been down for the past 7 days. Tomorrow, I will reboot, at which point I will probably not be able to run any tests any more as the system will likely work normally again (until the problem reappears), although I will preserve the logs. Thanks. Happy New Year.
I rebooted my system, which cleared the problem. For now, the solution I have adopted is to run /sbin/shutdown -r now via the crontab twice a week. We'll see if this solves the problem permanently. It's not a good solution though. There clearly remains a bug in deconz.
As there has not been any response in 28 days, this issue will be closed. @ OP: If this issue is solved post what fixed it for you. If it is not solved, request to get this opened again.
The issue may be stale but it is not solved. I am not sure what the process is to request to keep it open. Thanks.
I'll Re-Open for now.
@J-N-K @Smanar Can you provide some guidance?
@Mimiix As I said before: if both (openHAB and Phoscon) either work or don’t work, I doubt the issue is in openHAB or Phoscon but in the common part.
Probably the REST plugin is not able to control the ZigBee network at that time and therefore both fail. Another indicator for that is, that rebooting the deconz container brings both back online. I have next-to-no knowledge how the rest-API-Plugin works internally, so I can't help much here.
Maybe it's not a deconz issue at all but the virtual machine fails to provide USB communication. I have seen a onewire USB adapter (DS9490R) failing in a VirtualBox VM from time to time and never figured out why. It was still listed as connected but unresponsive. It was fixed by a reboot. I guess that an unresponsive ConBee stick would be similar and could create the described behavior. But this is just guessing.
@J-N-K This gives me some pointers! Let's check on a zigbee level.
@tslivnik I'd like to have some logs on what's going on.
To enable logging, open deCONZ and click Help. After that, click Debug View. The following debug levels need to be enabled for proper logging: INFO, ERROR, ERROR_L2, APS, APS_L2.
As there has not been any response in 21 days, this issue has been automatically marked as stale. At OP: Please either close this issue or keep it active It will be closed in 7 days if no further activity occurs.
As there has not been any response in 28 days, this issue will be closed. @ OP: If this issue is solved post what fixed it for you. If it is not solved, request to get this opened again.
The issue has not been solved. The issue seems to occur randomly after about a week of running time. When I enable all the logging flags, the logs grow extremely rapidly and would fill the virtual disk in my virtual machine before the problem would be triggered. (I ran /usr/bin/deCONZ -platform minimal --http-port-80 --dbg-info=2 --dbg-error=2 --dbg-aps=2 --dbg-ota=1 --dbg-prot=2 --dbg-wire=1 --dbg-zdp=1 --dbg-zcl=1 --dbg-zcldb=1 --dbg-http=1 --dbg-tlink=1 >~/deconz-log 2>&1 &) I work around the issue by rebooting my VM every night at 4am. This mostly avoids the issue, but from time to time, the actuators still stop working. I haven't checked why, I just restarted the VM every time it happened. Basically I've got so many other priorities I have stopped looking at using deCONZ as a part of my home automation solution because of this problem. I'd be happy to run the system with logs enabled, but I need to limit the amount of logging options as logging everything produces logs which are too large. Which options in the above command line can I take out? Thanks.
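One possible way to keep verbose logging running for a week without filling the disk is to pipe the deCONZ output through a small size-capped writer instead of redirecting it into a single file. The following is only a sketch under stated assumptions: the function names and the 50 MB / 4-backup limits are hypothetical and have nothing to do with deCONZ itself. It relies on Python's standard RotatingFileHandler, which discards the oldest data once the cap is reached.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_capped_logger(path, max_bytes=50 * 1024 * 1024, backups=4):
    """Return a logger that writes raw lines to `path`, rotating by size.

    Disk use stays below roughly max_bytes * (backups + 1).
    The default limits are arbitrary illustration values.
    """
    logger = logging.getLogger("capped:" + path)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # do not echo the lines to the root logger
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def capture(stream, logger):
    """Copy every line of `stream` into the rotating log; return the line count."""
    count = 0
    for line in stream:
        logger.info(line.rstrip("\n"))
        count += 1
    return count
```

A script that ends with `capture(sys.stdin, make_capped_logger("deconz-log"))` could then take the place of the `>~/deconz-log` redirection by putting it after a pipe (`2>&1 | python3 that_script.py`). When the problem finally appears, the most recent output is still on disk and the older files have already been dropped.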
I've only asked for 3 parameters..
Besides, if you feel deconz is at fault but don't want to help fix it, I can't do anything.
Which three? dbg-info? dbg-error? dbg-aps? dbg-ota? dbg-prot? dbg-wire? dbg-zdp? dbg-zcl? dbg-zcldb? dbg-http? dbg-tlink? You haven't specified what you want anywhere. I have asked which ones and you answered not with information but with a snide comment. Deconz is certainly at fault because everything else is working in my home automation VM. If all you know to do is accuse people who report bugs of "not wanting to help", I am happy to stay where I am, which is having concluded that DeConz is not fit for purpose and that it is not properly supported. I reported the bug 4 months ago and I've only been ignored.
Can you try this on a real machine, with the same database, without logging? (You can connect it to openHAB too.) When the actuators stop working, have you tried to use them from Phoscon? Is no error message returned when you try to use them while they are blocked?
Sorry, I don't have any suitable available real machine. I haven't tried this for a while (I've given up on using deconz/ConBee II, and switched everything to Z-Wave for now and plan to buy a different Zigbee USB stick when I have the time to research the market, which isn't now), but looking at my own comments from earlier on, it does seem that Phoscon also is unable to control the actuators when this happens. I don't remember any error message, I believe (if I recall correctly) the actuators, from within OpenHAB, accept the command and move to the new state, but the physical actuators don't do anything, and the state then flips back to the original state. But this may be wrong - I can take out the 4am reboot and wait for the problem to re-appear (probably after about 1 week) and try it again. But since it takes a while to trigger the issue, I might as well set up the logging before I do that as well - as long as I know what logging options to use and the logs don't fill up my disk over the space of a week.
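The symptom described above (the command is accepted, nothing happens physically, and the state flips back) could be checked without openHAB or Phoscon by sending the command to the deCONZ REST API directly and reading the state back a few seconds later. This is a minimal sketch, assuming the Hue-style endpoints `PUT /api/<apikey>/lights/<id>/state` and `GET /api/<apikey>/lights/<id>`; the function names, the `settle_seconds` delay, and the shape of the returned dictionary are hypothetical.

```python
import json
import time
import urllib.request


def http_json(method, url, body=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode())


def check_actuator(base_url, api_key, light_id, want_on,
                   request=http_json, settle_seconds=5.0):
    """Switch a light or plug, then report whether the new state was kept.

    `request` is injectable so the logic can be exercised without a gateway.
    """
    resource = f"{base_url}/api/{api_key}/lights/{light_id}"
    reply = request("PUT", resource + "/state", {"on": want_on})
    # Assumption: the gateway answers with a list of {"success": ...}
    # or {"error": ...} objects.
    accepted = any("success" in item for item in reply)
    time.sleep(settle_seconds)  # give the state time to flip back, if it will
    state = request("GET", resource).get("state", {})
    return {
        "accepted": accepted,
        "reachable": state.get("reachable"),
        "held": state.get("on") == want_on,
    }
```

A result with `accepted` true and `held` false would match the behaviour described in this thread, and the raw reply of the PUT request would show whether the gateway reports any error at all.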
If you can use the GUI (using an OS with a desktop, VNC, X forwarding, or something else), you will be able to read the log on demand.
Sorry I just have a headless VM which I access via ssh.
To enable logging, open deCONZ and click Help. After that, click Debug View. The following debug levels need to be enabled for proper logging: INFO, ERROR, ERROR_L2, APS, APS_L2.
Which translates to --dbg-info=2, --dbg-error=2 and --dbg-aps=2.
No hard feelings here.
Ok, thanks, I've just started deconz with those debugging options, will let you know what happens. It may take a while for the problem to materialize.
Would it not be easier to set up a second deCONZ installation on a full OS with a desktop for the duration of the test? Or to use some "magic" like X forwarding to access it on your headless server? That way you can leave your machine running for a week, enable logging only when the problem occurs, and get a better look at the mesh.
I don't think so, I'm not aware of magic which can forward X11 from a server which does not have X11 installed.
Ok, after 6 days, actuator control stopped working again. This time, however, the Zigbee sensors also stopped updating; I explain why below.
I logged into the VM running deconz and the deconz process is no longer running. There is nothing obvious in the deconz log file itself to tell me why the process would have crashed. The last few lines are:
08:38:28:796 Force read attributes for ZHATemperature SensorNode Environmental Sensor-Bar Room
08:38:28:796 don't create binding for attribute reporting of sensor Environmental Sensor-Bar Room
08:38:28:796 Force binding of attribute reporting for node Environmental Sensor-Bar Room
08:38:28:804 Master: read param with arg 0x19
08:38:28:818 Current channel 15
08:38:28:826 CTRL got nwk update id 0
08:38:28:832 Device TTL 3059 s flags: 0x7
08:38:28:837 Outgoing frame counter 22217858 (0x01530482)
08:38:29:173 Poll APS request to 0x00124B001E728842 cluster: 0x0006 dropped, values are fresh enough
However, /var/log/messages is more revealing. Ignore the "homeassistant" hostname: the VM used to run HomeAssistant, so I named the host "homeassistant", however I wiped out the hard drive and installed OpenHAB and HomeAssistant was never installed on this hard disk. I've redacted the ConBee II serial number.
Apr 11 08:38:29 homeassistant kernel: [621451.406074] QNetworkAccessM[27230]: segfault at 0 ip 00007f83f1be9b84 sp 00007f83e75148c8 error 4 in libQt5Network.so.5.11.3[7f83f1b8c000+113000]
Apr 11 08:38:29 homeassistant kernel: [621451.406089] Code: 1c 00 00 00 00 48 8b 7b 10 eb df 0f 1f 00 c7 47 1c 00 00 00 00 c6 47 50 00 c3 5b c3 66 2e 0f 1f 84 00 00 00 00 00 48 8b 7f 78 <48> 8b 07 ff 60 20 66 0f 1f 44 00 00 53 48 89 fb bf 18 00 00 00 e8
Apr 11 09:29:28 homeassistant kernel: [624510.502907] usb 1-3: USB disconnect, device number 4
Apr 11 09:29:29 homeassistant kernel: [624511.297264] usb 1-3: new full-speed USB device number 5 using xhci_hcd
Apr 11 09:29:29 homeassistant kernel: [624511.598648] usb 1-3: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
Apr 11 09:29:29 homeassistant kernel: [624511.598651] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Apr 11 09:29:29 homeassistant kernel: [624511.598651] usb 1-3: Product: ConBee II
Apr 11 09:29:29 homeassistant kernel: [624511.598652] usb 1-3: Manufacturer: dresden elektronik ingenieurtechnik GmbH
Apr 11 09:29:29 homeassistant kernel: [624511.598653] usb 1-3: SerialNumber: DExxxxxxx
Apr 11 09:29:29 homeassistant kernel: [624511.600942] cdc_acm 1-3:1.0: ttyACM1: USB ACM device
Apr 11 09:29:32 homeassistant kernel: [624514.309534] usb 1-3: USB disconnect, device number 5
Apr 11 09:29:33 homeassistant kernel: [624515.117148] usb 1-3: new full-speed USB device number 6 using xhci_hcd
Apr 11 09:29:33 homeassistant kernel: [624515.418425] usb 1-3: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
Apr 11 09:29:33 homeassistant kernel: [624515.418430] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Apr 11 09:29:33 homeassistant kernel: [624515.418432] usb 1-3: Product: ConBee II
Apr 11 09:29:33 homeassistant kernel: [624515.418434] usb 1-3: Manufacturer: dresden elektronik ingenieurtechnik GmbH
Apr 11 09:29:33 homeassistant kernel: [624515.418436] usb 1-3: SerialNumber: DExxxxxxx
Apr 11 09:29:33 homeassistant kernel: [624515.428547] cdc_acm 1-3:1.0: ttyACM1: USB ACM device
Apr 11 10:29:40 homeassistant kernel: [628122.599500] usb 1-3: USB disconnect, device number 6
Apr 11 10:29:41 homeassistant kernel: [628123.440325] usb 1-3: new full-speed USB device number 7 using xhci_hcd
Apr 11 10:29:41 homeassistant kernel: [628123.743031] usb 1-3: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
Apr 11 10:29:41 homeassistant kernel: [628123.743035] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Apr 11 10:29:41 homeassistant kernel: [628123.743036] usb 1-3: Product: ConBee II
Apr 11 10:29:41 homeassistant kernel: [628123.743037] usb 1-3: Manufacturer: dresden elektronik ingenieurtechnik GmbH
Apr 11 10:29:41 homeassistant kernel: [628123.743038] usb 1-3: SerialNumber: DExxxxxxx
Apr 11 10:29:41 homeassistant kernel: [628123.751699] cdc_acm 1-3:1.0: ttyACM1: USB ACM device
Apr 11 10:29:44 homeassistant kernel: [628126.436935] usb 1-3: USB disconnect, device number 7
Apr 11 10:29:45 homeassistant kernel: [628127.300243] usb 1-3: new full-speed USB device number 8 using xhci_hcd
Apr 11 10:29:45 homeassistant kernel: [628127.601273] usb 1-3: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
Apr 11 10:29:45 homeassistant kernel: [628127.601276] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Apr 11 10:29:45 homeassistant kernel: [628127.601277] usb 1-3: Product: ConBee II
Apr 11 10:29:45 homeassistant kernel: [628127.601278] usb 1-3: Manufacturer: dresden elektronik ingenieurtechnik GmbH
Apr 11 10:29:45 homeassistant kernel: [628127.601279] usb 1-3: SerialNumber: DExxxxxxx
Apr 11 10:29:45 homeassistant kernel: [628127.604996] cdc_acm 1-3:1.0: ttyACM1: USB ACM device
Apr 11 11:29:44 homeassistant kernel: [631726.538851] usb 1-3: USB disconnect, device number 8
Apr 11 11:29:45 homeassistant kernel: [631727.348671] usb 1-3: new full-speed USB device number 9 using xhci_hcd
Apr 11 11:29:45 homeassistant kernel: [631727.655544] usb 1-3: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
Apr 11 11:29:45 homeassistant kernel: [631727.655548] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Apr 11 11:29:45 homeassistant kernel: [631727.655550] usb 1-3: Product: ConBee II
Apr 11 11:29:45 homeassistant kernel: [631727.655552] usb 1-3: Manufacturer: dresden elektronik ingenieurtechnik GmbH
Apr 11 11:29:45 homeassistant kernel: [631727.655554] usb 1-3: SerialNumber: DExxxxxxx
Apr 11 11:29:45 homeassistant kernel: [631727.665841] cdc_acm 1-3:1.0: ttyACM1: USB ACM device
Apr 11 11:29:47 homeassistant kernel: [631730.119263] usb 1-3: USB disconnect, device number 9
Apr 11 11:29:48 homeassistant kernel: [631731.004794] usb 1-3: new full-speed USB device number 10 using xhci_hcd
Apr 11 11:29:49 homeassistant kernel: [631731.311879] usb 1-3: New USB device found, idVendor=1cf1, idProduct=0030, bcdDevice= 1.00
Apr 11 11:29:49 homeassistant kernel: [631731.311883] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Apr 11 11:29:49 homeassistant kernel: [631731.311885] usb 1-3: Product: ConBee II
Apr 11 11:29:49 homeassistant kernel: [631731.311887] usb 1-3: Manufacturer: dresden elektronik ingenieurtechnik GmbH
Apr 11 11:29:49 homeassistant kernel: [631731.311889] usb 1-3: SerialNumber: DExxxxxxx
Apr 11 11:29:49 homeassistant kernel: [631731.322612] cdc_acm 1-3:1.0: ttyACM1: USB ACM device
QNetworkAccessM seems to cause a segfault, and deconz stops logging at exactly the same time (presumably that's when the deconz process crashed). 51 minutes later, the ConBee II starts disconnecting and reconnecting, twice within a few seconds, and this then repeats roughly every 60 minutes.
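The timing in the kernel log can be checked mechanically rather than by eye. The sketch below pulls the segfault and USB disconnect events out of syslog-style lines like the ones quoted above and computes the gaps between them. The function names are hypothetical, and because these lines carry no year, the `year` parameter is a placeholder that only matters if the log crosses a year boundary.

```python
import re
from datetime import datetime

# Matches lines such as:
# "Apr 11 09:29:28 homeassistant kernel: [624510.502907] usb 1-3: USB disconnect, ..."
LINE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: \[\s*[\d.]+\] (.*)$"
)


def kernel_events(lines, year=2000):
    """Yield (timestamp, kind) for segfaults and USB disconnects."""
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        stamp, msg = m.groups()
        if "segfault" in msg:
            kind = "segfault"
        elif "USB disconnect" in msg:
            kind = "disconnect"
        else:
            continue
        yield datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"), kind


def gaps_in_seconds(events, kind="disconnect"):
    """Return the seconds between consecutive events of one kind."""
    times = [t for t, k in events if k == kind]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
```

Applied to the lines above, the gap between the segfault (08:38:29) and the first disconnect (09:29:28) is 3059 seconds, which is the same number as the "Device TTL 3059 s" line logged by deconz just before it stopped. That may be a coincidence, but it might be worth mentioning to the maintainers.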
The reason, presumably, that the sensors also stopped working on this occasion, whereas normally they resume working, is that on this occasion, I stopped the deconz service and ran deconz manually. Had the service been running, it would have automatically restarted deconz after it crashed. However, this doesn't tell me why only the sensors start working again after such a restart, but not the actuators. With the ConBee II disconnecting and reconnecting, I guess it's surprising anything at all works, however I don't think these "disconnects" are physical disconnects, but something to do with the segfault which suggests some kind of a driver problem to me.
Any ideas, or anything else I should look for in the log files and provide here?
The problem has reoccurred, but this time, deconz has not crashed, and there is no QNetworkAccessM segfault line in /var/log/messages. This time, sensors are continuing to provide readings, but actuators stopped working again. The log file is now 3GB in size and continuing to update, but I can't find anything obviously wrong in it. As before, a Zigbee smart plug is providing its power consumption reading, but sending it ON or OFF commands has stopped working.
If you have any suggestions or any experiments I can run which might provide useful information, can you please let me know within the next 48 hours? After that I will restart deconz and/or the VM again, and probably get another USB stick. After 5 months, I've made no progress on this problem; the ConBee II / deconz is the weak link in my home automation setup and is stopping me from building an actually usable home automation system, so I think it's time to switch to another Zigbee solution.
Ok, I'm restarting deconz and deleting the log file because it's taken over my disk. It seems that this software is not being maintained, so I'm going to buy and try another Zigbee stick which uses OpenHAB's Zigbee binding rather than Deconz.
Asked @manup to check in :)
As there has not been any response in 21 days, this issue has been automatically marked as stale. At OP: Please either close this issue or keep it active It will be closed in 7 days if no further activity occurs.
Not really sure what I can do to keep the issue alive. It is not resolved, however.
As there has not been any response in 21 days, this issue has been automatically marked as stale. At OP: Please either close this issue or keep it active It will be closed in 7 days if no further activity occurs.
The robot may think it stale, but the bug continues to exist.
Sorry for the long delay, which deCONZ and firmware version are you currently using?
I have the Sonoff relay here and will add it to my test network for observation. Unfortunately I don't have the Samsung SmartThings smart plug.
Firmware 26660700. Phoscon says the firmware is up to date. deCONZ version 2.05.84-debian-buster-stable.
Ok, there have been quite a few improvements in recent beta versions which might come into play here. I'd suggest either installing the latest beta version, v2.12.1-beta, or waiting for the next stable release.
Not sure if the firmware version is too important in this case, there are newer versions with various fixes as shown in the Firmware Changelog. New firmware versions can be installed manually as described in Update deCONZ Manually.
Thanks. When do you think there will be a new stable release? I've currently installed deconz from the deconz apt repository using apt. If the wait is not too long, it'll be easier to wait for the new release rather than mix and match the apt installed packages and hand compiled ones, which might work or it might create a mess, especially if installing beta versions.
I have updated deconz to 2.11.05 and now nothing works at all. To be honest, I've lost interest in deconz, this is just to say that this problem still has not been fixed. Obviously, this project is nowhere near deployment ready.
@tslivnik I think that's a bit short to say. There's over 100k installations running without issues.
Either way, I think I already see the issue here: VirtualBox. Did you try any other system?
It's a deconz issue, not a VirtualBox issue - VirtualBox works fine with my Z-Wave USB stick and all the other USB devices.
I've seen more issues with VirtualBox than with any other platform. Either way, I just have 2 more suggestions:
In addition, reading through the topic, a timeline:
We still haven't received any full deCONZ logs with the flags I asked for: https://github.com/dresden-elektronik/deconz-rest-plugin/issues/3994#issuecomment-813078885
So please provide those, and perhaps we can check there.
I have already updated the Conbee II firmware to the latest version.
Yes the issues are still only with the actuators. But now the actuators do not work at all (before the upgrade, they would work for about a week before breaking).
I've already advised that the full logs are several GB in size. I cannot upload several GB worth of logs.
I asked to be told what to grep for in the logs and I could upload the result of those searches. But I received no response.
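Since nobody in the thread says what to search for, one option is to reduce the multi-GB log to the lines that mention the affected device, plus a little context, and upload only that. The sketch below streams the file line by line, so its size does not matter. The function name and the context sizes are hypothetical, and the choice of search strings (for example the 64-bit address of the smart plug, in the form it appears in "Poll APS request to 0x..." lines) is an assumption about what would be useful, not a maintainer recommendation.

```python
from collections import deque


def extract(lines, needles, before=2, after=2):
    """Yield lines containing any needle, plus a few lines of context.

    Works on any iterable of lines (such as an open file) and only keeps
    `before` lines in memory at a time.
    """
    history = deque(maxlen=before)  # most recent lines not yet emitted
    remaining = 0                   # context lines still owed after a match
    for line in lines:
        if any(n in line for n in needles):
            yield from history
            history.clear()
            yield line
            remaining = after
        elif remaining > 0:
            yield line
            remaining -= 1
        else:
            history.append(line)
```

Used as `for line in extract(open(path, errors="replace"), ["0x00124B001E728842"]): print(line, end="")`, this produces an extract that should be small enough to attach to the issue.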
So they are on version : RemovedURL Sorry: http://deconz.dresden-elektronik.de/deconz-firmware/beta/deCONZ_ConBeeII_0x26700700.bin.GCF
That's the beta I'm talking about :) Are you on that one?
Check the logs for error codes as mentioned on the wiki.
Also, another thing: are those ZB minis from Sonoff?
Describe the bug
From time to time, deconz becomes unable to control actuators, while it continues to read from the sensors. For example, I have a SmartThings smart plug which is both a sensor (voltage, current, power) and actuator (on/off). The sensors continue updating, but control of the actuators on the same device (and all devices, e.g. Sonoff Zigbee relays) stops working. OpenHAB with the API key continues to show regular sensor updates. Restarting deconz makes no difference. Rebooting the system appears to cure the problem.
Steps to reproduce the behavior
1) Install Debian Linux, deconz and OpenHAB.
2) Add the various sensors and actuators to deconz, then to OpenHAB.
3) Run it for a while.
4) Eventually (after several days or maybe 1-2 weeks), actuators stop responding to commands, while sensors continue to report data.
Expected behavior
I would expect to be able to continue to control the actuators.
Screenshots
I have no useful screen shots, the server is running headless.
Environment
Version: 2.05.84 / 9/14/2020 Firmware: 26660700
deCONZ Logs
I ran deconz with all logging flags enabled and obtained an enormous log too big to upload, but happy to provide extracts or re-run it with limited debugging flags enabled and provide the whole such log if you tell me what flags to enable.
Additional context
Zigbee devices on the network: