dresden-elektronik / deconz-rest-plugin

deCONZ REST-API plugin to control ZigBee devices
BSD 3-Clause "New" or "Revised" License

2.05.51 hangs on start #1051

Closed jurriaan closed 5 years ago

jurriaan commented 5 years ago

It keeps hanging in a 'DB cleanup' state while using 100% CPU.

manup commented 5 years ago

Related https://github.com/dresden-elektronik/deconz-rest-plugin/issues/1031#issuecomment-447921594

@jurriaan Can you please send the zll.db file to mpi@dresden-elektronik.de so I can check what's wrong here?

jurriaan commented 5 years ago

@manup done!

manup commented 5 years ago

Thanks, I've taken 2.05.51 offline for now until the issue is fixed.

https://github.com/dresden-elektronik/deconz-rest-plugin/issues/1050

manup commented 5 years ago

I checked the zll.db and it looks fine. I found the issue; I'll do some more checks and then upload 2.05.52 as a bugfix release later on.

jurriaan commented 5 years ago

Thanks a lot :)

jurriaan commented 5 years ago

@manup I just downgraded to 2.05.50, and it looks like my configuration is broken now. Is that to be expected? The names of all devices are missing and Phoscon is showing the commissioning screen.

manup commented 5 years ago

Absolutely not :/ In the file you've just sent me, everything looks normal, including the device names and the auth apikeys.

Does it work when you close deCONZ and overwrite zll.db with the one from the mail, and open deCONZ again?

manup commented 5 years ago

Also check that deCONZ is really closed; I needed to kill it the hard way:

killall -9 deCONZ

jurriaan commented 5 years ago

@manup Yeah, had to do that as well. Not sure what went wrong, but after restoring the zll.db from the mail it works again.
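
The restore described above (stop deCONZ, overwrite zll.db with a known-good copy, start deCONZ again) can be sketched in Python as follows. This is only an illustration: `restore_database` and the `.broken` suffix are hypothetical names, the location of zll.db is not stated in this thread and has to be passed in, and the function does not check that deCONZ is stopped.

```python
import shutil
from pathlib import Path


def restore_database(known_good: Path, live_db: Path) -> Path:
    """Overwrite live_db with known_good, keeping the old file as a '.broken' copy.

    Assumption: deCONZ has been stopped before this is called.
    Returns the path where the previous live database was saved.
    """
    if not known_good.is_file():
        raise FileNotFoundError(known_good)

    # Hypothetical naming scheme: zll.db -> zll.db.broken
    saved = live_db.with_name(live_db.name + ".broken")
    if live_db.exists():
        # Keep the suspect file so it can still be inspected later.
        shutil.copy2(live_db, saved)

    shutil.copy2(known_good, live_db)
    return saved
```

Keeping the replaced file around costs little and makes it possible to compare the two databases afterwards.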

ebaauw commented 5 years ago

Seeing the same (on my test system). deCONZ hangs on startup, consuming 100% CPU. The GUI isn't even started. It only shows the first two messages in the log:

$ sudo systemctl status deconz-gui
● deconz-gui.service - deCONZ: ZigBee gateway -- GUI/REST API
   Loaded: loaded (/lib/systemd/system/deconz-gui.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/deconz-gui.service.d
           └─override.conf
   Active: active (running) since Mon 2018-12-17 19:13:16 CET; 7min ago
 Main PID: 27336 (deCONZ)
   CGroup: /system.slice/deconz-gui.service
           ├─27336 /usr/bin/deCONZ --http-port=80 --dbg-info=1 --dbg-aps=2 --dbg-error=1
           ├─27346 dbus-launch --autolaunch 36338e1dca924dbd9a2413fc1346afd5 --binary-syntax --close-stderr
           └─27347 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session

Dec 17 19:13:16 pi1 systemd[1]: Started deCONZ: ZigBee gateway -- GUI/REST API.
Dec 17 19:13:17 pi1 deCONZ[27336]: libEGL warning: DRI2: failed to authenticate
Dec 17 19:13:17 pi1 deCONZ[27336]: libpng warning: iCCP: known incorrect sRGB profile

Reverted back to v2.05.50, but deCONZ still hangs, even after kill -9 and resetting the RaspBee. I'm not sure if I ran the self-compiled REST API plugin with the database cleanup on this system. Database seems normal on quick inspection using sqlitebrowser.

manup commented 5 years ago

> Seeing the same (on my test system). deCONZ hangs on startup, consuming 100% CPU. The GUI isn't even started. It only shows the first two messages in the log:

I've messed with a NULL pointer... the pointer won :(

Are all deCONZ instances closed (and everything holding a lock on the database)?

ps ax | grep deCONZ
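
Besides checking the process list, one way to probe whether something still holds a lock on the database is to try to open an exclusive transaction with a short timeout. The sketch below uses Python's standard `sqlite3` module; `database_is_locked` is a hypothetical helper, and it only detects locks that SQLite itself reports (in WAL journal mode it would notice another writer, but not readers). Note that `sqlite3.connect` creates the file if it does not exist, so pass the path of an existing database.

```python
import sqlite3


def database_is_locked(path: str, timeout: float = 0.1) -> bool:
    """Return True if another connection blocks an exclusive transaction on path."""
    # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves.
    conn = sqlite3.connect(path, timeout=timeout, isolation_level=None)
    try:
        conn.execute("BEGIN EXCLUSIVE")
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc):
            return True
        raise  # some other problem, e.g. an unreadable file
    else:
        # We got the lock, so nobody else was holding one; release it again.
        conn.execute("ROLLBACK")
        return False
    finally:
        conn.close()
```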

2.05.52 is online now to mitigate the issues

https://github.com/dresden-elektronik/deconz-rest-plugin/releases/tag/V2_05_52

ebaauw commented 5 years ago

Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end. -- The Ten Commandments for C Programmers.

> 2.05.52 is online now

That version starts normally. Thanks, Manuel!

ebaauw commented 5 years ago

I actually made a backup before updating, but my production system starts OK using v2.05.52.

There was a 20-second delay during startup. Upon inspection, the device_descriptor table again contained duplicate rows for the faulty lumi.ctrl_ln2.aq1 switch. I purged the duplicates, and now there's no more delay during startup.
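
A manual purge of duplicate rows like the one described above could look roughly like the sketch below. It keeps the row with the lowest rowid for each key and deletes the rest. The column names (`device_id`, `endpoint`, `type`, `data`) are assumptions made for illustration; the real schema of the device_descriptor table is not shown in this thread, so inspect it first and work on a copy of zll.db while deCONZ is stopped.

```python
import sqlite3


def purge_duplicates(conn, table, key_columns):
    """Delete all but the lowest-rowid row per distinct key; return rows removed.

    table and key_columns are interpolated into the SQL, so they must come
    from trusted input (identifiers cannot be bound as parameters).
    """
    cols = ", ".join(key_columns)
    cur = conn.execute(
        f"DELETE FROM {table} WHERE rowid NOT IN "
        f"(SELECT MIN(rowid) FROM {table} GROUP BY {cols})"
    )
    conn.commit()
    return cur.rowcount


# Demo on an in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device_descriptor "
    "(device_id INTEGER, endpoint INTEGER, type INTEGER, data BLOB)"
)
conn.executemany(
    "INSERT INTO device_descriptor VALUES (?, ?, ?, ?)",
    [(1, 1, 4, b"a"), (1, 1, 4, b"a"), (1, 1, 4, b"a"), (2, 1, 4, b"b")],
)
removed = purge_duplicates(conn, "device_descriptor", ["device_id", "endpoint", "type"])
remaining = conn.execute("SELECT COUNT(*) FROM device_descriptor").fetchone()[0]
print(removed, remaining)  # prints "2 2"
```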

manup commented 5 years ago

I've disabled the automatic purge in 2.05.52. I'm not sure yet whether it causes heavy side effects (although I don't know how it would). Since only a few installations seem to have the issue, it might be safer to fix them manually.

Wim had a really bad start with 2.05.51 https://github.com/dresden-elektronik/deconz-rest-plugin/issues/1050

manup commented 5 years ago

> I purged the duplicates, and now there's no more delay during startup.

The code to prevent new duplicates is active, so in theory the database should stay clean now.
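
How the plugin prevents new duplicates is not shown in this thread. As a generic illustration only, one common SQLite approach is a unique index on the identifying columns combined with `INSERT OR REPLACE`, so that writing the same key again overwrites the existing row instead of adding another. The column names and the helpers `add_unique_guard` and `store_descriptor` are hypothetical. Creating the unique index fails if duplicates are still present, which is why a purge would have to come first.

```python
import sqlite3


def add_unique_guard(conn):
    """Add a unique index over the (assumed) identifying columns."""
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS idx_descriptor_key "
        "ON device_descriptor (device_id, endpoint, type)"
    )
    conn.commit()


def store_descriptor(conn, device_id, endpoint, type_, data):
    """Insert a descriptor, replacing any existing row with the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO device_descriptor "
        "(device_id, endpoint, type, data) VALUES (?, ?, ?, ?)",
        (device_id, endpoint, type_, data),
    )
    conn.commit()
```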

jurriaan commented 5 years ago

@manup 2.05.52 works fine, thanks for the quick fix!

erikproper commented 3 years ago

I'm currently on 2.09.03-raspbian-buster-stable and am not seeing the behaviour described above.