dresden-elektronik / deconz-rest-plugin

deCONZ REST-API plugin to control ZigBee devices
BSD 3-Clause "New" or "Revised" License
1.88k stars 485 forks source link

Database corruption - 10000s of duplicate sensors in the sensors table #3972

Closed Rick-Jongbloed closed 3 years ago

Rick-Jongbloed commented 3 years ago

Describe the bug

Deconz was very slow to load, it took +- 15 minutes to load on a Raspberry Pi 4B. Also the backup option generated corrupt tar.gz files.

It turned out the zll.db file was 501MB. An integrity check turned out a lot of rows were out of order and there were 100000s of lines in the sensors table.

I've fixed the database by dumping the database to a .sql file and then importing it into a new database. (Found here: https://www.domoticz.com/forum/viewtopic.php?t=10984):

On importing i got a lot of 'UNIQUE constraint failed: sensors.sid' errors and the database shrunk from 500MB to 232KB.

I'm creating this issue because it could be a problem on how the sensor data is written to the database. It seems the configured unique constraint is not honored and duplicate sensor data is written to the database.

I've backed up the faulty zll.db and the transition dump.sql file, so if you would like to analyse them to improve the product, i'll be happy to provide the files.

Otherwise just let me know if i need to close this ticket because its a known (and fixed) issue.

Steps to reproduce the behavior

Add sensors to the database and Deconz will write duplicate data to the sensors table.

Expected behavior

No duplicate sensor records added to the database

Screenshots

Directory listing of files -rw-r--r-- 1 root root 479M Dec 19 19:44 zll.db <- the 'corrupt' db -rw-r--r-- 1 root root 266M Dec 19 20:25 dump.sql <- dump -rw-r--r-- 1 root root 232K Dec 19 20:28 repaired.db <- the repaired db

Environment

deCONZ Logs

None atm. I tried the debug options, but i couldn't really find out where the DB transactions are logged.

Additional context

Most of the sensors are from Xiaomi. I've seen other tickets where people report that old sensors are not deleted.

Rick-Jongbloed commented 3 years ago

If the issue returns, i'll reopen this ticket

bjornatic commented 1 year ago

How did you bypassed the "UNIQUE constraint failed" on importing the .sql dump ? Because I'm having a similar problem (huge zll.dfe file, and growing), and I tried to export / import the database like you suggested, but the import fails and stops because of "UNIQUE constraint failed" error.

I'm using "DB Browser for SQLite".

Febus123 commented 4 months ago

I am facing the same issue, now for the second time. zll.db was > 500 MB and system terribly slow. After updating all components, I just renamed the db and imported a backup to phoscon. This worked fine so far. DB was down to 270 kB. This was one hour ago. Now it is already at 852 kB. Any ideas what can cause this growth?

Smanar commented 4 months ago

Can you share or take a look in logs ? You can see it in deconz/help/debug view, I think "info"/"info l2"/"error"/"error l2" will be enought.

Febus123 commented 4 months ago

Thx for looking into this @Smanar! Pls find a log and copy of zll.db here: https://1drv.ms/f/s!AoF9yFpggmA3jMcPjwsEAocKBuznbw

DB is roughly growing by 1 MB per hour.

Smanar commented 4 months ago

Ok so I realy don't have idea what happen. What is your deconz version ?

You have something that create new rules and fill the database with it. Not specific to a device. And IDK how to generate this kind of rules, I know it's possible using phoscon, but I think you don't have it running all the time ?

spam

Febus123 commented 4 months ago

I am using deCONZ v2.26.1-beta but had same issue with previous versions and no, phoscon is not always open. I do have the phoscon app installed on my iPhone and I access the deCONZ devices through FHEM and homebridge. Would it make sense to deactivate those one after the other to see if one of those causes the creation of rules? Would be weird anyhow...

Smanar commented 4 months ago

Honnestly I don't see where it can be from, I have asked to others devs if they have idea. IDK if it's from deconz, I don't think FHEM use rules features. The issue happen at 03/01/2024 then the 15/03/2024 at

I think the "owner" in the capture is the API key used to make the rules, you can check wich one is used on your third app.

Edit: I have found it, it's phoscon

spam

Your browser was running at this moment ?

ChrisHae commented 4 months ago

This rules are created from the Phoscon App on the "Sensor Control" and the "Automations" page. Did you used any of those pages? And have you created any automation with it?

Of course it should not create thousands of rules in a second. And I'm out of clue what happend here.

Febus123 commented 4 months ago

I am not using automation, but I do have 2 sensor controls enabled: thumbnail_image

thumbnail_image

ChrisHae commented 4 months ago

Our Support received your backup file. I noticed there are duplicate uniqueids in sensors (sid) and rules (rid) table. There are duplicate switch editor rules. And there are some weird rules with a sensor uniqueid as rid. When I search for devices in the sensors table with that id, I don't find any. uniqueid_als_rule_id_and_duplicates

My assumption is that this error was caused by some special device (switch) that you used with the switch editor. Perhaps identifying this device and/or delete some switch editor entries can stop the database from growth.

And you could test to delete the 2 sensor controls. If this stops the DB growth it can help us to identify the problem.

Never the less we are currently investigating the problem further and check for errors in our software.

manup commented 4 months ago

We had a look in the database file, and it's a really strange case. It's not just rules but also sensors and nodes tables. The code for these (legacy) tables hasn't been changed in a long while, and since you mentioned that this also happened with older deCONZ versions I can only guess that there is something wrong in the file.

The newer tables don't seem to show duplicates.

Anyway I've exported the database to a .sql script and recreated the database .db from that, which got rid of the duplicates.

With deCONZ closed the following brings back a hopefully working state.

  1. Create a .sql dump file

    sqlite3 zll.db
    .output ./dump.sql
    .dump
    .exit
  2. rename zll.db to back it up

mv zll.db zll.db.bak
  1. Create a fresh zll.db from the dump without duplicates
    sqlite3 zll.db
    .read ./dump.sql
    .exit

(this does print the constraint warnings for the duplicated rows..)

Hope this helps and doesn't grow again, otherwise we need to take a different approach and import on empty database where the schema is already there.

Febus123 commented 4 months ago

Thx Manuel, appreciated ! Will give it a try asap and let you know.

Febus123 commented 4 months ago

Issue seems indeed resolved with dump / new creation of db. Many thanks, great support!