gilestrolab / ethoscope

A platform for monitoring animal behaviour in real time from a Raspberry Pi
http://lab.gilest.ro/ethoscope/
GNU General Public License v3.0

Time since backup disappears #127

Closed pepelisu closed 4 years ago

pepelisu commented 4 years ago

The "Time since backup" value in the list of connected ethoscopes disappears, leaving an empty column.

ethoscope_backup.service seems to run OK; however, ethoscope_node.service shows this error when checking its status:

Aug 21 10:23:08 node python[1624925]: ERROR:root:Traceback (most recent call last):
Aug 21 10:23:08 node python[1624925]:   File "/opt/ethoscope-node/node_src/scripts/server.py", line 42, in func_wrapper
Aug 21 10:23:08 node python[1624925]:     return func(*args, **kwargs)
Aug 21 10:23:08 node python[1624925]:   File "/opt/ethoscope-node/node_src/scripts/server.py", line 142, in devices
Aug 21 10:23:08 node python[1624925]:     return device_scanner.get_all_devices_info()
Aug 21 10:23:08 node python[1624925]:   File "/opt/ethoscope-node/node_src/ethoscope_node/utils/device_scanner.py", line 670, in get_all_devices_info
Aug 21 10:23:08 node python[1624925]:     all_known_ethoscopes = self._edb.getEthoscope ('all', asdict=True)
Aug 21 10:23:08 node python[1624925]:   File "/opt/ethoscope-node/node_src/ethoscope_node/utils/etho_db.py", line 299, in getEthoscope
Aug 21 10:23:08 node python[1624925]:     keys = row[0].keys()
Aug 21 10:23:08 node python[1624925]: TypeError: 'int' object is not subscriptable

Workaround: Restart ethoscope_node.service and ethoscope_backup.service

ggilestro commented 4 years ago

Can you give me more details? branch? frequency?

pepelisu commented 4 years ago

I am using the dev branch ([dev] d48352). As for the frequency, I am still figuring that out. It has not happened again since the last restart of the services, but before that it was happening one to three days after tracking started. I will keep an eye on it, and if it does not happen again I will close the issue.

pepelisu commented 4 years ago

I think this happens when device_scanner is used and there are no ethoscopes online. It just happened now, when all the ethoscopes were powered off, so device_scanner returns an empty object.

The error comes from line 299 in etho_db.py, reached through self._edb.getEthoscope('all', asdict=True): the query just above that line returns 0 or -1 because of an error on the commit. When checking the sqlite database manually, it appears to be locked. On closer inspection, there are 33 concurrent connections locking the database and giving a Permission denied error. The command fuser /etc/ethoscope-node.db gives:

Cannot stat file /proc/1229/fd/0: Permission denied
Cannot stat file /proc/1229/fd/1: Permission denied
Cannot stat file /proc/1229/fd/2: Permission denied
Cannot stat file /proc/1229/fd/3: Permission denied
Cannot stat file /proc/1229/fd/4: Permission denied
Cannot stat file /proc/1229/fd/5: Permission denied
Cannot stat file /proc/1229/fd/6: Permission denied
Cannot stat file /proc/1229/fd/7: Permission denied
Cannot stat file /proc/1229/fd/8: Permission denied
Cannot stat file /proc/1229/fd/9: Permission denied
Cannot stat file /proc/1229/fd/10: Permission denied
Cannot stat file /proc/1229/fd/11: Permission denied
Cannot stat file /proc/1229/fd/12: Permission denied
Cannot stat file /proc/1229/fd/13: Permission denied
Cannot stat file /proc/1229/fd/14: Permission denied
Cannot stat file /proc/1229/fd/15: Permission denied
Cannot stat file /proc/1229/fd/16: Permission denied
Cannot stat file /proc/1229/fd/18: Permission denied
Cannot stat file /proc/1229/fd/19: Permission denied
Cannot stat file /proc/1229/fd/20: Permission denied
Cannot stat file /proc/1229/fd/21: Permission denied
Cannot stat file /proc/1229/fd/22: Permission denied
Cannot stat file /proc/1229/fd/23: Permission denied
Cannot stat file /proc/1229/fd/24: Permission denied
Cannot stat file /proc/1229/fd/25: Permission denied
Cannot stat file /proc/1229/fd/26: Permission denied
Cannot stat file /proc/1229/fd/27: Permission denied
Cannot stat file /proc/1229/fd/28: Permission denied
Cannot stat file /proc/1229/fd/29: Permission denied
Cannot stat file /proc/1229/fd/30: Permission denied
Cannot stat file /proc/1229/fd/31: Permission denied
Cannot stat file /proc/1229/fd/32: Permission denied
Cannot stat file /proc/1229/fd/33: Permission denied

Process 1229 is /usr/bin/gnome-shell. Killing it just spawns a new process, and the same thing happens again.

Changing the file permissions to 666 does not solve the problem. Stopping the ethoscope_node and ethoscope_backup services and killing the process does not unlock the database (still too many connections?). Restarting the node seems to be the only way to stop the process from re-spawning.
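For reference, the lock itself can also be probed from Python rather than through fuser. The following is only a minimal sketch using the standard sqlite3 module. The helper name is_write_locked and the demo database are hypothetical and are not part of the ethoscope code. Note that the probe briefly requests the write lock itself, so it should be used with care on a live node.

```python
import os
import sqlite3
import tempfile


def is_write_locked(db_path, timeout=0.2):
    """Return True if another connection currently holds the write lock.

    Hypothetical helper: it tries to take the write lock with a short
    timeout and releases it again immediately.
    """
    # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves.
    conn = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # request the write lock right away
        conn.execute("ROLLBACK")
        return False
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc):
            return True
        raise
    finally:
        conn.close()


def demo():
    """Show the probe before, during and after a competing write lock."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE t (x INTEGER)")
    before = is_write_locked(path)
    holder.execute("BEGIN IMMEDIATE")  # hold the write lock
    during = is_write_locked(path)
    holder.execute("ROLLBACK")
    after = is_write_locked(path)
    holder.close()
    return before, during, after
```

On the node, the same probe could be pointed at /etc/ethoscope-node.db, assuming the calling user can read and write that file.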

ggilestro commented 4 years ago

Sorry, I am confused. Are we talking about two different issues? The code in etho_db.py is the following:

https://github.com/gilestrolab/ethoscope/blob/0fa40ea15f409ed3e6b207c79b97eceee79b78e0/node_src/ethoscope_node/utils/etho_db.py#L280-L304

I don't understand why there are so many closed connections, though. It's not the case here:

[gg@node ~]$ sudo fuser /etc/ethoscope-node.db
/etc/ethoscope-node.db: 53238 103807

pepelisu commented 4 years ago

I am not sure whether the issues are related. What I know is that the backup time disappears and the only error I found logged in the system was that problem with sqlite. I would suggest adding some error handling at line 299: when self.executeSQL(sql_get_ethoscope) on line 293 returns -1 and asdict == True, row[0] will fail. It would be necessary to check whether row == -1 and log the error.
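The kind of guard suggested here could look roughly like the sketch below. It is not the code from etho_db.py: the function name rows_as_dict, the ethoscope_id key and the demo table are assumptions made for illustration. The only behaviour taken from the thread is that the query may return an integer error code (0 or -1) instead of a list of rows.

```python
import logging
import sqlite3


def rows_as_dict(rows, key="ethoscope_id"):
    """Convert sqlite3.Row results to a dict, tolerating error codes.

    Hypothetical helper: `rows` is either a list of sqlite3.Row objects
    or an integer error code such as 0 or -1.
    """
    # An int or an empty result has no rows[0], so bail out early.
    if isinstance(rows, int) or not rows:
        logging.error("ethoscope query returned no rows: %r", rows)
        return {}
    columns = rows[0].keys()
    return {row[key]: {c: row[c] for c in columns} for row in rows}


def demo():
    """Run the helper against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows then expose .keys()
    conn.execute(
        "CREATE TABLE ethoscopes (ethoscope_id TEXT, ethoscope_name TEXT)"
    )
    conn.execute("INSERT INTO ethoscopes VALUES ('001', 'ETHOSCOPE_001')")
    rows = conn.execute("SELECT * FROM ethoscopes").fetchall()
    result = rows_as_dict(rows)
    conn.close()
    return result
```

Returning an empty dict keeps the device list rendering when the query fails, while the logged error still records that something went wrong with the database.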

I could not find the origin of so many connections but I am looking into it.

ggilestro commented 4 years ago

OK, I've changed it to this:

https://github.com/gilestrolab/ethoscope/commit/7eb99d719e40b101ed84786649e432a9176ad090

Now we need to understand why you get permission denied (and I never do). Could this be due to some setting in your mariadb? For now I'll close this, but feel free to reopen if you manage to reproduce it.