ubnt-michals closed this issue 3 years ago
@ubnt-michals, is it possible to reproduce this error? Did you check if the volume has enough free space?
@joente
> is it possible to reproduce this error?
Currently no; I am still looking for a repro. But it happens regularly. Usually, the server gets stuck after 2-3 hours.
> Did you check if the volume has enough free space?
Free disk space, CPU, and memory look OK.
EDIT:
I take back the memory. It looks like a memory leak.
CONTAINER ID   NAME          CPU %   MEM USAGE / LIMIT     MEM %   NET I/O          BLOCK I/O        PIDS
25a49fe19a9b   unms-siridb   1.56%   1.041GiB / 17.63GiB   5.90%   6.8GB / 2.28GB   1.08GB / 168GB   12
The server takes over 1GB of RAM.
@ubnt-michals, do you know approximately how many databases, series, and data points you had in SiriDB at the moment it was using 1 GB of RAM? Do you also know what the select queries look like? Are you using tags, groups, regular expressions, or fixed names to select your series?
@joente Thanks. Please disregard the issue. It looks like the database is just very slow: 20+ s for a query. We feed it about 1.5-2k points per second from 40+ connections. It looks to me that since it's running essentially single-threaded, there must be a long backlog, and with slow disk I/O, the queries are just slow. That might also explain the gradual RAM increase (the backlog building up).
I'll try to tweak the way we feed the database. Maybe using fewer connections and sending larger chunks, or simply sending less data, will help.
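For illustration only, the "fewer connections, larger chunks" idea could be sketched as a small client-side buffer that collects points per series and hands them to a single sender in bulk. Nothing here comes from the thread: `PointBatcher`, `max_points`, `max_age`, and the `send` callable are hypothetical names, and `send` stands in for whatever insert call the client actually uses.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, float]        # (timestamp, value)
Batch = Dict[str, List[Point]]   # series name -> buffered points


class PointBatcher:
    """Buffer points per series and flush them in larger chunks.

    Hypothetical helper: `send` is an assumed callable that performs
    the real insert over one shared connection.
    """

    def __init__(self, send: Callable[[Batch], None],
                 max_points: int = 5000, max_age: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send
        self._max_points = max_points   # flush once this many points are buffered
        self._max_age = max_age         # ...or once the oldest point is this old (s)
        self._clock = clock
        self._buffer: Batch = defaultdict(list)
        self._count = 0
        self._first_added = 0.0

    def add(self, series: str, timestamp: int, value: float) -> None:
        if self._count == 0:
            # Remember when the current batch started filling.
            self._first_added = self._clock()
        self._buffer[series].append((timestamp, value))
        self._count += 1
        if (self._count >= self._max_points
                or self._clock() - self._first_added >= self._max_age):
            self.flush()

    def flush(self) -> None:
        if self._count == 0:
            return  # nothing buffered, nothing to send
        batch = dict(self._buffer)
        self._buffer = defaultdict(list)
        self._count = 0
        self._send(batch)
```

The age check only runs when a new point arrives, so a quiet producer would also need a periodic `flush()` call (for example from a timer) to avoid holding points indefinitely. Whether this helps depends on where the time is actually spent on the server side.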
@ubnt-michals, did you try setting the SIRIDB_BUFFER_SYNC_INTERVAL environment variable to something like 500? It might help make the database faster.
Some of our customers recently reported the SiriDB server as being stuck. I've managed to get hold of one of those servers and took a core dump.
It looks like the server is unable to accept any connection and from the backtrace, it looks like it's stuck somewhere in libuv.
The situation is made more confusing by the fact that the health check at GET /status is working fine and the container is not restarted.
Docker image: https://hub.docker.com/layers/ubnt/unms-siridb/1.3.3/images/sha256-1f194131d97ae00595fccc0e212f0ef46a5b18a41ca530bc2c004b40879cf96f?context=explore
Core dump: https://drive.google.com/file/d/1L-TfgJe4rQImBz6nG4tHvkoc7ILDi4R3/view?usp=sharing
OS: Linux unms 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
Docker: 20.10.0, build 7287ab3