Closed spacehamster87 closed 2 weeks ago
Hi, We've been talking about the clustercockpit db internally, and wonder what the roadmap here is. Adding columns for new fields doesn't seem to scale, acc_util, job_energy, etc. A future nice to have feature would be, to allow custom user meta data such as application timers for instance. This won't work with the current static approach.
We were wondering if moving from sqlite to postgres would be a thing. As far as I can tell the json and jsonb column types look very well suited for the cc use case. You can index these columns and have a lot more flexibility with changing schemas.
We already have a json column with metadata. I internally tested to make the footprint metric columns a json column, means to put the stats metrics into a json. Then it can be flexible to add and remove metrics and you can have different set of metrics for eg different clusters.
sqlite has builtin json functions you can use in a query and in initial tests putting the metrics into a json column came with almost zero overhead. I tested this on our current production DB. You even can put an index on a key in the json document. I already started to implement this. It will also solve the issues with hardcoded metrics.
We currently support sqlite and MySQL/MariaDB. I think it should be possible to support Postgres as a third option. The infrastructure to support multiple databases already exists.
I still think that an in process DB is always faster than an external DB. I also did tests with DuckDB, the new kid on the block. While Statistic queries where slightly faster with DuckDB, as soon as you have an index sqlite is twice as fast. At least in my tests. Also the count() queries, something we frequently need are blazingly fast with sqlite, with DuckDB the take as long as a standard query. sqlite seems to do something special for the count() queries.
Collected here for documentation
job.project
(and maybe other columns)--SCAN TABLE job -> Run Time: real 4.769 user 1.152004 sys 3.609234 --SEARCH TABLE job USING INDEX job_by_project (project=?) -> Run Time: real 0.019 user 0.011652 sys 0.007310