Open brightsparc opened 2 years ago
Another option could be ClickHouse, an open source, fast OLAP store. It supports a number of different log engines, which could be a good fit for metrics. It also supports an embedded RocksDB engine: https://clickhouse.com/docs/en/engines/table-engines/integrations/embedded-rocksdb/
See this post for more on engine types: https://www.alibabacloud.com/blog/selecting-a-clickhouse-table-engine_597726
@brightsparc thanks for the comment. We are currently focused on the centralized tracking server implementation, which will allow a more flexible setup on the server side. ClickHouse is a good choice for large volumes of data, and it has powerful features for data aggregation and sampling. Deleting data is somewhat problematic, but those issues are solvable.
Given Aim's requirements for the database, time series databases are worth considering.
Once we start working on this, we'll make sure to redesign Aim so that the storage backend can be changed without breaking the SDK and the UIs.
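To make the idea of a swappable storage backend concrete, here is a minimal sketch of what such a boundary could look like. It is purely illustrative: `MetricStore`, `InMemoryStore`, and `track` are hypothetical names and are not part of Aim's actual API. The point is only that SDK-side code depends on the abstract interface, so a RocksDB, ClickHouse, or time series implementation could be substituted behind it.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Iterable, Iterator, Tuple


class MetricStore(ABC):
    """Hypothetical storage interface (an assumption, not Aim's real API)."""

    @abstractmethod
    def append(self, run_id: str, metric: str, step: int, value: float) -> None:
        """Persist one metric point for a run."""

    @abstractmethod
    def read(self, run_id: str, metric: str) -> Iterator[Tuple[int, float]]:
        """Yield (step, value) pairs ordered by step."""


class InMemoryStore(MetricStore):
    """Toy backend used only to show that implementations are interchangeable."""

    def __init__(self) -> None:
        self._series = defaultdict(list)

    def append(self, run_id: str, metric: str, step: int, value: float) -> None:
        self._series[(run_id, metric)].append((step, value))

    def read(self, run_id: str, metric: str) -> Iterator[Tuple[int, float]]:
        # .get avoids creating an empty entry for unknown series.
        return iter(sorted(self._series.get((run_id, metric), [])))


def track(store: MetricStore, run_id: str, metric: str, values: Iterable[float]) -> None:
    """SDK-side helper: it only knows about the MetricStore interface."""
    for step, value in enumerate(values):
        store.append(run_id, metric, step, value)
```

With this shape, a contributor would add a new backend by subclassing `MetricStore`, and code such as `track` would not need to change.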
Are there any updates on this goal? It would be very useful to have one implementation of a storage backend so people can start contributing support for different time series databases and comparing their performance.
I think that adding support for multiple backends would help adoption enormously. I learned about this project just a few days ago and was very impressed by its UI, but without this kind of interoperability, integrating it into the existing infrastructure of a mature project seems like it would require too much effort.
Feature
This feature request is to support a native time series database that offers the same rich query interface but persists data in an efficient format that can also be rolled up and, optionally, offloaded or archived over time.
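As an illustration of what "rolled up" could mean here, the sketch below downsamples raw metric points into fixed-size step buckets and keeps summary statistics per bucket. The `roll_up` function, its parameters, and the output field names are assumptions made for this example; they do not describe Aim or any particular time series database.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def roll_up(points: Iterable[Tuple[int, float]], bucket_size: int) -> List[Dict[str, float]]:
    """Aggregate (step, value) points into buckets of `bucket_size` steps."""
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")

    buckets = defaultdict(list)
    for step, value in points:
        # Integer division assigns each step to its bucket index.
        buckets[step // bucket_size].append(value)

    return [
        {
            "start_step": index * bucket_size,
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            "count": len(values),
        }
        for index, values in sorted(buckets.items())
    ]
```

Older data could then be stored at a coarser `bucket_size` while recent data keeps full resolution, which is the usual trade-off behind rollups.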
Motivation
Aim currently writes metrics to RocksDB, which makes scaling out a tracking API challenging. It is also unlikely to support a multi-tenant environment, which could have thousands or millions of different experiments.
Pitch
Explore open source time series options that have demonstrated the ability to capture and report on metrics at scale, for example:
Alternatives
Alternatively, you could explore using a SaaS solution such as InfluxDB or Aiven.
At the other end of the spectrum, you could roll your own custom distributed TSDB on top of something like Apache BookKeeper, which provides an efficient write-ahead log and native offloading to cloud object stores.
Additional context
This feature would enable the solution to scale to many thousands or millions of different experiments, which would differentiate it from MLflow.