Optimize row_key_index to avoid causing Cassandra large partitions

zalando-zmon / kairosdb

Fast scalable time series database

Apache License 2.0


Optimize row_key_index to avoid causing Cassandra large partitions #68

Open mohabusama opened 5 years ago

mohabusama commented 5 years ago

The current implementation of row_key_index could cause large partitions in Cassandra. Possible solutions:

SOLUTION I: Remove the dependency on row_key_index

Can KairosDB work without a row_key_index?

SOLUTION II: Time-bucket row_key_index

Rotate row_key_index with data_points
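
A rough sketch of what time-bucketing the index key could look like, in Python for brevity. Everything here is an assumption for illustration: the three-week bucket width, the function names, and the (metric, bucket) key layout are hypothetical and not taken from the KairosDB schema.

```python
# Hypothetical bucket width: three weeks in milliseconds (an assumption).
BUCKET_WIDTH_MS = 3 * 7 * 24 * 60 * 60 * 1000


def bucket_start(timestamp_ms, width_ms=BUCKET_WIDTH_MS):
    """Round a timestamp down to the start of its time bucket."""
    return timestamp_ms - (timestamp_ms % width_ms)


def index_partition_key(metric, timestamp_ms, width_ms=BUCKET_WIDTH_MS):
    """Partition the index by (metric, bucket) instead of by metric alone,
    so one metric's index entries are spread over many partitions."""
    return (metric, bucket_start(timestamp_ms, width_ms))


def buckets_for_range(start_ms, end_ms, width_ms=BUCKET_WIDTH_MS):
    """List every bucket a query over [start_ms, end_ms] has to read."""
    first = bucket_start(start_ms, width_ms)
    last = bucket_start(end_ms, width_ms)
    return list(range(first, last + width_ms, width_ms))
```

The trade-off in this sketch is that a query spanning a long time range reads one index partition per bucket instead of a single partition, in exchange for each partition being bounded by the bucket width.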

alexkorotkikh commented 5 years ago

As an option:

SOLUTION III: Change the data model to store metric names as part of the KairosDB metric name instead of as tags, which is how they are stored now.

Currently: zmon.check.2018--my_instance-cpu_avg

Should be: zmon.check.2018.cpu_avg--my_instance

Then row_key_index would be partitioned not by check_id (zmon.check.2018) but by the actual metric name (zmon.check.2018.cpu_avg in this case), which could reduce partition sizes drastically.
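
To illustrate the effect, here is a small Python sketch. It assumes the check's metric name is currently carried in a tag called `key` and the instance in a tag called `entity`; both tag names, the helper `move_key_into_metric`, and the sample series are hypothetical.

```python
from collections import Counter


def move_key_into_metric(metric, tags, key_tag="key"):
    """Fold the tag holding the metric name into the metric itself.

    `key_tag` is an assumed tag name; series without it are left unchanged.
    """
    remaining = dict(tags)
    key = remaining.pop(key_tag, None)
    if key is None:
        return metric, remaining
    return f"{metric}.{key}", remaining


# Made-up sample series, written as (metric, tags) pairs.
series = [
    ("zmon.check.2018", {"entity": "my_instance", "key": "cpu_avg"}),
    ("zmon.check.2018", {"entity": "my_instance", "key": "mem_avg"}),
    ("zmon.check.2018", {"entity": "other_instance", "key": "cpu_avg"}),
]

# Index entries per partition when partitioning by check_id only.
before = Counter(metric for metric, _ in series)

# Index entries per partition when the metric name is part of the key.
after = Counter(move_key_into_metric(m, t)[0] for m, t in series)
```

In this toy data set, `before` puts all three series into the single zmon.check.2018 partition, while `after` splits them across zmon.check.2018.cpu_avg and zmon.check.2018.mem_avg.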