monasca-persister database

zhangjianweibj commented 5 years ago

Hello, I see that monasca-persister now supports Cassandra as a storage backend. Can Monasca use this feature? At present, the open-source version of InfluxDB has no HA or clustering features, so it is not suitable for a production environment.

witekest commented 5 years ago

You can achieve HA with InfluxDB using InfluxDB-relay. Another, even better option is to deploy several InfluxDB instances and configure the persisters in different Kafka consumer groups. Measurements will then be duplicated for every consumer group. You can also achieve HA at the filesystem level (e.g. using GlusterFS). For horizontal scaling you would have to do sharding.
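As a rough sketch of the consumer-group option, the snippet below generates one persister configuration per InfluxDB instance, each with its own Kafka consumer group. The section and option names (`kafka_metrics`, `group_id`, `influxdb`, `ip_address`, and so on) are assumptions modeled on a typical persister.conf, and the host names and Kafka URI are hypothetical, so check them against the configuration reference of your monasca-persister version.

```python
import configparser
import io


def persister_config(group_id, influx_host, kafka_uri="kafka:9092"):
    """Render an INI-style persister config as a string.

    Section and option names are assumptions for illustration only.
    """
    cfg = configparser.ConfigParser()
    cfg["kafka_metrics"] = {
        "uri": kafka_uri,       # hypothetical Kafka address
        "group_id": group_id,   # must differ for each persister/InfluxDB pair
        "topic": "metrics",
    }
    cfg["influxdb"] = {
        "ip_address": influx_host,
        "port": "8086",
        "database_name": "mon",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# One persister per InfluxDB instance, each in its own consumer group,
# so every instance receives a full copy of the measurements.
influx_hosts = ("influxdb-0", "influxdb-1", "influxdb-2")  # hypothetical names
configs = {
    host: persister_config(group_id=f"persister_{host}", influx_host=host)
    for host in influx_hosts
}

if __name__ == "__main__":
    for host, text in configs.items():
        print(f"# persister for {host}\n{text}")
```

Persisters that share a group ID would split the topic's partitions between them instead of each receiving every message, which is why each pair needs a distinct group.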

Cassandra support is fully functional. At SUSE we optionally offer both InfluxDB and Cassandra. The InfluxDB implementation is simpler and better suited for an open-source project, so it has wider community adoption.

matrixik commented 5 years ago

Take into account that I have never used Monasca with Cassandra myself. The persister tests run on every change, but I don't know whether anyone is actively using the Python version of the persister with Cassandra. I believe SUSE is using Cassandra, but with the old Java version of the persister. You could probably expect lower performance than with InfluxDB. That is probably not noticeable with a small amount of data, but if I remember correctly some people complained that Cassandra slows down even more with a large amount of stored and incoming data. So to actually evaluate it you would need to test it yourself.

Sorry, I can't give you a better answer.

zhangjianweibj commented 5 years ago

@witekest @matrixik OK, thank you very much. At present we use influxdb-relay as the HA component, with three InfluxDB pods as the backend, but we find that it is not a reliable solution. If we kill a pod, monasca-persister may reach the bad one (at present influxdb-relay cannot take a bad backend out of rotation; it forwards requests to the backends at random), and influxdb-relay returns a 204 code. Then monasca-persister crashes. On the other hand, if an InfluxDB pod is down for a certain time and then starts again, it has lost many metrics. When users get metrics from the Monasca API, they may see different metrics at different times.

Now we plan to use Cassandra as storage, and the Dockerfile is finished. However, we do not know the CQL needed to create the tables that monasca-persister uses. Can you help us and give the detailed CQL for creating the tables? Thank you very much.

zhangjianweibj commented 5 years ago

At present we have written the CQL based on monasca-persister's conf.py, but it has some mistakes.

create schema monasca
  with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

create table monasca.dimensions_metrics (
  region text,
  tenant_id text,
  dimension_name text,
  dimension_value text,
  metric_name text,
  updated_at timestamp,
  primary key ((region, tenant_id, dimension_name, dimension_value), metric_name)
);

create table monasca.alarm_state_history (
  tenant_id text,
  alarm_id text,
  metric text,
  old_state text,
  new_state text,
  sub_alarms text,
  reason text,
  reason_data text,
  time_stamp timestamp,
  primary key ((tenant_id, alarm_id, old_state, new_state, sub_alarms, reason, reason_data), metric)
);

create table monasca.measurements (
  tenant_id text,
  region text,
  bucket_start timestamp,
  metric_name text,
  dimensions text,
  time_stamp timestamp,
  value float,
  value_meta text,
  primary key ((tenant_id, region, bucket_start, metric_name, dimensions), time_stamp)
);

witekest commented 5 years ago

You can find the schema in the DevStack plugin.

witekest commented 5 years ago

Yes, an InfluxDB instance that was offline or crashed for a longer time should be removed from the query pool until it has recovered. If you use Kafka consumer groups to replicate measurements, you won't lose any measurements! They will be kept in Kafka until the messages can be consumed again.
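To illustrate why a consumer group catches up after an outage, here is a minimal in-memory sketch. It is not Kafka client code: the `Topic` class, the group names, and the sample messages are hypothetical and only mimic the idea of an append-only log with one committed offset per consumer group. In a real deployment the messages are kept only as long as the topic's retention settings allow.

```python
class Topic:
    """Toy append-only log with one committed offset per consumer group."""

    def __init__(self):
        self.log = []
        self.offsets = {}

    def produce(self, message):
        self.log.append(message)

    def poll(self, group_id):
        """Return everything the group has not consumed yet and commit."""
        start = self.offsets.get(group_id, 0)
        batch = self.log[start:]
        self.offsets[group_id] = len(self.log)
        return batch


topic = Topic()
# Each list stands for the InfluxDB instance written by one persister group.
stores = {"group_a": [], "group_b": []}

# Normal operation: both groups consume every measurement.
for message in ("cpu=1", "cpu=2"):
    topic.produce(message)
for group in stores:
    stores[group].extend(topic.poll(group))

# Outage: the InfluxDB behind group_b is down, so only group_a consumes.
for message in ("cpu=3", "cpu=4"):
    topic.produce(message)
stores["group_a"].extend(topic.poll("group_a"))
missing_during_outage = len(stores["group_a"]) - len(stores["group_b"])

# Recovery: group_b resumes from its own offset and catches up.
stores["group_b"].extend(topic.poll("group_b"))

if __name__ == "__main__":
    print("missing during outage:", missing_during_outage)
    print("stores equal after recovery:", stores["group_a"] == stores["group_b"])
```

During the outage the two stores differ, which is why the lagging instance should stay out of the query pool until its group has caught up.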

monasca / monasca-docker

monasca-persister database #505