outbrain / Cassibility

This is a set of Grafana dashboards for monitoring Cassandra together with a prometheus datasource.

No data points #4

Open ivnilv opened 7 years ago

ivnilv commented 7 years ago

Hello,

Thanks a lot for those dashboards.

However, I am having trouble visualizing the results. I imported the dashboards into Grafana and ran a search-and-replace to change the data source to the name of my Prometheus server, but every panel ends up empty and shows "No data points".

Is there something else I need to adjust in the JSON files for the graphs to work properly?

Thanks,

hagay3 commented 7 years ago

Make sure Cassandra actually exposes the metrics through the JMX exporter: http://<hostname>:5560/metrics

The dashboards also include Node exporter metrics: http://<hostname>:9100/metrics
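As a minimal sketch of that check (not part of the repository), the two endpoints can be fetched from a script and reduced to the metric names they expose. The helper names `metric_names` and `fetch_metric_names` are hypothetical, and the parsing only covers the simple `name{labels} value` lines of the Prometheus text format.

```python
import urllib.request


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The name ends at the first "{" (labels) or at whitespace (value).
        head = line.split("{", 1)[0].split()
        if head:
            names.add(head[0])
    return names


def fetch_metric_names(url, timeout=5):
    """Download one /metrics page and return the metric names it exposes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return metric_names(response.read().decode("utf-8"))
```

Calling `fetch_metric_names("http://<hostname>:5560/metrics")` and the same for port 9100, with `<hostname>` filled in, should return non-empty sets if both exporters are reachable.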

hagay3 commented 7 years ago

In addition, take a look at the installation guide: https://github.com/outbrain/Cassibility/wiki/Installation-Guide

ivnilv commented 7 years ago

JMX exporter is working properly and exporting metrics at http://${hostname}:7070/metrics

Would that be a problem?

Also, I do not quite understand the need for a Prometheus proxy service in the middle.

hagay3 commented 7 years ago

The proxy is not needed. Paste here some of the metrics shown on this page: http://${hostname}:7070/metrics

ivnilv commented 7 years ago

I see lines like:


# HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds.
# TYPE jvm_gc_collection_seconds summary
jvm_gc_collection_seconds_sum{gc="ParNew",} 0.261
jvm_gc_collection_seconds_count{gc="ParNew",} 722.0
jvm_gc_collection_seconds_sum{gc="ConcurrentMarkSweep",} 22013.557
jvm_gc_collection_seconds_count{gc="ConcurrentMarkSweep",} 14726.0
# HELP process_cpu_seconds_total CPU time used by the process in seconds.
cassandra_columnfamily_memtableoffheapsize{columnfamily="peers",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="range_xfers",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="schema_aggregates",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="schema_columnfamilies",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="schema_columns",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="schema_functions",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="schema_keyspaces",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="schema_triggers",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="schema_usertypes",keyspace="system",} 0.0
cassandra_columnfamily_memtableoffheapsize{columnfamily="size_estimates",keyspace="system",} 0.0

hagay3 commented 7 years ago

You are not using the same JMX exporter config that is in the repo. Metric names matter here, because the dashboards rely on the naming convention: https://github.com/outbrain/Cassibility/blob/master/prometheus_cassandra.yml

Metrics should look like:

cassandra_ThreadPools_CompletedTasks{ThreadPools="AntiEntropyStage",path="internal",function="Value",} 0.0
cassandra_Client_connectedNativeClients{function="Value",} 113.0
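
A rough way to spot the mismatch, offered only as a heuristic sketch: the names expected above contain capital letters (for example `cassandra_ThreadPools_CompletedTasks`), while the names pasted earlier are entirely lowercase. The function name `split_by_naming` is hypothetical, and the rule "mixed case means the repo convention" is an assumption drawn from these two samples, not something the exporter guarantees.

```python
def split_by_naming(names):
    """Partition cassandra_* metric names into (mixed_case, all_lowercase)."""
    cassandra = [name for name in names if name.startswith("cassandra_")]
    # Assumption: names following the repo convention keep their capitals.
    mixed_case = sorted(name for name in cassandra if name != name.lower())
    all_lowercase = sorted(name for name in cassandra if name == name.lower())
    return mixed_case, all_lowercase
```

If the first list comes back empty for a `/metrics` page, the exporter is probably running with a different config than the one the dashboards were built against.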

ivnilv commented 7 years ago

Ok, I have started Cassandra with jmx_exporter using the config from the repo.

But I still don't see the data visualized in Grafana.

What I see now in the metrics is stuff like:

cassandra_Compaction_BytesCompacted{function="Count",} 88465.0

but then the query that Cassibility runs is:

increase(cassandra_Compaction_BytesCompacted{servicename=~"$Cluster",datacenter=~"$DC",instance=~"[[instance]]"}[5m])

so again, no data.

Do I need to remove those variables or set them somewhere else?

hagay3 commented 7 years ago

What about the template variables "Cluster", "DC", and "instance"? Do they have data in Grafana? I can also see that the square brackets in [[instance]] are wrong; you need to get rid of them and use $instance instead. We used a very specific environment, so you will have to tweak some minor things to make it work end to end.
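
The bracket replacement could be scripted along these lines. This is a sketch under assumptions: it presumes the queries are stored as strings under an "expr" key in the dashboard JSON, and `fix_expr` and `fix_dashboard` are hypothetical helper names.

```python
import re

# Matches the double-bracket variable form, e.g. [[instance]].
BRACKET_VAR = re.compile(r"\[\[(\w+)\]\]")


def fix_expr(expr):
    """Rewrite [[name]] as $name inside one query expression."""
    return BRACKET_VAR.sub(r"$\1", expr)


def fix_dashboard(node):
    """Return a copy of the dashboard with every "expr" string rewritten."""
    if isinstance(node, dict):
        return {
            key: fix_expr(value)
            if key == "expr" and isinstance(value, str)
            else fix_dashboard(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [fix_dashboard(item) for item in node]
    # Numbers, booleans, None, and strings outside "expr" are left untouched.
    return node
```

To apply it, load an exported dashboard with `json.load`, pass the result through `fix_dashboard`, and write it back with `json.dump` before re-importing it into Grafana. Only "expr" fields are touched, so titles and other text that contain brackets stay as they are.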

ivnilv commented 7 years ago

So, I am starting to see some graphs populate, but most of them are still empty.

Here's the cassandra-overview-system dashboard:

[Image: screenshot from 2017-06-01 18-07-29]

For the graphs that are empty, I can still see the metrics that Grafana is looking for on the server itself, so I am not sure why it is not graphing them properly...

hagay3 commented 7 years ago

Those are Node exporter metrics. Paste here some metrics from http://${hostname}:9100/metrics

ivnilv commented 7 years ago

Hi,

You can see the node_exporter metrics here:

https://pastebin.com/hN8MmnQk

hagay3 commented 7 years ago

We added an extra 'instance' label for those dashboards; by default it might be something else.

Let's see what metrics and labels you can see on the Prometheus server. Go to the graphing interface in Prometheus (port 9090) and take a screenshot of the metric node_network_receive_multicast{}: http://<prom_server>:9090/graph

Reference: https://prometheus.io/docs/introduction/getting_started/#using-the-graphing-interface
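
The same information can be pulled without a screenshot through the Prometheus HTTP API, whose `/api/v1/series` endpoint returns the label set of every series matching a selector. The sketch below is illustrative only: the function names are hypothetical, and `<prom_server>` still has to be replaced with a real host.

```python
import json
import urllib.parse
import urllib.request


def series_url(prom_base, selector):
    """Build the /api/v1/series URL for one series selector."""
    query = urllib.parse.urlencode({"match[]": selector})
    return f"{prom_base.rstrip('/')}/api/v1/series?{query}"


def label_values(series_response):
    """Map each label name to the sorted values seen across the series."""
    values = {}
    for series in series_response.get("data", []):
        for name, value in series.items():
            values.setdefault(name, set()).add(value)
    return {name: sorted(found) for name, found in values.items()}


def fetch_label_values(prom_base, selector, timeout=5):
    """Query a Prometheus server and summarise the labels on matching series."""
    url = series_url(prom_base, selector)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return label_values(json.load(response))
```

For example, `fetch_label_values("http://<prom_server>:9090", "node_network_receive_multicast")` shows whether labels such as `servicename` and `datacenter`, which the dashboard queries filter on, are present at all.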

ivnilv commented 7 years ago

Here's the screenshot:

[Image: capture]

Does that mean that Prometheus is scraping just node_exporter's stats and not the jmx_exporter ones?

hagay3 commented 7 years ago

You have to make some configuration tweaks for Prometheus, but there are many configuration options in Prometheus itself, so it is better that we come back here with a proper solution. Send me an e-mail: hagay3@gmail.com