hawkular / hawkular-grafana-datasource

Hawkular Datasource for Grafana
Apache License 2.0

Unable to run hawkular query from grafana #101

Open sshingarapu opened 6 years ago

sshingarapu commented 6 years ago

Hi,

We are trying to fetch metrics in Grafana from OpenShift Origin using hawkular-metrics. It worked well for a few days with 3-5 projects. However, the moment we open a dashboard that fetches metrics from many projects (around 100), we get multiple errors, mostly Cassandra timeouts, as below:

  1. "errorMsg": "Failed to perform operation due to an error: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)"
  2. errorMsg:"Failed to perform operation due to an error: All host(s) tried for query failed (tried: hawkular-cassandra/x.x.x.x:9042 (com.datastax.driver.core.exceptions.OperationTimedOutException: [hawkular-cassandra/x.x.x.x:9042] Timed out waiting for server response

Currently hawkular-metrics is unstable. We are running 2 replicas of hawkular-metrics and a single replica each of hawkular-cassandra and heapster. We also observed that the default JVM heap size for hawkular-metrics is 512 MB. However, this pod always uses more memory than that allocation.

Please let us know if you need any other information.

mwringe commented 6 years ago

What version are you running? Can you please post the output of 'oc get pods -o yaml -n openshift-infra'?

sshingarapu commented 6 years ago

We redeployed hawkular-metrics and the issue was resolved. Thanks for your help.

rhalagali commented 6 years ago

I think this issue can also occur when the cluster is not configured properly. Sometimes the master node won't show up.

I got the message below and found out that the problem was with the cluster. (image: cassandra error)

sshingarapu commented 6 years ago

Hi,

I reopened this issue for the question below.

We also observed that the default JVM heap size for hawkular-metrics is 512 MB. However, this pod always uses more memory than that allocation. Also, when we fetch metrics from Grafana for a 10- or 30-day range, they take some time to load. How can we increase the JVM heap size for hawkular-metrics?