opendistro-for-elasticsearch / performance-analyzer

📈 OpenDistro Performance Analyzer
https://opendistro.github.io/
Apache License 2.0

Publish cluster stats #301

Closed amathur1893 closed 3 years ago

amathur1893 commented 3 years ago

This PR starts publishing latency and failure metrics for the publish phase of a cluster state update, from the master's perspective. This will help with RCA and with live tracking of issues where publication fails again and again and the cluster becomes unstable as a result.

1. Tested using Docker

Tmp file

^master_cluster_update {"current_time":1614584101385} {"PublishClusterState_Failure":2,"PublishClusterState_Latency":0}$
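For readers who want to inspect such an entry by hand, here is a minimal parsing sketch. It assumes the layout seen in the sample above (a `^` marker, a key, two JSON objects, and a closing `$`); the function name `parse_event` is hypothetical and is not part of Performance Analyzer.

```python
import json


def parse_event(entry: str):
    """Split one tmp-file entry into its key and its JSON objects.

    Assumes the layout from the sample above: '^<key> {json} {json}$'.
    This is an illustration, not the plugin's own reader.
    """
    body = entry.strip()
    if not (body.startswith("^") and body.endswith("$")):
        raise ValueError("entry is not wrapped in ^ ... $")
    body = body[1:-1]

    # The key is everything up to the first whitespace (space or newline).
    key, rest = body.split(None, 1)

    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(rest):
        # raw_decode does not skip leading whitespace, so skip it manually.
        while pos < len(rest) and rest[pos].isspace():
            pos += 1
        if pos >= len(rest):
            break
        obj, pos = decoder.raw_decode(rest, pos)
        objects.append(obj)
    return key, objects


sample = (
    '^master_cluster_update {"current_time":1614584101385} '
    '{"PublishClusterState_Failure":2,"PublishClusterState_Latency":0}$'
)
key, objects = parse_event(sample)
timestamp, metrics = objects
```

With the sample entry, `key` holds the event name, `timestamp` holds the `current_time` object, and `metrics` holds the two publish-phase counters.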

Table created

sqlite> .tables
PublishClusterState_Latency PublishClusterState_Failure

Contents of the table

sqlite> select * from PublishClusterState_Failure;
6.0|6.0|6.0|6.0
sqlite> select * from PublishClusterState_Latency;
29.0|29.0|29.0|29.0
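The same check can be scripted instead of typed into the sqlite shell. The sketch below uses Python's standard `sqlite3` module against an in-memory database; the column names `c1` to `c4` are placeholders, since the output above shows four values per row but not the real column names, and `dump_table` is a hypothetical helper.

```python
import sqlite3


def dump_table(conn: sqlite3.Connection, table: str):
    """Return all rows of `table`, after checking that the table exists.

    Table names cannot be bound as SQL parameters, so the name is
    validated against sqlite_master before being placed in the query.
    """
    names = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in names:
        raise KeyError(f"no such table: {table}")
    return conn.execute(f'SELECT * FROM "{table}"').fetchall()


# Stand-in database; a real check would connect to the metrics DB file instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    # c1..c4 are placeholder column names (assumption).
    "CREATE TABLE PublishClusterState_Failure (c1 REAL, c2 REAL, c3 REAL, c4 REAL)"
)
conn.execute(
    "INSERT INTO PublishClusterState_Failure VALUES (6.0, 6.0, 6.0, 6.0)"
)
rows = dump_table(conn, "PublishClusterState_Failure")
```

Pointing `sqlite3.connect` at the generated database file rather than `:memory:` would let the same helper read the tables listed above.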

amathur1893 commented 3 years ago

> I see that you have made the change to use WriterMetrics.MASTER_CLUSTER_UPDATE_STATS_COLLECTOR_DISABLED, but the aggregator also needs to be updated to WRITER_METRICS_AGGREGATOR.

It was already updated at that time: https://github.com/opendistro-for-elasticsearch/performance-analyzer-rca/blob/main/src/main/java/com/amazon/opendistro/elasticsearch/performanceanalyzer/rca/framework/metrics/WriterMetrics.java#L48. I see this is still not pushed. May I know what else is pending here?

codecov[bot] commented 3 years ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (main@130a494). The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #301   +/-   ##
=======================================
  Coverage        ?   72.60%           
  Complexity      ?      346           
=======================================
  Files           ?       44           
  Lines           ?     2150           
  Branches        ?      150           
=======================================
  Hits            ?     1561           
  Misses          ?      480           
  Partials        ?      109           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 130a494...f2a8344.