Update the endpoint that generates the Prometheus metrics.
Convert all metrics to vectors to support these labels:
operator address: This will let us later scrape and group by operator address.
version: the operator version. Useful for finding operators that have not updated. Also used on our telemetry page.
(The version is just a number for demo purposes in this screenshot, to show how it appears.)
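To illustrate what "vector" means here, below is a minimal stdlib-only sketch of a counter keyed by label values, rendered in Prometheus text exposition style. It is not the code in this PR: the `labeledCounter` type, the metric name `task_received_total`, and the label names `operator` and `version` are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labeledCounter is a hypothetical stand-in for a counter vector:
// it keeps one counter value per unique combination of label values.
type labeledCounter struct {
	name   string
	labels []string            // label names, e.g. "operator", "version"
	vals   map[string][]string // series key -> label values
	counts map[string]float64  // series key -> counter value
}

func newLabeledCounter(name string, labels ...string) *labeledCounter {
	return &labeledCounter{
		name:   name,
		labels: labels,
		vals:   map[string][]string{},
		counts: map[string]float64{},
	}
}

// inc adds one to the series identified by the given label values.
func (c *labeledCounter) inc(labelValues ...string) {
	if len(labelValues) != len(c.labels) {
		panic("label count mismatch")
	}
	key := strings.Join(labelValues, "\x00")
	c.vals[key] = labelValues
	c.counts[key]++
}

// render returns one line per series, sorted by key for stable output.
// %q quoting is close enough to Prometheus label escaping for simple values.
func (c *labeledCounter) render() string {
	keys := make([]string, 0, len(c.counts))
	for k := range c.counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		pairs := make([]string, len(c.labels))
		for i, name := range c.labels {
			pairs[i] = fmt.Sprintf("%s=%q", name, c.vals[k][i])
		}
		fmt.Fprintf(&b, "%s{%s} %g\n", c.name, strings.Join(pairs, ","), c.counts[k])
	}
	return b.String()
}

func main() {
	// Hypothetical metric and label names.
	c := newLabeledCounter("task_received_total", "operator", "version")
	c.inc("0xabc", "1")
	c.inc("0xabc", "1")
	c.inc("0xdef", "2")
	fmt.Print(c.render())
}
```

Because each operator address and version pair becomes its own series, a server scraping many operators can aggregate or filter by either label.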
Metrics generator:
ping check duration|total, worker looper count
Add "task received" metrics. These are not connected to real tasks yet, but the mechanism allows us to generate metrics and therefore update our dashboard.
Add retrying in the sync loop so that if the gRPC stream breaks, we can reconnect to fetch new tasks.
Change to connection pool:
Record the version and metrics port. We need the metrics port so that our server can later connect to each operator's Prometheus endpoint, scrape all operator metrics, and group them by operator address.
Fixed ENG-844