Farfetch / maestro

A load testing tool for running and analyzing tests with JMeter.
https://Farfetch.github.io/maestro
MIT License

Prometheus integration #456

Open vitaliimelnychuk opened 2 years ago

vitaliimelnychuk commented 2 years ago

Description

We have started to see performance issues with metrics rendering and, more importantly, with running analysis on top of the metrics that are already available.

Solution

To improve performance for real-time metrics, we should consider a time-series database. Using Prometheus as the main data source for metrics would also make Grafana integration available by default.

Proof of concept

vitaliimelnychuk commented 2 years ago

@Farfetch/team-maestro

Since Maestro can have many dynamically provisioned agents, it makes sense to let them push data to Prometheus automatically instead of having Prometheus scrape them.

Another consideration is security. By default, agents don't expose any ports externally. They work purely on a pull basis and receive updates by making regular requests to the API.

With that in mind, I think the best way to use Prometheus as a time-series database is to use the Prometheus Pushgateway, so that Maestro agents can send metrics directly to it. This keeps metric aggregation and sending as the agent's responsibility, based on runner type.
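
As a rough sketch of what the agent-side push could look like, here is a standard-library-only example that sends gauge values to a Pushgateway over its HTTP API (`PUT /metrics/job/<job>/<label>/<value>` with a body in the Prometheus text format). The gateway address, job name, metric name, and all function names are hypothetical placeholders, not part of Maestro; a real agent might prefer the official `prometheus_client` library instead.

```python
import urllib.request
from typing import Mapping
from urllib.parse import quote


def build_push_url(gateway: str, job: str, grouping: Mapping[str, str]) -> str:
    """Build the Pushgateway URL for one metric group (job + grouping labels).

    Assumes label values are non-empty and contain no '/'; the Pushgateway
    needs a separate base64 form for those cases, which is not handled here.
    """
    parts = [gateway.rstrip("/"), "metrics", "job", quote(job, safe="")]
    for label, value in grouping.items():
        parts += [quote(label, safe=""), quote(value, safe="")]
    return "/".join(parts)


def render_metrics(samples: Mapping[str, float]) -> bytes:
    """Render gauge samples in the Prometheus text exposition format."""
    lines = []
    for name, value in samples.items():
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The body has to end with a newline.
    return ("\n".join(lines) + "\n").encode("utf-8")


def push_metrics(
    gateway: str,
    job: str,
    grouping: Mapping[str, str],
    samples: Mapping[str, float],
    timeout: float = 5.0,
) -> int:
    """Send the samples with PUT, replacing all metrics in this group."""
    request = urllib.request.Request(
        build_push_url(gateway, job, grouping),
        data=render_metrics(samples),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

An agent would call something like `push_metrics("http://pushgateway:9091", "maestro_agent", {"instance": "agent-1"}, {"maestro_agent_active_threads": 12})` on a timer. Since the agent only makes outbound requests, it still doesn't need to expose any port.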

The main downside is that the Prometheus Pushgateway becomes a single point of failure and the main performance bottleneck. In the future, we could probably use more than one gateway to scale things up, but I don't see this as a problem for the first versions.
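
If we do get to multiple gateways later, one simple option (purely an assumption on my side, nothing is decided) would be to let each agent pick its gateway deterministically from its own ID, so that all metrics from one agent always land on the same gateway. The function name, agent IDs, and gateway URLs below are hypothetical.

```python
import hashlib
from typing import Sequence


def pick_gateway(agent_id: str, gateways: Sequence[str]) -> str:
    """Map an agent ID to one gateway using a stable hash.

    hashlib is used instead of the built-in hash(), because hash() of a
    string changes between Python processes.
    """
    if not gateways:
        raise ValueError("at least one gateway is required")
    digest = hashlib.sha256(agent_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(gateways)
    return gateways[index]
```

Prometheus would then need to scrape every gateway in the list. Note that with plain modulo hashing most agents move to a different gateway when the list changes, which should be acceptable as long as gateways are added rarely.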

Here is also a quick diagram of how things will look:

  graph TB;
      M[Maestro API] -->|GET| P[Prometheus];
      G[Grafana] -->|GET| P;
      P -->|GET| PG[Prometheus PushGateway];

      MA1[Maestro Agent] -->|PUSH| PG;
      MA2[Maestro Agent] -->|PUSH| PG;
      MA3[Maestro Agent] -->|PUSH| PG;