jenkinsci / prometheus-plugin

Jenkins Prometheus Plugin
https://plugins.jenkins.io/prometheus/
Apache License 2.0

Last Build Duration Metrics Not Published for Concurrent Builds #697

Open yury-tyumin opened 1 month ago

yury-tyumin commented 1 month ago

Jenkins and plugins versions report


What Operating System are you using (both controller, and any agents involved in the problem)?

Jenkins: 2.462.1
OS: Linux - 5.4.0-193-generic
Java: 17.0.12 - Ubuntu (OpenJDK 64-Bit Server VM)
prometheus: 784.vea_ecaf6592eb

Reproduction steps

  1. Start a Jenkins job that runs multiple long-duration builds simultaneously (ensure that one build starts before the previous build finishes).
  2. Attempt to monitor the following metrics during and after the builds:
    • icue_jenkins_builds_last_build_duration_milliseconds
    • icue_jenkins_builds_last_stage_duration_milliseconds_summary_sum
  3. Query the Prometheus server for the metrics history related to these builds.

Expected Results

  1. The metrics should be published for each build.
  2. Prometheus should store the history of these metrics for every build, regardless of overlapping builds.

Actual Results

When multiple builds of the same job run simultaneously, these metrics are not published for every build. The resulting gaps in the Prometheus metric history make it impossible to track the duration of each build individually.

Anything else?

No response

Are you interested in contributing a fix?

No response

Waschndolos commented 1 month ago

The Prometheus endpoint exposes the current state of the Jenkins instance, so by default no historical data is shown. You can get this functionality by enabling this configuration. It adds individual metrics for each build available for a job (note the hint on the element):

image

Another way to keep track of such metrics is to use a different plugin, e.g. the InfluxDB plugin, and push the metrics of individual builds to an InfluxDB instance after each build.

yury-tyumin commented 1 month ago

Thank you for your answer! I believe there may have been a slight misunderstanding. I’m not expecting the plugin to store historical data itself—I'm already using Prometheus and Grafana for that purpose.

I’m hoping that the Last Build Duration metrics will be published while the job is still running, if that's possible. This way, I can track and monitor builds in real-time. I don’t think this would contradict the core idea of the plugin, but rather enhance its functionality by allowing better real-time monitoring.

Let me know your thoughts on this! Thank you.

Waschndolos commented 1 month ago

Ah, thanks for the clarification. I'll check once I have more time to think about it. I'll come back to this.

Waschndolos commented 3 weeks ago

Looked into this. I'm hesitant to change the last_build_duration metric, as it's explicitly programmed to skip a build object while the build is still running. It was already like that before I took over this plugin, so somebody presumably had a reason for it.
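A minimal sketch of the skipping behavior described above, written in Python for illustration only. It is not the plugin's actual code (the plugin is written in Java), and the `Build` type and `last_build_duration` function are hypothetical names:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Build:
    """Hypothetical stand-in for a Jenkins build object."""
    number: int
    duration_ms: int
    building: bool  # True while the build is still running


def last_build_duration(builds: List[Build]) -> Optional[int]:
    """Return the duration of the newest build, or None if it is still running.

    Assumption: the collector looks only at the newest build and skips it
    entirely while it is running, so no sample is produced in that cycle.
    """
    if not builds:
        return None
    newest = max(builds, key=lambda b: b.number)
    if newest.building:
        return None  # skipped: nothing is published for this job
    return newest.duration_ms
```

With build 1 finished and build 2 still running, this returns `None`, which matches the gap in the metric history reported in this issue.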

What could be done (I guess :)) is to provide a new set of metrics for current runs which are only calculated when a build is running. Would you be fine with that @yury-tyumin ?

yury-tyumin commented 2 weeks ago

Hi @Waschndolos,

Thank you for looking into this!

I appreciate your suggestion, but it seems that introducing new metrics specifically for current runs might be a bit inconvenient and not the most intuitive solution. Even in the Jenkins UI, "Last Success," "Last Failure," and "Last Duration" are always displayed, so it feels like users expect these metrics to be available regardless of the build state.

Perhaps a better approach, while maintaining compatibility with older versions, would be introducing an option that preserves the current behavior but allows users to enable real-time updates if needed.

What do you think?

Waschndolos commented 2 weeks ago

Yeah, that's probably possible with some refactoring, but these "real time metrics" basically come down to one or two:

yury-tyumin commented 2 weeks ago

I’m not referring to "real-time" metrics for current running builds. What I’m specifically asking about are the Last Duration metrics—like default_jenkins_builds_last_build_duration_milliseconds and default_jenkins_builds_last_stage_duration_milliseconds_summary—which represent the duration of the last finished builds.

I think the term "real-time metrics" might have been misleading.

Waschndolos commented 2 weeks ago

image

OK, let's draw :) So you want to see build 1 as the last_build in the 2nd Prometheus cycle, right?

yury-tyumin commented 2 weeks ago

Yes, that's correct. I want to see the values for "build 1" in the second cycle because "build 2" hasn't finished yet.
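The requested behavior could be sketched roughly as follows: instead of skipping the job when its newest build is running, fall back to the newest build that has already finished. This is an illustration in Python, not the plugin's implementation; `Build` and `last_finished_build` are hypothetical names, and "newest" is assumed to mean the highest build number:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Build:
    """Hypothetical stand-in for a Jenkins build object."""
    number: int
    duration_ms: int
    building: bool  # True while the build is still running


def last_finished_build(builds: List[Build]) -> Optional[Build]:
    """Return the highest-numbered build that is no longer running."""
    for build in sorted(builds, key=lambda b: b.number, reverse=True):
        if not build.building:
            return build
    return None  # no build has finished yet
```

In the second cycle from the drawing, with build 1 finished and build 2 running, this yields build 1, so its duration would still be reported.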

Waschndolos commented 2 weeks ago

OK, got it, I can add that functionality. But be aware that with fast builds, as in the case below, build 2 would never appear in the output. With the new functionality, build 3 would appear but build 2 would not. Just a note.
image
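The fast-build case can be illustrated with a small simulation. This is a sketch under assumptions: times are in whole seconds, each scrape reports the highest-numbered finished build, and `builds_seen_by_scrapes` is a hypothetical helper, not part of the plugin or of Prometheus:

```python
from typing import Dict, List


def builds_seen_by_scrapes(
    finish_times: Dict[int, int], scrape_interval: int, horizon: int
) -> List[int]:
    """Return the build numbers that show up as "last build" across scrapes.

    finish_times maps build number -> finish time in seconds.
    Scrapes happen at scrape_interval, 2 * scrape_interval, ... up to horizon.
    """
    seen: List[int] = []
    for t in range(scrape_interval, horizon + 1, scrape_interval):
        finished = [n for n, done_at in finish_times.items() if done_at <= t]
        if not finished:
            continue
        last = max(finished)
        # Record a build only when it first becomes the reported "last build".
        if not seen or seen[-1] != last:
            seen.append(last)
    return seen


# Builds 2 and 3 both finish between the scrapes at 30 s and 60 s.
finish_times = {1: 10, 2: 35, 3: 50}
print(builds_seen_by_scrapes(finish_times, scrape_interval=30, horizon=90))
```

With these made-up finish times, a 30-second interval reports builds 1 and 3 only, because build 3 has already replaced build 2 as the last finished build by the time of the next scrape.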

yury-tyumin commented 2 weeks ago

Yes, I understand. However, I believe this can be adjusted using Prometheus's "scrape_interval" and the "Collecting metrics period" options in the plugin.
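As a rough rule of thumb for tuning these two settings: every finished build can only be observed if at least one collection and scrape falls between consecutive build completions. The sketch below computes the smallest gap between completions, which is an upper bound for the interval under that simplified model. The function name and the sample times are hypothetical, and the model ignores any delay between the plugin's collection period and the Prometheus scrape:

```python
from typing import Iterable, Optional


def max_interval_to_see_every_build(finish_times: Iterable[int]) -> Optional[int]:
    """Return the smallest gap (in seconds) between consecutive build completions.

    A fixed interval no longer than this gap guarantees a scrape between
    any two completions. Returns None when fewer than two builds are given.
    """
    ordered = sorted(finish_times)
    if len(ordered) < 2:
        return None
    return min(later - earlier for earlier, later in zip(ordered, ordered[1:]))


# Builds finishing at 10 s, 35 s and 50 s: the tightest gap is 15 s.
print(max_interval_to_see_every_build([10, 35, 50]))
```

If two builds finish at the same moment the gap is zero, and no interval can separate them; in that case only per-build metrics would keep both visible.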