mvisonneau / gitlab-ci-pipelines-exporter

Prometheus / OpenMetrics exporter for GitLab CI pipelines insights

Lack of persistence when running gitlab-ci-pipeline-exporter with Docker Compose #862

Open mguassone-enginium opened 5 months ago

mguassone-enginium commented 5 months ago

I'm running the gitlab-ci-pipeline-exporter using Docker Compose and have encountered an issue with data persistence. Upon restarting the exporter container, all previously collected data is lost and becomes unavailable in both Prometheus and Grafana.

This poses a significant challenge for my organization, as we operate a large on-prem GitLab instance. The reindexing process that occurs after each restart is time-consuming and disruptive.

I'm wondering if this is an expected behavior or if I've misunderstood the tool's intended usage and purpose. If it's indeed a limitation, I'd like to request guidance on implementing a persistence layer to prevent data loss and avoid the need for reindexing upon container or machine restarts.

Steps to reproduce

  1. Run the gitlab-ci-pipeline-exporter container using Docker Compose.
  2. Collect pipeline data for a period of time.
  3. Restart the gitlab-ci-pipeline-exporter container.
  4. Verify that previously collected data is no longer available in Prometheus or Grafana, and that you must wait for the gitlab-ci-pipeline-exporter to reindex the content.

Possible solutions

Additional questions

Is there an official recommendation for running the gitlab-ci-pipeline-exporter in a production environment with persistence requirements? Are there any known workarounds or community-supported solutions for addressing this persistence issue?

Thank you for your time and consideration of this matter. I appreciate your assistance in resolving this issue and improving the overall usability of the gitlab-ci-pipeline-exporter.

Please let me know if you have any other questions or require further clarification.

Additional Context

docker-compose.yml

---
version: '3.8'
services:
  gitlab-ci-pipelines-exporter:
    image: quay.io/mvisonneau/gitlab-ci-pipelines-exporter:v0.5.8
    ports:
      - 8080:8080
    environment:
      GCPE_GITLAB_TOKEN: ${GCPE_GITLAB_TOKEN}
      GCPE_CONFIG: /etc/gitlab-ci-pipelines-exporter.yml
      GCPE_INTERNAL_MONITORING_LISTENER_ADDRESS: tcp://127.0.0.1:8082
    volumes:
      - type: bind
        source: ./gitlab-ci-pipelines-exporter.yml
        target: /etc/gitlab-ci-pipelines-exporter.yml

  prometheus:
    image: docker.io/prom/prometheus:v2.44.0
    ports:
      - 9090:9090
    links:
      - gitlab-ci-pipelines-exporter
    user: root
    volumes:
      - ./prometheus/config.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command:
      - "--storage.tsdb.retention.size=50GB"
      - "--storage.tsdb.retention.time=2y"
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--web.console.libraries=/usr/share/prometheus/console_libraries"
      - "--web.console.templates=/usr/share/prometheus/consoles"

  grafana:
    image: docker.io/grafana/grafana:9.5.2
    ports:
      - 3000:3000
    environment:
      GF_AUTH_ANONYMOUS_ENABLED: 'true'
      GF_INSTALL_PLUGINS: grafana-polystat-panel,yesoreyeram-boomtable-panel
    links:
      - prometheus
    volumes:
      - ./grafana/dashboards.yml:/etc/grafana/provisioning/dashboards/default.yml
      - ./grafana/datasources.yml:/etc/grafana/provisioning/datasources/default.yml
      - ./grafana/dashboards:/var/lib/grafana/dashboards

networks:
  default:

volumes:
  prometheus-data:

gitlab-ci-pipelines-exporter.yml

---
log:
  level: debug

gitlab:
  url: ****
  token: ****

pull:
  projects_from_wildcards:
    on_init: true

  environments_from_projects:
    on_init: true

  refs_from_projects:
    on_init: true

  metrics:
    on_init: true
# Pull jobs related metrics on all projects
project_defaults:
  pull:
    pipeline:
      jobs:
        enabled: true
        most_recent: 0
        max_age_seconds: 0
    environments:
      enabled: true
      regexp: ".*"
      exclude_stopped: true
    merge_requests:
      enabled: true
      most_recent: 0
      max_age_seconds: 0
#

wildcards:
  - {}

prometheus/config.yml

global:
  scrape_interval:     15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'gitlab-ci-pipelines-exporter'
    scrape_interval: 10s
    scrape_timeout: 5s
    static_configs:
      - targets: ['gitlab-ci-pipelines-exporter:8080']

ouhzee commented 3 months ago

Yes, we also encountered the same problem. Need confirmation from @mvisonneau

uncaught commented 3 months ago

Curious about this, too. I'm new here and haven't used this exporter yet, but have you already tried the Redis setup? It sounds like it would do the trick, because you can enable Redis persistence easily enough, and the docs say that all exporters would use the same single Redis instance. So you could also just use one exporter with Redis.
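Something along these lines might work — a rough sketch only, not tested: it assumes the exporter's configuration accepts a redis block with a url field (as the project's configuration docs suggest), and the service name, volume name and Redis flags are purely illustrative.

docker-compose.yml (excerpt)

---
services:
  redis:
    image: docker.io/redis:7
    # enable the append-only file so Redis state survives container restarts
    command: ["redis-server", "--appendonly", "yes"]
    volumes:
      - redis-data:/data

volumes:
  redis-data:

gitlab-ci-pipelines-exporter.yml (excerpt)

---
# point the exporter at the shared Redis instance instead of the default in-memory store
redis:
  url: redis://redis:6379

With something like that in place, the exporter's state should live in Redis rather than in the container's memory, so restarting the exporter shouldn't force a full re-index.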

ouhzee commented 3 months ago

> Curious about this, too. I'm new here and haven't used this exporter yet, but have you already tried the Redis setup? It sounds like it would do the trick, because you can enable Redis persistence easily enough, and the docs say that all exporters would use the same single Redis instance. So you could also just use one exporter with Redis.

Yes, I've read about the Redis setup, but I haven't tried it yet. Just to refresh my knowledge: if I'm not mistaken, the exporters are scraped by Prometheus, which stores all the data fetched from them, isn't that right? But then why is the gitlab-ci-pipelines-exporter data lost when we restart the exporter?

uncaught commented 3 months ago

The way Prometheus works is that it only sees data at the moment it scrapes it. So it will scrape this exporter, see the current set of metrics, save them with its own timestamp, and that's it.

What this exporter does is provide that list of metrics for Prometheus to scrape.

But the metrics this exporter produces are essentially counts and percentages over all the pipelines/jobs. So in order to offer those counts, it has to count everything again after a restart if no persistent storage is used.


I've come to the conclusion that this exporter and Prometheus are not for my use case. I want a historic view of my GitLab jobs, for example the time spent on a specific unit test, to see whether it increases or decreases with new commits. That is not what Prometheus is built for.

I'm going to scrape these job metrics myself into a custom database and then use Grafana on that database directly, not using Prometheus as a go-between.