blueswen / gunicorn-monitoring

Monitor Gunicorn applications (e.g. Flask) through Gunicorn's built-in instrumentation feature, which sends StatsD metrics over UDP, with Prometheus and Grafana.

Prometheus or Grafana doesn't see exported Gunicorn metrics #1

Open S0mbre opened 5 months ago

S0mbre commented 5 months ago

What I did

Following the suggested workflow, I added everything and ended up with the following docker-compose.yml:

statsd: # statsd-exporter
    image: prom/statsd-exporter:latest
    container_name: statsd
    restart: unless-stopped
    expose:
      - 9125
      - 9102
    volumes:
      - ./statsd.conf:/statsd/statsd.conf
    command:
      - --statsd.mapping-config=/statsd/statsd.conf

prometheus: # prometheus
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    command: "--config.file=/etc/prometheus/prometheus.yml"
    expose:
      - 9090
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

grafana: # grafana
    image: grafana/grafana-oss:latest
    container_name: grafana
    restart: unless-stopped
    user: "472"
    depends_on:
      - prometheus
    volumes:
      - grafana_data:/var/lib/grafana

fhback: # FastAPI app
    # some params ...
    expose:
      - 8000
    command: sh -c "gunicorn --workers 6 --timeout 60 --bind 0.0.0.0:8000 --statsd-host=statsd:9125 --statsd-prefix=fhback.app -k uvicorn.workers.UvicornWorker main:app"
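
Worth noting: the statsd service only exposes its ports, so 9125/udp and 9102 are reachable from the other containers but not from the host. To inspect the exporter's output from the host machine, a minimal tweak (assuming the default statsd_exporter ports) would be something like:

statsd: # statsd-exporter
    image: prom/statsd-exporter:latest
    # ... rest as above ...
    ports:
      - "9102:9102"     # publish the Prometheus /metrics endpoint to the host for debugging
    expose:
      - "9125/udp"      # StatsD ingestion only needs to be reachable from the app container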

prometheus.yml:

global:
  scrape_interval: 15s
  scrape_timeout: 7s
  evaluation_interval: 15s
  external_labels:
    monitor: "app"

alerting:
  alertmanagers:
  - follow_redirects: true
    enable_http2: true
    scheme: http
    timeout: 8s
    api_version: v2
    static_configs:
    - targets: []    

scrape_configs:
  - job_name: "prometheus"
    honor_timestamps: true
    metrics_path: /metrics
    scheme: http
    follow_redirects: true
    enable_http2: true
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "app"
    scrape_interval: 10s
    dns_sd_configs:
      - names: ["app"]  
        port: 8000
        type: A
        refresh_interval: 5s

  - job_name: "statsd"
    static_configs:
      - targets: ["statsd:9102"]

The only change from the original repo is the app name (the StatsD prefix): mine is fhback.app, i.e. gunicorn --workers 6 --timeout 60 --bind 0.0.0.0:8000 --statsd-host=statsd:9125 --statsd-prefix=fhback.app ...

The statsd.conf was copied from the repo without changes, and the Grafana dashboard was also added as-is from the repo example.
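
Since the Grafana dashboard queries whatever metric names statsd-exporter produces, it is worth double-checking that the mapping rules in statsd.conf still match metrics carrying the fhback.app prefix. For illustration only (this is not necessarily the repo's actual file), a statsd_exporter mapping rule typically looks like:

mappings:
  # illustrative rule: capture a one-component prefix into an "app" label
  - match: "*.gunicorn.requests"
    name: "gunicorn_requests"
    labels:
      app: "$1"

Gunicorn emits StatsD names such as gunicorn.requests, gunicorn.workers and gunicorn.request.duration, prepended with the value of --statsd-prefix. With a two-component prefix like fhback.app the metric arrives as fhback.app.gunicorn.requests, and since a single * in a glob match covers only one dot-separated component, such a metric may fall through to the exporter's default dot-to-underscore naming instead of the names the dashboard expects.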

Problem

However, with all the services running normally and no errors in logs, Grafana doesn't see any Gunicorn data arriving:

(three screenshots attached: 001, 002, 003)

S0mbre commented 5 months ago

BTW, calling /metrics from the FastAPI Swagger page, I cannot see any gunicorn* metrics either:

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 1848.0
python_gc_objects_collected_total{generation="1"} 255.0
python_gc_objects_collected_total{generation="2"} 12.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 450.0
python_gc_collections_total{generation="1"} 40.0
python_gc_collections_total{generation="2"} 3.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="11",patchlevel="4",version="3.11.4"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 4.87129088e+08
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.36855552e+08
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.70717862997e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 4.06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 16.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP http_requests_total Total number of requests by method, status and handler.
# TYPE http_requests_total counter
http_requests_total{handler="/docs",method="GET",status="2xx"} 1.0
http_requests_total{handler="/openapi.json",method="GET",status="2xx"} 1.0
# HELP http_requests_created Total number of requests by method, status and handler.
# TYPE http_requests_created gauge
http_requests_created{handler="/docs",method="GET",status="2xx"} 1.707179679390936e+09
http_requests_created{handler="/openapi.json",method="GET",status="2xx"} 1.707179679798932e+09
# HELP http_request_size_bytes Content length of incoming requests by handler. Only value of header is respected. Otherwise ignored. No percentile calculated. 
# TYPE http_request_size_bytes summary
http_request_size_bytes_count{handler="/docs"} 1.0
http_request_size_bytes_sum{handler="/docs"} 0.0
http_request_size_bytes_count{handler="/openapi.json"} 1.0
http_request_size_bytes_sum{handler="/openapi.json"} 0.0
# HELP http_request_size_bytes_created Content length of incoming requests by handler. Only value of header is respected. Otherwise ignored. No percentile calculated. 
# TYPE http_request_size_bytes_created gauge
http_request_size_bytes_created{handler="/docs"} 1.7071796793909643e+09
http_request_size_bytes_created{handler="/openapi.json"} 1.7071796797989528e+09
# HELP http_response_size_bytes Content length of outgoing responses by handler. Only value of header is respected. Otherwise ignored. No percentile calculated. 
# TYPE http_response_size_bytes summary
http_response_size_bytes_count{handler="/docs"} 1.0
http_response_size_bytes_sum{handler="/docs"} 949.0
http_response_size_bytes_count{handler="/openapi.json"} 1.0
http_response_size_bytes_sum{handler="/openapi.json"} 129710.0
# HELP http_response_size_bytes_created Content length of outgoing responses by handler. Only value of header is respected. Otherwise ignored. No percentile calculated. 
# TYPE http_response_size_bytes_created gauge
http_response_size_bytes_created{handler="/docs"} 1.707179679391007e+09
http_response_size_bytes_created{handler="/openapi.json"} 1.7071796797989886e+09
# HELP http_request_duration_highr_seconds Latency with many buckets but no API specific labels. Made for more accurate percentile calculations. 
# TYPE http_request_duration_highr_seconds histogram
http_request_duration_highr_seconds_bucket{le="0.01"} 1.0
http_request_duration_highr_seconds_bucket{le="0.025"} 1.0
http_request_duration_highr_seconds_bucket{le="0.05"} 1.0
http_request_duration_highr_seconds_bucket{le="0.075"} 1.0
http_request_duration_highr_seconds_bucket{le="0.1"} 1.0
http_request_duration_highr_seconds_bucket{le="0.25"} 2.0
http_request_duration_highr_seconds_bucket{le="0.5"} 2.0
http_request_duration_highr_seconds_bucket{le="0.75"} 2.0
http_request_duration_highr_seconds_bucket{le="1.0"} 2.0
http_request_duration_highr_seconds_bucket{le="1.5"} 2.0
http_request_duration_highr_seconds_bucket{le="2.0"} 2.0
http_request_duration_highr_seconds_bucket{le="2.5"} 2.0
http_request_duration_highr_seconds_bucket{le="3.0"} 2.0
http_request_duration_highr_seconds_bucket{le="3.5"} 2.0
http_request_duration_highr_seconds_bucket{le="4.0"} 2.0
http_request_duration_highr_seconds_bucket{le="4.5"} 2.0
http_request_duration_highr_seconds_bucket{le="5.0"} 2.0
http_request_duration_highr_seconds_bucket{le="7.5"} 2.0
http_request_duration_highr_seconds_bucket{le="10.0"} 2.0
http_request_duration_highr_seconds_bucket{le="30.0"} 2.0
http_request_duration_highr_seconds_bucket{le="60.0"} 2.0
http_request_duration_highr_seconds_bucket{le="+Inf"} 2.0
http_request_duration_highr_seconds_count 2.0
http_request_duration_highr_seconds_sum 0.14225627994164824
# HELP http_request_duration_highr_seconds_created Latency with many buckets but no API specific labels. Made for more accurate percentile calculations. 
# TYPE http_request_duration_highr_seconds_created gauge
http_request_duration_highr_seconds_created 1.7071786345713346e+09
# HELP http_request_duration_seconds Latency with only few buckets by handler. Made to be only used if aggregation by handler is important. 
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{handler="/docs",le="0.1",method="GET"} 1.0
http_request_duration_seconds_bucket{handler="/docs",le="0.5",method="GET"} 1.0
http_request_duration_seconds_bucket{handler="/docs",le="1.0",method="GET"} 1.0
http_request_duration_seconds_bucket{handler="/docs",le="+Inf",method="GET"} 1.0
http_request_duration_seconds_count{handler="/docs",method="GET"} 1.0
http_request_duration_seconds_sum{handler="/docs",method="GET"} 0.0006821411661803722
http_request_duration_seconds_bucket{handler="/openapi.json",le="0.1",method="GET"} 0.0
http_request_duration_seconds_bucket{handler="/openapi.json",le="0.5",method="GET"} 1.0
http_request_duration_seconds_bucket{handler="/openapi.json",le="1.0",method="GET"} 1.0
http_request_duration_seconds_bucket{handler="/openapi.json",le="+Inf",method="GET"} 1.0
http_request_duration_seconds_count{handler="/openapi.json",method="GET"} 1.0
http_request_duration_seconds_sum{handler="/openapi.json",method="GET"} 0.14157413877546787
# HELP http_request_duration_seconds_created Latency with only few buckets by handler. Made to be only used if aggregation by handler is important. 
# TYPE http_request_duration_seconds_created gauge
http_request_duration_seconds_created{handler="/docs",method="GET"} 1.7071796793910599e+09
http_request_duration_seconds_created{handler="/openapi.json",method="GET"} 1.7071796797990353e+09
blueswen commented 5 months ago

In this demo, the Gunicorn metrics are provided by statsd-exporter, not by the application (e.g. Flask, FastAPI, etc.) itself.

You can check the metrics on statsd-exporter at localhost:9102/metrics (this depends on port 9102 being mapped to your machine) to make sure statsd-exporter is generating the metrics correctly.

S0mbre commented 5 months ago

How can I connect the statsd-exporter running in Docker to my Grafana container? What changes must be made in prometheus.yml and docker-compose.yml to get this working and see the visualizations in Grafana?
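
For reference, Grafana in the same compose network can usually reach Prometheus by its service name, so one common approach is to provision the datasource from a file mounted into the grafana container; a minimal sketch (file name and mount path are illustrative, e.g. ./datasource.yml mounted to /etc/grafana/provisioning/datasources/datasource.yml):

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090   # the compose service name, not localhost
    isDefault: true

No direct link between statsd-exporter and Grafana is needed: Prometheus scrapes the exporter, and Grafana reads everything through Prometheus.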

blueswen commented 4 months ago

Can you check your statsd-exporter metrics at localhost:9102/metrics? There should be some Prometheus metrics there. Your prometheus.yml and docker-compose.yml look fine.