apache / apisix

The Cloud-Native API Gateway
https://apisix.apache.org/blog/
Apache License 2.0

bug: Missing metrics in Prometheus #10921

Closed andytael closed 7 months ago

andytael commented 8 months ago

Current Behavior

I don't see metrics such as apisix_bandwidth or apisix_http_status in Prometheus or in Grafana. There are probably more missing.

Expected Behavior

I expect these metrics to be exposed at the metrics endpoint and to show up in Prometheus and Grafana.

Error Logs

I don't see any errors

Steps to Reproduce

Configure apisix with the following values:

  prometheus:
    # ref: https://apisix.apache.org/docs/apisix/plugins/prometheus/
    enabled: true
    # -- path of the metrics endpoint
    path: /apisix/prometheus/metrics
    # -- prefix of the metrics
    metricPrefix: apisix_
    # -- container port where the metrics are exposed
    containerPort: 9091

Configure Prometheus with the following:

      # Apache APISIX
      - job_name: "apisix"
        metrics_path: "/apisix/prometheus/metrics"
        static_configs:
          - targets: [ "apisix-prometheus-metrics.apisix.svc.cluster.local:9091" ]
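
Before looking at the metrics themselves, it can help to rule out a mismatch between the two configurations above. The following is a minimal Python sketch of that check, not part of APISIX or Prometheus. The dictionaries simply restate the values shown above; the helper names, the `http` scheme, and the assumption that the Service exposes the same port as `containerPort` are mine.

```python
from urllib.parse import urlunsplit

# Values copied from the APISIX Helm values and the Prometheus job above.
apisix_values = {"path": "/apisix/prometheus/metrics", "containerPort": 9091}
scrape_job = {
    "metrics_path": "/apisix/prometheus/metrics",
    "targets": ["apisix-prometheus-metrics.apisix.svc.cluster.local:9091"],
}


def scrape_urls(job):
    """Build the URLs a static job would request (http scheme is an assumption)."""
    return [
        urlunsplit(("http", target, job["metrics_path"], "", ""))
        for target in job["targets"]
    ]


def config_mismatches(values, job):
    """Return human-readable mismatches between the two configurations."""
    problems = []
    if values["path"] != job["metrics_path"]:
        problems.append(
            f"path {values['path']!r} != metrics_path {job['metrics_path']!r}"
        )
    for target in job["targets"]:
        # Assumes the Service port equals containerPort, which may not hold.
        port = target.rsplit(":", 1)[-1]
        if port != str(values["containerPort"]):
            problems.append(
                f"target {target!r} does not use port {values['containerPort']}"
            )
    return problems


print(config_mismatches(apisix_values, scrape_job))
print(scrape_urls(scrape_job))
```

With the values from this report the mismatch list is empty, which suggests the scrape target itself is not the problem.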

When doing curl to the endpoint I only see the following values:

# HELP apisix_etcd_modify_indexes Etcd modify index for APISIX keys
# TYPE apisix_etcd_modify_indexes gauge
apisix_etcd_modify_indexes{key="consumers"} 0
apisix_etcd_modify_indexes{key="global_rules"} 16
apisix_etcd_modify_indexes{key="max_modify_index"} 22
apisix_etcd_modify_indexes{key="prev_index"} 16
apisix_etcd_modify_indexes{key="protos"} 0
apisix_etcd_modify_indexes{key="routes"} 22
apisix_etcd_modify_indexes{key="services"} 0
apisix_etcd_modify_indexes{key="ssls"} 0
apisix_etcd_modify_indexes{key="stream_routes"} 0
apisix_etcd_modify_indexes{key="upstreams"} 0
apisix_etcd_modify_indexes{key="x_etcd_index"} 22
# HELP apisix_etcd_reachable Config server etcd reachable from APISIX, 0 is unreachable
# TYPE apisix_etcd_reachable gauge
apisix_etcd_reachable 1
# HELP apisix_http_requests_total The total number of client requests since APISIX started
# TYPE apisix_http_requests_total gauge
apisix_http_requests_total 1310
# HELP apisix_nginx_http_current_connections Number of HTTP connections
# TYPE apisix_nginx_http_current_connections gauge
apisix_nginx_http_current_connections{state="accepted"} 235
apisix_nginx_http_current_connections{state="active"} 7
apisix_nginx_http_current_connections{state="handled"} 235
apisix_nginx_http_current_connections{state="reading"} 0
apisix_nginx_http_current_connections{state="waiting"} 3
apisix_nginx_http_current_connections{state="writing"} 4
# HELP apisix_nginx_metric_errors_total Number of nginx-lua-prometheus errors
# TYPE apisix_nginx_metric_errors_total counter
apisix_nginx_metric_errors_total 0
# HELP apisix_node_info Info of APISIX node
# TYPE apisix_node_info gauge
apisix_node_info{hostname="apisix-84786c95b7-sptpv"} 1
# HELP apisix_shared_dict_capacity_bytes The capacity of each nginx shared DICT since APISIX start
# TYPE apisix_shared_dict_capacity_bytes gauge
apisix_shared_dict_capacity_bytes{name="access-tokens"} 1048576
apisix_shared_dict_capacity_bytes{name="balancer-ewma"} 10485760
apisix_shared_dict_capacity_bytes{name="balancer-ewma-last-touched-at"} 10485760
apisix_shared_dict_capacity_bytes{name="balancer-ewma-locks"} 10485760
apisix_shared_dict_capacity_bytes{name="discovery"} 1048576
apisix_shared_dict_capacity_bytes{name="etcd-cluster-health-check"} 10485760
apisix_shared_dict_capacity_bytes{name="ext-plugin"} 1048576
apisix_shared_dict_capacity_bytes{name="internal-status"} 10485760
apisix_shared_dict_capacity_bytes{name="introspection"} 10485760
apisix_shared_dict_capacity_bytes{name="jwks"} 1048576
apisix_shared_dict_capacity_bytes{name="kubernetes"} 1048576
apisix_shared_dict_capacity_bytes{name="lrucache-lock"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-api-breaker"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-limit-conn"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-limit-count"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-limit-count-redis-cluster-slot-lock"} 1048576
apisix_shared_dict_capacity_bytes{name="plugin-limit-count-reset-header"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-limit-req"} 10485760
apisix_shared_dict_capacity_bytes{name="prometheus-metrics"} 10485760
apisix_shared_dict_capacity_bytes{name="upstream-healthcheck"} 10485760
apisix_shared_dict_capacity_bytes{name="worker-events"} 10485760
# HELP apisix_shared_dict_free_space_bytes The free space of each nginx shared DICT since APISIX start
# TYPE apisix_shared_dict_free_space_bytes gauge
apisix_shared_dict_free_space_bytes{name="access-tokens"} 1032192
apisix_shared_dict_free_space_bytes{name="balancer-ewma"} 10412032
apisix_shared_dict_free_space_bytes{name="balancer-ewma-last-touched-at"} 10412032
apisix_shared_dict_free_space_bytes{name="balancer-ewma-locks"} 10412032
apisix_shared_dict_free_space_bytes{name="discovery"} 1032192
apisix_shared_dict_free_space_bytes{name="etcd-cluster-health-check"} 10412032
apisix_shared_dict_free_space_bytes{name="ext-plugin"} 1032192
apisix_shared_dict_free_space_bytes{name="internal-status"} 10407936
apisix_shared_dict_free_space_bytes{name="introspection"} 10412032
apisix_shared_dict_free_space_bytes{name="jwks"} 1032192
apisix_shared_dict_free_space_bytes{name="kubernetes"} 999424
apisix_shared_dict_free_space_bytes{name="lrucache-lock"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-api-breaker"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-limit-conn"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-limit-count"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-limit-count-redis-cluster-slot-lock"} 1036288
apisix_shared_dict_free_space_bytes{name="plugin-limit-count-reset-header"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-limit-req"} 10412032
apisix_shared_dict_free_space_bytes{name="prometheus-metrics"} 10387456
apisix_shared_dict_free_space_bytes{name="upstream-healthcheck"} 10412032
apisix_shared_dict_free_space_bytes{name="worker-events"} 10412032
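
To make "which metrics are missing" checkable rather than eyeballed, the listing above can be scanned for its `# TYPE` lines. This is a rough Python sketch under stated assumptions: the function names and the `EXPECTED` list are illustrative (the list only contains the two metrics named in this report plus one that is present), and `SAMPLE` is a shortened excerpt of the output above.

```python
def metric_families(exposition_text):
    """Collect metric family names from '# TYPE <name> <type>' lines."""
    names = set()
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "#" and parts[1] == "TYPE":
            names.add(parts[2])
    return names


def missing_families(exposition_text, expected):
    """Return the expected family names that do not appear in the scrape."""
    present = metric_families(exposition_text)
    return sorted(name for name in expected if name not in present)


# Shortened excerpt of the curl output above.
SAMPLE = """\
# HELP apisix_etcd_reachable Config server etcd reachable from APISIX, 0 is unreachable
# TYPE apisix_etcd_reachable gauge
apisix_etcd_reachable 1
# HELP apisix_http_requests_total The total number of client requests since APISIX started
# TYPE apisix_http_requests_total gauge
apisix_http_requests_total 1310
"""

# Illustrative list: the two metrics reported missing, plus one present metric.
EXPECTED = ["apisix_bandwidth", "apisix_http_status", "apisix_http_requests_total"]

print(missing_families(SAMPLE, EXPECTED))  # ['apisix_bandwidth', 'apisix_http_status']
```

To run it against a live instance, the text returned by the metrics endpoint can be passed in place of `SAMPLE`, for example after fetching it with `urllib.request.urlopen`.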

Environment

shreemaan-abhishek commented 8 months ago

I recommend that you test this again carefully, because we have test cases covering this feature.

https://github.com/apache/apisix/blob/master/t/plugin/prometheus.t#L132-L136

https://github.com/apache/apisix/blob/master/t/plugin/prometheus.t#L384-L389

Did you send requests through APISIX before checking the metrics?

andytael commented 8 months ago

I get the metrics by doing this (environment runs in a k8s cluster):

1. kubectl port-forward -n apisix service/apisix-prometheus-metrics 9091
2. curl http://localhost:9091/apisix/prometheus/metrics

The result is identical to the output in my original report above. I'd expect to see the missing metrics there. Why are they missing?

kayx23 commented 8 months ago

Note that some metrics, such as apisix_batch_process_entries, are not readily visible if there are no data.

From https://docs.api7.ai/hub/prometheus#metrics

In my experience, the two metrics you mentioned should show up after some traffic has passed through the APISIX instance.

andytael commented 8 months ago

I did a little test run on a route and I still don't get any data.

ab -n 1000 -c 100 http://<IP_ADDRESS>/api/v2/customer/aerg45sffd

# TYPE apisix_etcd_modify_indexes gauge
apisix_etcd_modify_indexes{key="consumers"} 0
apisix_etcd_modify_indexes{key="global_rules"} 0
apisix_etcd_modify_indexes{key="max_modify_index"} 21
apisix_etcd_modify_indexes{key="prev_index"} 14
apisix_etcd_modify_indexes{key="protos"} 0
apisix_etcd_modify_indexes{key="routes"} 21
apisix_etcd_modify_indexes{key="services"} 0
apisix_etcd_modify_indexes{key="ssls"} 0
apisix_etcd_modify_indexes{key="stream_routes"} 0
apisix_etcd_modify_indexes{key="upstreams"} 0
apisix_etcd_modify_indexes{key="x_etcd_index"} 21
# HELP apisix_etcd_reachable Config server etcd reachable from APISIX, 0 is unreachable
# TYPE apisix_etcd_reachable gauge
apisix_etcd_reachable 1
# HELP apisix_http_requests_total The total number of client requests since APISIX started
# TYPE apisix_http_requests_total gauge
apisix_http_requests_total 3009
# HELP apisix_nginx_http_current_connections Number of HTTP connections
# TYPE apisix_nginx_http_current_connections gauge
apisix_nginx_http_current_connections{state="accepted"} 287
apisix_nginx_http_current_connections{state="active"} 9
apisix_nginx_http_current_connections{state="handled"} 287
apisix_nginx_http_current_connections{state="reading"} 0
apisix_nginx_http_current_connections{state="waiting"} 3
apisix_nginx_http_current_connections{state="writing"} 6
# HELP apisix_nginx_metric_errors_total Number of nginx-lua-prometheus errors
# TYPE apisix_nginx_metric_errors_total counter
apisix_nginx_metric_errors_total 0
# HELP apisix_node_info Info of APISIX node
# TYPE apisix_node_info gauge
apisix_node_info{hostname="apisix-84786c95b7-hcvg5"} 1
# HELP apisix_shared_dict_capacity_bytes The capacity of each nginx shared DICT since APISIX start
# TYPE apisix_shared_dict_capacity_bytes gauge
apisix_shared_dict_capacity_bytes{name="access-tokens"} 1048576
apisix_shared_dict_capacity_bytes{name="balancer-ewma"} 10485760
apisix_shared_dict_capacity_bytes{name="balancer-ewma-last-touched-at"} 10485760
apisix_shared_dict_capacity_bytes{name="balancer-ewma-locks"} 10485760
apisix_shared_dict_capacity_bytes{name="discovery"} 1048576
apisix_shared_dict_capacity_bytes{name="etcd-cluster-health-check"} 10485760
apisix_shared_dict_capacity_bytes{name="ext-plugin"} 1048576
apisix_shared_dict_capacity_bytes{name="internal-status"} 10485760
apisix_shared_dict_capacity_bytes{name="introspection"} 10485760
apisix_shared_dict_capacity_bytes{name="jwks"} 1048576
apisix_shared_dict_capacity_bytes{name="kubernetes"} 1048576
apisix_shared_dict_capacity_bytes{name="lrucache-lock"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-api-breaker"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-limit-conn"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-limit-count"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-limit-count-redis-cluster-slot-lock"} 1048576
apisix_shared_dict_capacity_bytes{name="plugin-limit-count-reset-header"} 10485760
apisix_shared_dict_capacity_bytes{name="plugin-limit-req"} 10485760
apisix_shared_dict_capacity_bytes{name="prometheus-metrics"} 10485760
apisix_shared_dict_capacity_bytes{name="upstream-healthcheck"} 10485760
apisix_shared_dict_capacity_bytes{name="worker-events"} 10485760
# HELP apisix_shared_dict_free_space_bytes The free space of each nginx shared DICT since APISIX start
# TYPE apisix_shared_dict_free_space_bytes gauge
apisix_shared_dict_free_space_bytes{name="access-tokens"} 1032192
apisix_shared_dict_free_space_bytes{name="balancer-ewma"} 10412032
apisix_shared_dict_free_space_bytes{name="balancer-ewma-last-touched-at"} 10412032
apisix_shared_dict_free_space_bytes{name="balancer-ewma-locks"} 10412032
apisix_shared_dict_free_space_bytes{name="discovery"} 1032192
apisix_shared_dict_free_space_bytes{name="etcd-cluster-health-check"} 10412032
apisix_shared_dict_free_space_bytes{name="ext-plugin"} 1032192
apisix_shared_dict_free_space_bytes{name="internal-status"} 10407936
apisix_shared_dict_free_space_bytes{name="introspection"} 10412032
apisix_shared_dict_free_space_bytes{name="jwks"} 1032192
apisix_shared_dict_free_space_bytes{name="kubernetes"} 999424
apisix_shared_dict_free_space_bytes{name="lrucache-lock"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-api-breaker"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-limit-conn"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-limit-count"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-limit-count-redis-cluster-slot-lock"} 1036288
apisix_shared_dict_free_space_bytes{name="plugin-limit-count-reset-header"} 10412032
apisix_shared_dict_free_space_bytes{name="plugin-limit-req"} 10412032
apisix_shared_dict_free_space_bytes{name="prometheus-metrics"} 10387456
apisix_shared_dict_free_space_bytes{name="upstream-healthcheck"} 10412032
apisix_shared_dict_free_space_bytes{name="worker-events"} 10412032
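
One way to confirm that test traffic actually reached the instance being scraped is to diff two scrapes. The sketch below is only an illustration: the function names are made up, the parser assumes sample lines of the form `name{labels} value` with no trailing timestamp (as in the listings here), and the two sample strings reuse a few values from this thread. Note that the listings in this thread report different `apisix_node_info` hostnames (`apisix-84786c95b7-sptpv` and `apisix-84786c95b7-hcvg5`), so a real comparison should use two scrapes of the same pod.

```python
def parse_samples(exposition_text):
    """Map each series ('name{labels}') to its float value, skipping comments."""
    samples = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field (no timestamps assumed).
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def deltas(before_text, after_text):
    """Return {series: after - before} for series in both scrapes that changed."""
    before = parse_samples(before_text)
    after = parse_samples(after_text)
    return {
        series: after[series] - before[series]
        for series in before.keys() & after.keys()
        if after[series] != before[series]
    }


BEFORE = """\
# TYPE apisix_http_requests_total gauge
apisix_http_requests_total 1310
apisix_nginx_http_current_connections{state="accepted"} 235
apisix_etcd_reachable 1
"""

AFTER = """\
# TYPE apisix_http_requests_total gauge
apisix_http_requests_total 3009
apisix_nginx_http_current_connections{state="accepted"} 287
apisix_etcd_reachable 1
"""

for series, change in sorted(deltas(BEFORE, AFTER).items()):
    print(series, change)
```

If the request counter rises but no new metric families appear, the traffic reached APISIX and the gap lies in metric collection rather than in the scrape path.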

jzhao20230918 commented 8 months ago

Same issue with v3.8.0.

kayx23 commented 8 months ago

@shreemaan-abhishek ok I can actually reproduce (latest, 3.8.0)... My steps are:

Some more info: I remember running into this issue when I worked on the prometheus plugin doc for the api7 site, and I had a conversation with @AlinsRan on this topic, though for the life of me I cannot find that conversation anymore. After a while, and with more traffic generated, the metrics showed up for me. I cannot explain the exact reason for this, or under what circumstances these metrics show up.

fbartels commented 7 months ago

If I am not mistaken, these values are not part of the default configuration (they are commented out in https://github.com/apache/apisix/blob/626dbae56613494c792ce56ac2801443da8190b5/conf/config-default.yaml#L591) and are therefore not collected or exposed.

https://apisix.apache.org/docs/apisix/plugins/prometheus/#specifying-metrics

QuanTran91 commented 2 months ago

I enabled it, but it still does not work: the latency and bandwidth metrics are not shown.

[image]

fbartels commented 2 months ago

Try fixing the formatting of your YAML document (the - upstream_* lines need to be indented with two spaces).