discourse / prometheus_exporter

A framework for collecting and aggregating prometheus metrics
MIT License

Unable to run on Alpine Linux #236

Closed geoffharcourt closed 2 years ago

geoffharcourt commented 2 years ago

This is a huge mystery to me, but I'm having trouble running the exporter as a process on Alpine Linux. The ruby_collector_working metric never goes to 1, and all metrics posted get rejected as unregistered.

Here's the smallest reproduction I could do:

docker pull ruby:3.1.2-alpine3.15
docker run -p 9394:9394 -it ruby:3.1.2-alpine3.15 /bin/sh

# inside shell
gem install prometheus_exporter

prometheus_exporter --verbose -b 0.0.0.0

If I curl localhost:9394/metrics, this is what I get:

# HELP ruby_collector_working Is the master process collector able to collect metrics
# TYPE ruby_collector_working gauge
ruby_collector_working 0

# HELP ruby_collector_rss total memory used by collector process
# TYPE ruby_collector_rss gauge
ruby_collector_rss 47104000

# HELP ruby_collector_metrics_total Total metrics processed by exporter web.
# TYPE ruby_collector_metrics_total counter
ruby_collector_metrics_total 0

# HELP ruby_collector_sessions_total Total send_metric sessions processed by exporter web.
# TYPE ruby_collector_sessions_total counter
ruby_collector_sessions_total 0

# HELP ruby_collector_bad_metrics_total Total mis-handled metrics by collector.
# TYPE ruby_collector_bad_metrics_total counter
ruby_collector_bad_metrics_total 0

Memory use slowly ticks up, but ruby_collector_working never reaches 1. The server responds to requests and to the ping healthcheck, but the collector never seems to get going; CPU and memory use stay minimal. At first I thought this was a problem with our k8s setup, but I can reproduce it locally in a single standalone container.

geoffharcourt commented 2 years ago

This was a misunderstanding on my part about when ruby_collector_working turns to 1. We had some issues with a custom collector that I'll file a separate issue about, but what I saw above is just the normal behavior before any metrics have been sent to the exporter.
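
For anyone who hits the same confusion: per the resolution above, ruby_collector_working stays at 0 until a client actually sends metrics to the exporter. A minimal sketch of sending one metric, using the client API described in the gem's README (the gauge name and value here are purely illustrative):

require 'prometheus_exporter/client'

# the default client talks to localhost:9394, matching the exporter started above
client = PrometheusExporter::Client.default

# register a gauge with the collector, then send an observation
gauge = client.register(:gauge, "example_gauge", "illustrative gauge for testing")
gauge.observe(42)

After the client flushes, curling localhost:9394/metrics again should show the new gauge, and (per the resolution above) ruby_collector_working should no longer sit at 0 once the collector has received metrics.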