deadtrickster / prometheus_rabbitmq_exporter

Prometheus.io exporter as a RabbitMQ Management Plugin plugin
MIT License

Crashing on scraping #87

Open michael-bud opened 5 years ago

michael-bud commented 5 years ago

We're getting this error in the logs when Prometheus scrapes. It occurs on 3 of our 5 clustered RabbitMQ nodes, and once it starts it keeps happening until RabbitMQ is restarted.

2019-07-02 17:42:28.130 [error] <0.2418.0> CRASH REPORT Process <0.2418.0> with 0 neighbours crashed with reason: no function clause matching proplists:get_value(get, 0, undefined) line 215
2019-07-02 17:42:28.131 [error] <0.2417.0> Ranch listener rabbit_web_dispatch_sup_15672, connection process <0.2417.0>, stream 1 had its request process <0.2418.0> exit with reason function_clause and stacktrace [{proplists,get_value,[get,0,undefined],[{file,"proplists.erl"},{line,215}]},{prometheus_rabbitmq_core_metrics_collector,'-collect_metrics/2-lc$^0/1-0-',3,[{file,"src/collectors/prometheus_rabbitmq_core_metrics_collector.erl"},{line,177}]},{prometheus_model_helpers,create_mf,5,[{file,"/home/dead/Projects/rabbitmq/prometheus_rabbitmq_exporter/deps/prometheus/src/model/prometheus_model_helpers.erl"},{line,127}]},{prometheus_rabbitmq_core_metrics_collector,'-mf/3-lc$^2/1-1-',3,[{file,"src/collectors/prometheus_rabbitmq_core_metrics_collector.erl"},{line,169}]},{prometheus_rabbitmq_core_metrics_collector,'-collect_mf/2-lc$^0/1-0-',2,[{file,"src/collectors/prometheus_rabbitmq_core_metrics_collector.erl"},{line,153}]},{prometheus_rabbitmq_core_metrics_collector,collect_mf,2,[{file,"src/collectors/prometheus_rabbitmq_core_metrics_collector.erl"},{line,154}]},{prometheus_collector,collect_mf,3,[{file,"/home/dead/Projects/rabbitmq/prometheus_rabbitmq_exporter/deps/prometheus/src/prometheus_collector.erl"},{line,141}]},{prometheus_registry,'-collect/2-lc$^0/1-0-',3,[{file,"/home/dead/Projects/rabbitmq/prometheus_rabbitmq_exporter/deps/prometheus/src/prometheus_registry.erl"},{line,86}]}]

MitchDart commented 5 years ago

I have the same issue. I used the "Google click to deploy" RabbitMQ cluster for GKE. I made no changes to the configs, yet I get this crash every time.

arthurdarcet commented 5 years ago

Same here: seeing this in a new cluster running on Kubernetes with the default values from the "click to deploy". All three nodes of our cluster show this error, and it starts appearing immediately after a restart if there is activity on the node.

jasonjamet commented 5 years ago

Hi! I had the same issue with the exporter at version 3.7.2.5. Apparently v3.7.9.1 fixes this error: https://github.com/deadtrickster/prometheus_rabbitmq_exporter/releases/tag/v3.7.9.1

hrobertson commented 4 years ago

We started getting a 500 on scrapes from all 3 nodes with this error:

2020-03-21 06:34:16.748 [error] <0.15910.45> CRASH REPORT Process <0.15910.45> with 0 neighbours crashed with reason: no case clause matching stopped in prometheus_rabbitmq_queues_collector:'-collect_mf/2-fun-3-'/1 line 75
2020-03-21 06:34:16.748 [error] <0.15909.45> Ranch listener rabbit_web_dispatch_sup_15672, connection process <0.15909.45>, stream 1 had its request process <0.15910.45> exit with reason {case_clause,stopped} and stacktrace [
    {prometheus_rabbitmq_queues_collector,'-collect_mf/2-fun-3-',1,    [{file,"src/collectors/prometheus_rabbitmq_queues_collector.erl"},{line,75}]},
    {prometheus_rabbitmq_queues_collector,'-collect_metrics/2-lc$^1/1-0-',3,    [{file,"src/collectors/prometheus_rabbitmq_queues_collector.erl"},{line,102}]},
    {prometheus_rabbitmq_queues_collector,'-collect_metrics/2-lc$^1/1-0-',3,    [{file,"src/collectors/prometheus_rabbitmq_queues_collector.erl"},{line,102}]},
    {prometheus_model_helpers,create_mf,5,      [{file,"src/model/prometheus_model_helpers.erl"},{line,127}]},
    {prometheus_rabbitmq_queues_collector,mf,3,    [{file,"src/collectors/prometheus_rabbitmq_queues_collector.erl"},{line,94}]},
    {prometheus_rabbitmq_queues_collector,'-collect_mf/2-lc$^4/1-2-',3,    [{file,"src/collectors/prometheus_rabbitmq_queues_collector.erl"},{line,75}]},
    {prometheus_rabbitmq_queues_collector,collect_mf,2,    [{file,"src/collectors/prometheus_rabbitmq_queues_collector.erl"},{line,75}]},
    {prometheus_collector,collect_mf,3,    [{file,"src/prometheus_collector.erl"},{line,171}]}
    ]

Disabled/re-enabled the plugin on all nodes - No effect
Restarted RabbitMQ on all nodes - No effect
Restarted the VM of one node - Now metrics working on all nodes :man_shrugging:
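
For anyone trying to tell which nodes are affected after each of these steps, a small probe can help. The following is only a sketch under assumptions: port 15672 is taken from the listener name in the logs above, the `/api/metrics` path is assumed to be the exporter's default, and the host names in the usage note are hypothetical. It requests the metrics endpoint on each node and reports every node that does not answer with HTTP 200.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still what we want.
        return err.code


def failing_nodes(hosts, fetch=http_status, port=15672, path="/api/metrics"):
    """Map each host whose scrape did not return 200 to a short reason.

    port and path are assumptions; adjust them to your deployment.
    """
    failures = {}
    for host in hosts:
        url = f"http://{host}:{port}{path}"
        try:
            status = fetch(url)
        except OSError as err:
            # Covers connection refused, DNS failures, and timeouts.
            failures[host] = f"unreachable: {err}"
            continue
        if status != 200:
            failures[host] = f"HTTP {status}"
    return failures
```

Calling `failing_nodes(["rabbit-1", "rabbit-2", "rabbit-3"])` (hypothetical host names) before and after a restart would show whether the 500 responses cleared on every node or only on the one that was restarted. The `fetch` parameter exists so the check can be exercised without a live cluster.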