Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

`events_created` metrics are unclear #11219

Open asachs01 opened 3 years ago

asachs01 commented 3 years ago

Current Behavior

When reviewing the `events_created` metrics, either in the UI or at the Prometheus endpoint, it's unclear what these metrics actually refer to:

```
# HELP gl_event_processor_events_created_total Generated from Dropwizard metric import (metric=org.graylog.events.processor.aggregation.AggregationEventProcessor.5faef5de3f94ee72d0a8c0e0.events_created, type=com.codahale.metrics.Meter)
# TYPE gl_event_processor_events_created_total counter
gl_event_processor_events_created_total{node="00d2902d-a2ab-4e9c-8e49-571ce91d269c",processor_id="5faef5de3f94ee72d0a8c0e0",type="aggregation",} 875283.0
gl_event_processor_events_created_total{node="00d2902d-a2ab-4e9c-8e49-571ce91d269c",processor_id="60d386c12eb0c235245d38e2",type="aggregation",} 3067.0
gl_event_processor_events_created_total{node="00d2902d-a2ab-4e9c-8e49-571ce91d269c",processor_id="60d5edb69e7daa03e6fe806e",type="aggregation",} 0.0
gl_event_processor_events_created_total{node="00d2902d-a2ab-4e9c-8e49-571ce91d269c",processor_id="60d386c12eb0c235245d38e6",type="correlation",} 1111.0
```

Expected Behavior

Ideally, the metrics would carry enough additional context that I don't have to work out what each ID refers to.

Possible Solution

More clarity could come from either additional labels in Prometheus that make the metric easier for Graylog operators to understand, or more human-readable names when the metrics are shown in the UI.

Context

The goal I'm trying to accomplish is to have visibility into what's going on inside my Graylog deployment. When IDs are used instead of human-readable references, this results in a lot of back and forth with API calls and writing Prometheus rewrite rules. It's not sustainable, and I imagine that for our users and customers this experience is less than ideal.
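To illustrate the kind of glue this currently requires, here is a minimal sketch in Python that reads the exposition text and swaps each `processor_id` for a title where one is known. It assumes the ID-to-title mapping has already been fetched separately (for example through the Graylog API); the function name `summarize` and the mapping contents are hypothetical, and the label parsing is deliberately naive.

```python
import re

# Matches one sample line of the metric shown in this issue.
SAMPLE = re.compile(
    r'^gl_event_processor_events_created_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
# Naive label parser: does not handle escaped quotes inside label values.
LABEL = re.compile(r'(\w+)="([^"]*)"')


def summarize(exposition: str, titles: dict[str, str]) -> list[tuple[str, str, float]]:
    """Return (title-or-id, type, value) for every events_created sample.

    `titles` maps event definition IDs to titles; it is assumed to be
    supplied by the caller and is not something the exporter provides.
    """
    rows = []
    for line in exposition.splitlines():
        match = SAMPLE.match(line.strip())
        if not match:
            continue  # skips HELP/TYPE comments and unrelated metrics
        labels = dict(LABEL.findall(match.group("labels")))
        processor_id = labels.get("processor_id", "")
        # Fall back to the raw ID when no title is known.
        name = titles.get(processor_id, processor_id)
        rows.append((name, labels.get("type", ""), float(match.group("value"))))
    return rows
```

Something like `summarize(scraped_text, {"5faef5de3f94ee72d0a8c0e0": "Failed logins"})` would then list that processor under a made-up title while the others keep their IDs, which is exactly the extra step operators would prefer not to maintain themselves.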


kroepke commented 3 years ago

One of the problems is that the titles of the event definitions are not stable and might contain characters not suitable for labels.

Those IDs have to be used internally to ensure uniqueness, and the exporter only sees the internal metric names, so it's not just a matter of using a different label, I'm afraid.

Potentially we can add additional labels like the title, but those, as said above, need to be post-processed and are not guaranteed to be unique.
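A rough sketch of what that post-processing could look like, assuming the title is emitted as an extra label next to the ID rather than replacing it. In the Prometheus text exposition format, label values only need backslash, double quote, and line feed escaped, so the title can be carried as a value, and keeping `processor_id` preserves uniqueness when two definitions share a title. The helper names and the `title` label are hypothetical, not part of Graylog's exporter.

```python
def escape_label_value(value: str) -> str:
    """Escape a string for use as a label value in the text exposition format."""
    # Backslash must be handled first so later escapes are not doubled.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def title_labels(processor_id: str, title: str) -> str:
    """Build a label fragment that keeps the ID and adds the escaped title.

    Hypothetical helper: the ID stays as the stable, unique key and the
    title is informational only.
    """
    return f'processor_id="{processor_id}",title="{escape_label_value(title)}"'
```

One trade-off to keep in mind: because labels are part of a series' identity, renaming an event definition would start a new time series under the new title.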

coffee-squirrel commented 1 month ago

I've been trying to improve our monitoring of Graylog (again; I check the current state of things periodically), and keep running into this general issue: having Graylog-internal IDs instead of something meaningful (e.g. display names) makes it difficult to understand the metrics when creating and viewing dashboards.

It'd be very nice to have titles in a label and/or an "official" example of how Graylog sees these ID-based metrics being useful within a dashboard (e.g. what's Graylog Cloud doing with their dashboards?).

18275, #13169