commercetools / mongodbatlas_exporter

MongoDB Atlas exporter for Prometheus
MIT License

Sometimes, some metrics are missing #8

Open Freyert opened 3 years ago

Freyert commented 3 years ago

We sometimes rely on mongodbatlas_processes_stats_connections to list clusters in our dashboards, but we've noticed that this metric is occasionally missing for some processes.

Freyert commented 3 years ago

Two issues I'm seeing:

There are no datapoints available

{"err":"no datapoints are available","level":"warn","measurement":{"DataPoints":[],"Units":"PERCENT"},"metric":"Desc{fqName: \"mongodbatlas_processes_stats_fts_process_cpu_kernel_percent\", help: \"Original measurements.name: 'FTS_PROCESS_CPU_KERNEL'. Please see MongoDB Atlas documentation for details about the measurement\", constLabels: {}, variableLabels: [project_id rs_name user_alias]}","msg":"skipping metric because of value transformation failure","timestamp":"2021-09-30T17:54:43.989063Z"}

And a second issue, caused by metrics that are missing from the map initialized at exporter start.

{"err":null,"level":"warn","metric":"Desc{fqName: \"mongodbatlas_processes_stats_oplog_master_lag_time_diff_seconds\", help: \"Original measurements.name: 'OPLOG_MASTER_LAG_TIME_DIFF'. Please see MongoDB Atlas documentation for details about the measurement\", constLabels: {}, variableLabels: [project_id rs_name user_alias]}","msg":"skipping metric because can't find matching measurement.\n\t\t\t\t\tIt seems to be not initialized during exporter start, you should restart the exporter","timestamp":"2021-09-30T17:54:43.98261Z"}

Freyert commented 3 years ago

In general it is returning 132 process metrics, so the page size is probably OK. The documentation indicates the call should return only 100 by default >:(

Freyert commented 3 years ago

When you specify granularity, you must specify either period or start and end.

Atlas retrieves database metrics every 20 minutes by default. Results include data points with 20 minute intervals.

What this tells me is that database metrics are collected every 20 minutes, and everything else is collected at some other interval.

There is also premium monitoring, which has a 10-second scrape interval. The page linked here has more on granularity.

https://docs.atlas.mongodb.com/monitor-cluster-metrics/#premium-monitoring-granularity
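To make the granularity rule quoted above concrete, here is a minimal Go sketch that builds the query string for a measurements request and rejects combinations the docs disallow. The function name `measurementQuery` is hypothetical and not part of the exporter, and the sketch assumes the values are passed as ISO 8601 strings such as `PT1M`. It also always requires a granularity, which is a simplification on my part.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// measurementQuery is a hypothetical helper (not the exporter's code).
// It enforces the documented rule: when granularity is set, the caller
// must pass either period, or both start and end, but not both forms.
func measurementQuery(granularity, period, start, end string) (string, error) {
	if granularity == "" {
		return "", errors.New("granularity is required in this sketch")
	}
	// A half-specified time range is always a mistake.
	if (start == "") != (end == "") {
		return "", errors.New("start and end must be given together")
	}
	hasPeriod := period != ""
	hasRange := start != "" && end != ""
	if hasPeriod == hasRange {
		return "", errors.New("specify either period, or start and end")
	}

	q := url.Values{}
	q.Set("granularity", granularity)
	if hasPeriod {
		q.Set("period", period)
	} else {
		q.Set("start", start)
		q.Set("end", end)
	}
	// Encode sorts parameters by key and escapes the values.
	return q.Encode(), nil
}

func main() {
	q, err := measurementQuery("PT1M", "PT5M", "", "")
	fmt.Println(q, err)
}
```

If the period is shorter than the interval at which Atlas collects a given measurement, the response can plausibly contain no datapoints for it, which would match the first warning in the logs.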

Freyert commented 3 years ago

I'm fairly sure this is an issue with the scrape interval/period.

https://github.com/commercetools/mongodbatlas_exporter/blob/master/mongodbatlas/mongodbatlas.go#L16-L19

Freyert commented 3 years ago

OK, so there are the FTS metrics. We're generally not interested in those because they are defined but empty unless you use FTS.

So if they have no datapoints, we should skip them.
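A minimal Go sketch of that skip, under stated assumptions: `Measurement` and `DataPoint` are simplified, hypothetical stand-ins whose field names mirror the log output earlier in this thread, and `latestValue` and `collect` are illustrative names, not the exporter's actual functions. The pointer-typed `Value` assumes that a datapoint can carry a null value.

```go
package main

import (
	"errors"
	"fmt"
)

// DataPoint and Measurement are hypothetical, simplified types.
type DataPoint struct {
	Timestamp string
	Value     *float64 // assumption: a datapoint value may be null
}

type Measurement struct {
	Name       string
	Units      string
	DataPoints []DataPoint
}

var errNoDatapoints = errors.New("no datapoints are available")

// latestValue returns the most recent non-nil value of a measurement,
// or errNoDatapoints when there is nothing usable.
func latestValue(m Measurement) (float64, error) {
	for i := len(m.DataPoints) - 1; i >= 0; i-- {
		if v := m.DataPoints[i].Value; v != nil {
			return *v, nil
		}
	}
	return 0, errNoDatapoints
}

// collect keeps measurements that have data and records the names of
// the ones it skipped, instead of treating empty ones as failures.
func collect(ms []Measurement) (map[string]float64, []string) {
	values := make(map[string]float64)
	var skipped []string
	for _, m := range ms {
		v, err := latestValue(m)
		if err != nil {
			skipped = append(skipped, m.Name)
			continue
		}
		values[m.Name] = v
	}
	return values, skipped
}

func main() {
	conns := 42.0
	ms := []Measurement{
		{Name: "CONNECTIONS", Units: "SCALAR", DataPoints: []DataPoint{
			{Timestamp: "2021-09-30T17:54:00Z", Value: &conns},
		}},
		{Name: "FTS_PROCESS_CPU_KERNEL", Units: "PERCENT"}, // defined but empty
	}
	values, skipped := collect(ms)
	fmt.Println(values, skipped)
}
```

Returning the skipped names separately lets the caller log them at debug level rather than as a warning on every scrape.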