percona / mongodb_exporter

A Prometheus exporter for MongoDB including sharding, replication and storage engines

wrong metrics #540

Closed: Codecaver closed this issue 4 months ago

Codecaver commented 2 years ago

Question: I have a three-node MongoDB replica set with one master and two slaves, and the exporter is deployed on each node. When looking at the metrics on the master node, I found an incorrect metric: mongodb_mongod_replset_my_state{set="rs0"} 0. In rs.status(), the state value of the primary node is 1, so the metric should also be 1.
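
For reference, the value this metric should mirror can be read directly from the replica set status. A minimal check with mongosh, using placeholder connection details (not the real host or credentials from this report):

# A PRIMARY should print 1 here; <user>, <password> and <primary-host> are placeholders.
mongosh --quiet "mongodb://<user>:<password>@<primary-host>:27017/admin" --eval 'rs.status().myState'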

Environment

The docker-compose.yaml for the master-node exporter is below:


version: "3"

services:
  mongodb-exporter:
    image: percona/mongodb_exporter:0.30.0
    container_name: 'mongodb-exporter'
    command:
    - '--mongodb.uri=mongodb://mongodb-exporter:********@172.31.**.**:27017'
    - '--compatible-mode'
    network_mode: "host"
    ports:
      - '9216:9216'
    restart: always

Running curl http://localhost:9216/metrics gives the output below: the value of mongodb_mongod_replset_my_state is 0, and mongodb_version_info{mongodb="server version is unavailable"} is 1. (A filtered version of this check is sketched after the output.)

# HELP mongodb_mongod_locks_time_acquiring_global_microseconds_total sum of serverStatus.locks.Global.timeAcquiringMicros.[r|w]
# TYPE mongodb_mongod_locks_time_acquiring_global_microseconds_total gauge
mongodb_mongod_locks_time_acquiring_global_microseconds_total 0
# HELP mongodb_mongod_replset_my_state An integer between 0 and 10 that represents the replica state of the current member
# TYPE mongodb_mongod_replset_my_state gauge
mongodb_mongod_replset_my_state{set="rs0"} 0
# HELP mongodb_mongod_replset_oplog_head_timestamp The timestamp of the newest change in the oplog
# TYPE mongodb_mongod_replset_oplog_head_timestamp gauge
mongodb_mongod_replset_oplog_head_timestamp 1.659950878e+09
# HELP mongodb_mongod_replset_oplog_tail_timestamp The timestamp of the oldest change in the oplog
# TYPE mongodb_mongod_replset_oplog_tail_timestamp gauge
mongodb_mongod_replset_oplog_tail_timestamp 1.659932637e+09
# HELP mongodb_mongod_storage_engine The storage engine used by the MongoDB instance
# TYPE mongodb_mongod_storage_engine gauge
mongodb_mongod_storage_engine{engine="Engine is unavailable"} 1
# HELP mongodb_mongod_wiredtiger_cache_evicted_total wiredtiger cache evicted total
# TYPE mongodb_mongod_wiredtiger_cache_evicted_total gauge
mongodb_mongod_wiredtiger_cache_evicted_total 0
# HELP mongodb_myState myState
# TYPE mongodb_myState untyped
mongodb_myState{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 1
# HELP mongodb_ok ok
# TYPE mongodb_ok untyped
mongodb_ok{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 1
# HELP mongodb_optimes_appliedOpTime_t optimes.appliedOpTime.
# TYPE mongodb_optimes_appliedOpTime_t untyped
mongodb_optimes_appliedOpTime_t{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 7793
# HELP mongodb_optimes_durableOpTime_t optimes.durableOpTime.
# TYPE mongodb_optimes_durableOpTime_t untyped
mongodb_optimes_durableOpTime_t{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 7793
# HELP mongodb_optimes_lastAppliedWallTime optimes.
# TYPE mongodb_optimes_lastAppliedWallTime untyped
mongodb_optimes_lastAppliedWallTime{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 1.659950878683e+12
# HELP mongodb_optimes_lastCommittedOpTime_t optimes.lastCommittedOpTime.
# TYPE mongodb_optimes_lastCommittedOpTime_t untyped
mongodb_optimes_lastCommittedOpTime_t{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 7793
# HELP mongodb_optimes_lastCommittedWallTime optimes.
# TYPE mongodb_optimes_lastCommittedWallTime untyped
mongodb_optimes_lastCommittedWallTime{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 1.659950878508e+12
# HELP mongodb_optimes_lastDurableWallTime optimes.
# TYPE mongodb_optimes_lastDurableWallTime untyped
mongodb_optimes_lastDurableWallTime{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 1.659950878674e+12
# HELP mongodb_optimes_readConcernMajorityOpTime_t optimes.readConcernMajorityOpTime.
# TYPE mongodb_optimes_readConcernMajorityOpTime_t untyped
mongodb_optimes_readConcernMajorityOpTime_t{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 7793
# HELP mongodb_optimes_readConcernMajorityWallTime optimes.
# TYPE mongodb_optimes_readConcernMajorityWallTime untyped
mongodb_optimes_readConcernMajorityWallTime{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 1.659950878508e+12
# HELP mongodb_syncSourceId syncSourceId
# TYPE mongodb_syncSourceId untyped
mongodb_syncSourceId{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} -1
# HELP mongodb_term term
# TYPE mongodb_term untyped
mongodb_term{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 7793
# HELP mongodb_up Whether MongoDB is up.
# TYPE mongodb_up gauge
mongodb_up 1
# HELP mongodb_version_info The server version
# TYPE mongodb_version_info gauge
mongodb_version_info{mongodb="server version is unavailable"} 1
# HELP mongodb_writeMajorityCount writeMajorityCount
# TYPE mongodb_writeMajorityCount untyped
mongodb_writeMajorityCount{cl_id="5e02cee0928ba9608b353934",cl_role="",rs_nm="rs0",rs_state="0"} 2
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.69
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 16
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.5265792e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.65994920741e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 7.38021376e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
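
For convenience, the same check can be done without scrolling through the full scrape; a small shell sketch against the same endpoint, using only the URL and metric names already shown above:

# Pull only the suspect series from the exporter on localhost:9216.
curl -s http://localhost:9216/metrics | grep -E '^mongodb_(mongod_replset_my_state|version_info)'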

And when I run rs.status() in mongosh, it shows that the value of state is 1:

        {
            "_id" : 2,
            "name" : "172.31.**.**:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 908000,
            "optime" : {
                "ts" : Timestamp(1659951155, 11916),
                "t" : NumberLong(7793)
            },
            "optimeDate" : ISODate("2022-08-08T09:32:35Z"),
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1659044340, 1),
            "electionDate" : ISODate("2022-07-28T21:39:00Z"),
            "configVersion" : 98053,
            "self" : true,
            "lastHeartbeatMessage" : ""

Moreover, this only happens with the exporter on the master node; the metrics collected by the other two slave nodes are correct. Note that all three nodes run the same exporter version. Is this a bug? I hope to get an answer.
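
For context, a quick way to compare what each node's exporter reports is to scrape all three and filter for the state metric; a shell sketch with placeholder hostnames:

# node1/node2/node3 are placeholders for the actual replica set members.
for h in node1 node2 node3; do
  printf '%s: ' "$h"
  curl -s "http://$h:9216/metrics" | grep '^mongodb_mongod_replset_my_state'
done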

vtomasr5 commented 1 year ago

Hi,

I have seen the same behaviour recently when upgrading PSMDB from v4.2.x to v4.4.x. We have a mongodb-exporter running on each instance.

On the v4.2 PSMDB nodes the metric mongodb_mongod_replset_my_state shows the right result (1 or 2), but on the v4.4 PSMDB nodes the mongodb_mongod_replset_my_state value is always 0. The v4.4 nodes are added to the existing replica set as SECONDARY, hidden members with no priority and no votes (see the sketch below). The replica set currently runs mixed versions due to the upgrade process.
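
For context, a hidden, non-voting member of that kind is typically added with a member config like the one below. This is only a sketch of the setup described above, not the exact commands used; hostnames are placeholders.

# Add a v4.4 node as a hidden, non-voting secondary (run against the current primary).
mongosh --quiet "mongodb://<primary-host>:27017" --eval '
  rs.add({ host: "<new-44-node>:27017", hidden: true, priority: 0, votes: 0 })
'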

I've also made the v4.4 nodes visible (hidden=false), but the metric results are the same.

The mongodb-exporter logs in debug mode show the right state, but the Prometheus metric does not.

I've been unable to reproduce the issue in previous environments or locally 😞

mongodb-exporter options:

  - "--compatible-mode"
  - "--mongodb.direct-connect"
  - "--mongodb.global-conn-pool"
  - "--discovering-mode"
  - "--collect-all"

Versions:

igroene commented 1 year ago

If you are running into this, make sure you are not disabling FTDC explicitly (it is enabled by default):

setParameter:
  diagnosticDataCollectionEnabled: true
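
A quick way to confirm the running value is the getParameter admin command; a sketch with a placeholder connection string:

# Check whether FTDC (diagnosticDataCollectionEnabled) is enabled on the running mongod.
mongosh --quiet "mongodb://<host>:27017/admin" --eval 'db.adminCommand({ getParameter: 1, diagnosticDataCollectionEnabled: 1 })'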