Closed ckopparthi closed 4 months ago
Hi @ckopparthi ,
I don't know about mongo_up, but the real issue is probably this:
Addr: mongodb-cs-cfg-0.mongodb-cs-cfg.dev.svc.cluster.local:27019, Type: Unknown, Average RTT: 0, Last error: connection() error occured during connection handshake: dial tcp: lookup mongodb-cs-cfg-0.mongodb-cs-cfg.dev.svc.cluster.local on 10.232.64.10:53: no such host
So it just can't connect. Could you please share some details on how you run it, and maybe share more logs?
BTW, we also have the Kubernetes operator, which runs with a monitoring sidecar container: https://www.percona.com/doc/kubernetes-operator-for-psmongodb/index.html and https://github.com/percona/percona-server-mongodb-operator
There is also the DBaaS feature in PMM, which can deploy clustered MongoDB: https://www.percona.com/doc/percona-monitoring-and-management/2.x/using/dbaas.html
Thanks, Denys
@denisok I restarted the mongodb-cs-cfg-0.mongodb-cs-cfg.dev.svc.cluster.local:27019 server, so it is expected to be unreachable. When the MongoDB server is not reachable, I guess mongo_up should be 0.
Ah, I see, so you are saying the metric is not sent. I will check.
@denisok Thanks for the quick reply.
@ckopparthi With PMM we pass a compatibility flag, so it is affected by a different bug. @percona-csalguero says it will be fixed by #348, so that even if there are no other metrics you will still get that metric.
@denisok Thanks for the update. Can you please let us know when this fix will be released? Do you have an estimated date?
@ckopparthi Let's see if we can make this milestone: https://github.com/percona/mongodb_exporter/milestone/3
I would say we should have a release before 5th Oct.
@denisok Great news. Will this be included in the latest PMM client? Should I upgrade the PMM client to get the latest mongodb_exporter?
Yes, it should land later in 2.23.0. Just for testing purposes: perconalab/pmm-client:2.23.0-rc3108
Do you mean, it still hangs in the latest version of mongodb_exporter?
On Thu, Oct 7, 2021 at 7:30 PM Denys Kondratenko @.***> wrote:
Looks like that wasn't the case; we also see that in our testing. If there is no connection, it hangs.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/percona/mongodb_exporter/issues/347#issuecomment-937821527, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO227I3PGTYO72HHGA67PZDUFWRWJANCNFSM5E22KYFA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
don't know yet
In the latest version, the mongodb_up metric reflects the current state.
I ran a test starting the exporter's sandbox, connected an exporter, then stopped the instance and checked the exporter's output:
# HELP mongodb_up Whether MongoDB is up.
# TYPE mongodb_up gauge
mongodb_up 0
In the exporter's log you can see the error, but that doesn't break the exporter:
ERRO[0172] cannot run getDiagnosticData: server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: 127.0.0.1:17001, Type: Unknown, Last error: connection() error occured during connection handshake: dial tcp 127.0.0.1:17001: connect: connection refused }, ] }
ERRO[0172] cannot decode getDiagnosticData: <nil> for data field: unexpected data type
@ckopparthi could you please confirm the fix?
@denisok This issue is not fixed. I followed the same process to reproduce it.
I see these logs in the pmm-server when I restart the MongoDB instance, which say mongo_up is set to 0:
INFO[2021-10-25T10:46:08.859+00:00] time="2021-10-25T10:46:08Z" level=error msg="Cannot get node type to check if this is a mongos: server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: mongodb-rs-e2e-2.mongodb-rs-e2e-rs.e2e.svc.cluster.local:27017, Type: Unknown, Last error: connection() error occured during connection handshake: dial tcp: lookup mongodb-rs-e2e-2 on XX.XXX.XX.XX:53: no such host }, ] }" agentID=/agent_id/70dc4772-ee15-457b-ba1b-3e7279487688 component=agent-process type=mongodb_exporter
INFO[2021-10-25T10:46:09.860+00:00] time="2021-10-25T10:46:09Z" level=error msg="cannot run getDiagnosticData: server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: mongodb-rs-e2e-2.mongodb-rs-e2e-rs.e2e.svc.cluster.local:27017, Type: Unknown, Last error: connection() error occured during connection handshake: dial tcp: lookup mongodb-rs-e2e-2.mongodb-rs-e2e-rson XX.XXX.XX.XX:53: no such host }, ] }" agentID=/agent_id/70dc4772-ee15-457b-ba1b-3e7279487688 component=agent-process type=mongodb_exporter
But the behavior is still the same: the metric is not updated when I query it from the Percona UI.
This is the mongo exporter version that I am using:
/usr/local/percona/pmm2/exporters/mongodb_exporter --version
mongodb_exporter - MongoDB Prometheus exporter
Version: v0.20.8
Commit: a41dd4b24fa5a335431fd2b3c8175eeb624084d2
Build date: 2021-10-19T09:55:48+0000
Hi @percona-csalguero,
I am using the percona/pmm-server image for testing, and I still see the issue. https://hub.docker.com/layers/percona/pmm-server/2.23.0/images/sha256-ff0bb20cba0dbfcc8929dbbba0558bb01acc933ec593717727707dce083441b4?context=explore
On Tue, Oct 26, 2021 at 5:10 PM Carlos Salguero @.***> wrote:
The exporter version is incorrect in latest rc:
./start-pmm.sh perconalab/pmm-server:2.23
Unable to find image 'perconalab/pmm-server:2.23' locally
2.23: Pulling from perconalab/pmm-server
Digest: sha256:ff0bb20cba0dbfcc8929dbbba0558bb01acc933ec593717727707dce083441b4
Status: Downloaded newer image for perconalab/pmm-server:2.23
b882345154254b54324fc62b2d69c49108a0ac339b537f741fdbcd9d223c8165
ddc1ecc6f51cf7cf75a66609665209843ac5c4c9edc28896f4f967729ac36c08
docker exec pmm-server /usr/local/percona/pmm2/exporters/mongodb_exporter --version
mongodb_exporter - MongoDB Prometheus exporter
Version: v0.20.8
Commit: a41dd4b24fa5a335431fd2b3c8175eeb624084d2
Build date: 2021-10-19T09:55:19+0000
I have this bash script to start PMM server:
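#!/bin/bash
# Image to run; defaults to the dev-latest build when no argument is given.
IMAGE="${1:-perconalab/pmm-server:dev-latest}"
# Create the data container that holds /srv.
docker create -v /srv --name pmm-data "${IMAGE}" /bin/true
# Start PMM Server with the data container's volumes attached.
docker run -d \
    -p 80:80 \
    -p 443:443 \
    --volumes-from pmm-data \
    --name pmm-server \
    -e PERCONA_TEST_DBAAS=1 \
    -e PERCONA_TEST_VERSION_SERVICE_URL=https://check-dev.percona.com/versions/v1 \
    "${IMAGE}"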
Then I ran:
./start-pmm.sh perconalab/pmm-server:2.23
I got my local IP address this way:
hostname -I | awk '{print $1}'
192.168.1.200
Then I entered the Docker container and ran the commands to add a MongoDB instance (I have the mongodb_exporter sandbox running):
docker exec -ti pmm-server bash
pmm-admin config --server-insecure-tls --server-url=https://admin:admin@127.0.0.1:443
pmm-admin add mongodb --host 192.168.1.200 --port 17001 --service-name=mongors1-1 --skip-connection-check
If I check the metric:
ID=$(pmm-admin list | grep mongodb_exporter | awk '{print $4}')
curl --silent -u "pmm:$ID" 'http://localhost:42002/metrics' | grep 'mongodb_up'
Output:
# HELP mongodb_up Whether MongoDB is up.
# TYPE mongodb_up gauge
mongodb_up 1
Then in another terminal I took down the mongo instance:
docker stop mongo-1-1
Then I got the metric again (inside the PMM container):
curl --silent -u "pmm:$ID" 'http://localhost:42002/metrics' | grep 'mongodb_up'
# HELP mongodb_up Whether MongoDB is up.
# TYPE mongodb_up gauge
mongodb_up 0
Could you please check if my steps are correct to reproduce the issue? Thanks
Yes @Carlos Salguero, the steps are correct to reproduce the issue.
@ckopparthi The steps are correct, but we aren't able to reproduce it. Maybe you could enable debug logs and provide them?
@denisok Will update you with the required trace as soon as possible.
Maybe the dubious point is that the exporter log says "mongo_up is set to 0" while the metric name is "mongodb_up"; I think the different name caused the exception. Looking forward to your answer, thanks.
It is also not working for me. The ticket for this was https://jira.percona.com/browse/PMM-8954, but I tried the Linux builds of mongodb_exporter 0.20.8, 0.20.9 and 0.30.0 and none of them work (I have a non-container environment)! The exporter's log gives: "Cannot connect to MongoDB, server selection error, context canceled bla bla"....
This is kind of a big deal, because I installed the exporters just for that single info to see if the mongodb is running or not!
My server is Ubuntu 18, and the version of mongo running is 4.4.1
tnx, Tom
@tmikulin Could you please provide more detail? If mongodb_exporter never managed to connect to mongo, it can't report any issues; how would it know whether mongo is up or down? Once it has connected, yes, it should still report mongodb_up, but with 0.
In your case, does it happen on start? Do you provide the right creds?
@denisok My issue is as described in this ticket: https://jira.percona.com/browse/PMM-8954. Specifically, this doesn't work: "we are expecting that mongodb_exporter should respond even mongodb database stopped and it should give mongodb_up=0"
When mongodb stops working, the mongodb exporter doesn't give mongodb_up=0 but an error that it can't connect to the db. So how can I find out from the exporter whether mongodb is working or not?
@tmikulin Could you please provide logs? As you can see, the original ticket is solved, so it should work.
Please enable --log.level=debug and connect to MongoDB, then take MongoDB down. In parallel, could you please gather curl output from the metrics endpoint, both when it works and when it doesn't? Also give it a little time, 5 minutes or so, to see whether some large timeouts are involved.
And please provide logs and outputs of your experiment.
We have tried to reproduce this issue, but it was working on our side.
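One way to gather the requested before/after evidence is a small sampling helper along these lines. This is only a sketch: the function names, the 10-second curl timeout, the 15-second interval, and the use of port 9216 (taken from the docker run command in this thread) are assumptions, so adjust them to your setup.

```shell
#!/bin/sh
# Sketch: sample the exporter's /metrics endpoint and log what mongodb_up reports.
# Function names, the 10 s timeout and the example URL are illustrative assumptions.

# Read Prometheus exposition text on stdin and print the mongodb_up value
# (with or without labels), or "missing" when the metric is absent.
extract_mongodb_up() {
    awk '
        /^mongodb_up[{ ]/ { print $NF; found = 1; exit }
        END { if (!found) print "missing" }
    '
}

# Scrape once and print a timestamped line: the value, "missing", or "scrape-failed"
# (curl error, timeout, empty reply, or an HTTP error status).
sample_once() {
    url=$1
    if body=$(curl --silent --fail --max-time 10 "$url" 2>/dev/null); then
        value=$(printf '%s\n' "$body" | extract_mongodb_up)
    else
        value="scrape-failed"
    fi
    printf '%s mongodb_up=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$value"
}

# Example loop (commented out so that sourcing this file has no side effects):
# while true; do sample_once "http://localhost:9216/metrics"; sleep 15; done
```

Running the loop while stopping and restarting MongoDB gives a timeline that distinguishes "exporter answered with mongodb_up 0" from "exporter did not answer at all", which is the distinction this thread keeps coming back to.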
I think the version of the mongodb server is why you get different results. To recreate it I used these files:
version: '3.7'
services:
  mongodb_container:
    image: mongo:latest
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: rootpassword
    ports:
      - 27017:27017
    volumes:
      - mongodb_data_container:/data/db
volumes:
  mongodb_data_container:
and for running the mongodb exporter I used the most recent version:
docker run -d --name mongodb_exporter -p 9216:9216 -e "MONGODB_URI=mongodb://root:rootpassword@IP_ADDRESS:27017" --ip=IP_ADDRESS percona/mongodb_exporter:0.30
Stop the mongodb server container, and the mongodb exporter just doesn't respond anymore...
tnx, Tom
@tmikulin I don't remember us adding environment variable support to mongodb_exporter. Could you please pass --mongodb.uri instead:
docker run -d -p 9216:9216 -p 27017:27017 --ip=IP_ADDRESS percona/mongodb_exporter:0.30 --mongodb.uri=mongodb://root:rootpassword@192.168.0.19:27017
same thing man....the exporter stops working....
I am currently working on PMM-9312 and I made the fix as part of that ticket.
This issue is a real problem: the exporter does not handle the case when mongo isn't running, so it fails with a timeout when you try to access /metrics, as can be seen here:
:~$ curl localhost:6157
<html>
<head>
<title>MongoDB exporter</title>
</head>
<body>
<h1>MongoDB exporter</h1>
<p><a href="/metrics">Metrics</a></p>
</body>
</html>
:~$ curl localhost:6157/metrics
An error has occurred while connecting to MongoDB:
cannot connect to MongoDB: server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: localhost:27017, Type: Unknown, Last error: connection() error occured during connection handshake: dial tcp [::1]:27017: connect: connection refused }, ] }
If mongodb is up, all metrics are shown properly.
Also, passing secrets via an environment variable is probably the best solution if you don't have a config file. Passing the URI as the --mongodb.uri parameter exposes secrets in the process arguments, which matters if you collect the list of running processes and send that data somewhere else.
IMO, just set a timeout for the connection to mongodb and, on timeout, show mongodb_up 0.
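The suggestion above can be illustrated from outside the exporter. The following is a stopgap sketch, not the exporter's own fix: a hypothetical wrapper function (the name scrape_with_fallback and the 5-second default are assumptions) bounds the request with a timeout and prints mongodb_up 0 in the same exposition format whenever the scrape fails.

```shell
#!/bin/sh
# Sketch: bound the scrape with a timeout and fall back to "mongodb_up 0".
# scrape_with_fallback and its default timeout are hypothetical, not part of mongodb_exporter.

scrape_with_fallback() {
    url=$1
    max_time=${2:-5}
    # --fail makes HTTP error responses count as failures;
    # --max-time bounds the whole request, so a hanging exporter cannot block us.
    if body=$(curl --silent --fail --max-time "$max_time" "$url" 2>/dev/null); then
        printf '%s\n' "$body"
    else
        printf '# HELP mongodb_up Whether MongoDB is up.\n'
        printf '# TYPE mongodb_up gauge\n'
        printf 'mongodb_up 0\n'
    fi
}

# Example (commented out): scrape_with_fallback "http://localhost:9216/metrics" 5
```

Note the trade-off: this wrapper cannot tell "MongoDB is down" apart from "the exporter itself is down or unreachable", which is why a fix inside the exporter is still preferable.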
Same issue for me: mongodb_exporter doesn't set mongodb_up=0 when the mongodb instance goes down. I tried mongodb_exporter 0.29.0, 0.31.1, 0.31.2 and 0.32.0, and none of these versions fixes the issue. Looking forward to a solution in the next release, thanks a lot.
I believe this was fixed :) Please check the latest version. There were a couple of bugs that we fixed for the case when mongo is not connected.
Still not returning mongodb_up 0 when mongo is down; I am getting an empty response from the exporter's /metrics endpoint.
Exporter version
:~$ /opt/prometheus/bin/mongodb_exporter --version
mongodb_exporter - MongoDB Prometheus exporter
Version: v0.34.0
Commit: 5c3358c
Build date: 2022-08-05T09:18:49Z
Check:
:~$ systemctl status mongod.service
● mongod.service - MongoDB Database Server
Loaded: loaded (/lib/systemd/system/mongod.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Thu 2022-09-15 08:22:06 CEST; 16s ago
Docs: https://docs.mongodb.org/manual
Process: 898445 ExecStart=/usr/bin/mongod --config /etc/mongod.conf (code=exited, status=0/SUCCESS)
Main PID: 898445 (code=exited, status=0/SUCCESS)
CPU: 1.484s
:~$ curl 'localhost:6157/metrics'
curl: (52) Empty reply from server
:~$ systemctl start mongod.service
:~$ systemctl status mongod.service
● mongod.service - MongoDB Database Server
Loaded: loaded (/lib/systemd/system/mongod.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2022-09-15 08:22:55 CEST; 3s ago
Docs: https://docs.mongodb.org/manual
Main PID: 898638 (mongod)
Memory: 162.3M
CPU: 1.242s
CGroup: /system.slice/mongod.service
└─898638 /usr/bin/mongod --config /etc/mongod.conf
:~$ curl -s 'localhost:6157/metrics' | grep "_up"
# HELP mongodb_up Whether MongoDB is up.
# TYPE mongodb_up gauge
mongodb_up 1
Error logged when mongo is down:
time="2022-09-15T08:30:57+02:00" level=error msg="Cannot connect to MongoDB: cannot connect to MongoDB: server selection error: context deadline exceeded, current topology: { Type: Single, Servers: [{ Addr: 127.0.0.1:27017, Type: Unknown, Last error: connection() error occurred during connection handshake: dial tcp 127.0.0.1:27017: connect: connection refused }, ] }"
2022/09/15 08:30:57 http: panic serving [::1]:53406: runtime error: invalid memory address or nil pointer dereference
goroutine 60 [running]:
net/http.(*conn).serve.func1()
/opt/hostedtoolcache/go/1.17.12/x64/src/net/http/server.go:1802 +0xb9
panic({0xb36820, 0x1264f80})
/opt/hostedtoolcache/go/1.17.12/x64/src/runtime/panic.go:1047 +0x266
go.mongodb.org/mongo-driver/mongo.newDatabase(0x0, {0xbf9cca, 0x5}, {0x0, 0x30, 0xc000088000})
/home/runner/go/pkg/mod/go.mongodb.org/mongo-driver@v1.9.1/mongo/database.go:47 +0x5c
go.mongodb.org/mongo-driver/mongo.(*Client).Database(...)
/home/runner/go/pkg/mod/go.mongodb.org/mongo-driver@v1.9.1/mongo/client.go:837
github.com/percona/mongodb_exporter/exporter.getClusterRole({0xd97ce0, 0xc000402a80}, 0x40d1f4)
/home/runner/work/mongodb_exporter/mongodb_exporter/exporter/topology_info.go:172 +0x8b
github.com/percona/mongodb_exporter/exporter.(*topologyInfo).loadLabels(0xc000472180, {0xd97ce0, 0xc000402a80})
/home/runner/work/mongodb_exporter/mongodb_exporter/exporter/topology_info.go:105 +0xeb
github.com/percona/mongodb_exporter/exporter.newTopologyInfo({0xd97ce0, 0xc000402a80}, 0x0, 0xc0003165b0)
/home/runner/work/mongodb_exporter/mongodb_exporter/exporter/topology_info.go:75 +0xb0
github.com/percona/mongodb_exporter/exporter.(*Exporter).Handler.func1({0xd94f80, 0xc00043e000}, 0xc00043a000)
/home/runner/work/mongodb_exporter/mongodb_exporter/exporter/exporter.go:304 +0x3ab
net/http.HandlerFunc.ServeHTTP(0x0, {0xd94f80, 0xc00043e000}, 0x0)
/opt/hostedtoolcache/go/1.17.12/x64/src/net/http/server.go:2047 +0x2f
net/http.(*ServeMux).ServeHTTP(0x0, {0xd94f80, 0xc00043e000}, 0xc00043a000)
/opt/hostedtoolcache/go/1.17.12/x64/src/net/http/server.go:2425 +0x149
net/http.serverHandler.ServeHTTP({0xc000420b10}, {0xd94f80, 0xc00043e000}, 0xc00043a000)
/opt/hostedtoolcache/go/1.17.12/x64/src/net/http/server.go:2879 +0x43b
net/http.(*conn).serve(0xc0004126e0, {0xd97d18, 0xc00032e180})
/opt/hostedtoolcache/go/1.17.12/x64/src/net/http/server.go:1930 +0xb08
created by net/http.(*Server).Serve
/opt/hostedtoolcache/go/1.17.12/x64/src/net/http/server.go:3034 +0x4e8
I believe this panic was fixed in main by @trvrnrth on October 6th (commit fffcfe8332a5de5a51618662d3f379c288fce7b0), but it's not part of the latest release, since the release branch was created before that.
That's possible; I am not testing master, since there are release builds. The latest v0.35 doesn't include this commit from 6th Oct.
fixed as part of https://github.com/percona/mongodb_exporter/pull/653
I am running the pmm-agent in the Kubernetes cluster as a StatefulSet. I restarted the MongoDB pod, and mongodb_up is not set to zero.
pmm-agent and pmm-server version: 2.22.0
When the MongoDB pod restarts, below is the log that I get
The logs say that mongo_up is set to 0, but it is not set. It looks as if the metrics are not scraped.
Output of pmm-admin status