Hi,
Could you execute sh.status() to see if there are shards registered?
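In case it helps, a minimal sketch of how to run it against one of the mongos pods (pod name and password are placeholders):
# Print the sharding status from inside a mongos pod
kubectl exec -it <mongos-pod-name> -- mongo admin -u root -p <root-password> --eval "sh.status()"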
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Well, I have the same issue. Running sh.status() gives me:
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5f84971b718dcf5f1661ee66")
}
shards:
active mongoses:
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
I am running the chart on AKS version 1.17.11.
Hi,
Which parameters did you use to install the chart? We are not getting this error in our daily testing.
Well, I simply set up a new AKS cluster on Azure and then ran:
helm install my-release bitnami/mongodb-sharded
That's all I did.
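For completeness, a minimal sketch of that setup, assuming the Bitnami repo still needs to be added (the release name is just an example):
# Add the Bitnami repo and install the chart with default values
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-release bitnami/mongodb-sharded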
Hi,
Did the logs show anything meaningful? Was there any container restart?
Well,
kubectl logs <pod> doesn't show anything meaningful.
However, when I check the pod statuses, it looks like this:
λ kubectl get pods
NAME READY STATUS RESTARTS AGE
mongo1-mongodb-sharded-configsvr-0 1/1 Running 0 2d
mongo1-mongodb-sharded-mongos-5ccd8fbc96-dpskm 1/1 Running 0 2d
mongo1-mongodb-sharded-shard0-data-0 1/1 Running 36 2d
mongo1-mongodb-sharded-shard1-data-0 1/1 Running 36 2d
Checking the events, the only thing mentioned is a failed liveness probe:
LAST SEEN TYPE REASON OBJECT MESSAGE
12m Normal Pulled pod/mongo1-mongodb-sharded-shard0-data-0 Container image "docker.io/bitnami/mongodb-sharded:4.4.1-debian-10-r12" already present on machine
12m Normal Created pod/mongo1-mongodb-sharded-shard0-data-0 Created container mongodb
12m Normal Started pod/mongo1-mongodb-sharded-shard0-data-0 Started container mongodb
6m29s Normal Pulled pod/mongo1-mongodb-sharded-shard1-data-0 Container image "docker.io/bitnami/mongodb-sharded:4.4.1-debian-10-r12" already present on machine
6m29s Normal Created pod/mongo1-mongodb-sharded-shard1-data-0 Created container mongodb
6m29s Normal Started pod/mongo1-mongodb-sharded-shard1-data-0 Started container mongodb
6m30s Warning Unhealthy pod/mongo1-mongodb-sharded-shard1-data-0 Liveness probe failed:
Hi,
Seeing the pod statuses, there were several restarts. I imagine that the initial issues that prevented the shard from being created were in the logs of the first execution. I think that running kubectl logs with the --previous flag will not work now, but just in case, could you try running it?
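Something along these lines, using one of the shard pods from your kubectl get pods output:
# --previous shows the logs of the last terminated container instance
kubectl logs mongo1-mongodb-sharded-shard0-data-0 --previous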
There's a short message about being unable to join the cluster:
mongodb 12:37:20.07 INFO ==> ** Starting MongoDB Sharded setup **
mongodb 12:37:20.09 INFO ==> Validating settings in MONGODB_* env vars...
mongodb 12:37:20.11 INFO ==> Initializing MongoDB Sharded...
mongodb 12:37:20.13 INFO ==> Writing keyfile for replica set authentication...
mongodb 12:37:20.14 INFO ==> Enabling authentication...
mongodb 12:37:20.15 INFO ==> Deploying MongoDB Sharded with persisted data...
mongodb 12:37:20.16 INFO ==> Trying to connect to MongoDB server mongo1-mongodb-sharded...
mongodb 12:37:20.16 INFO ==> Found MongoDB server listening at mongo1-mongodb-sharded:27017 !
mongodb 12:37:20.27 INFO ==> MongoDB server listening and working at mongo1-mongodb-sharded:27017 !
mongodb 12:37:21.58 INFO ==> Joining the shard cluster
mongodb 13:57:11.75 ERROR ==> Unable to join the sharded cluster
mongodb 13:57:11.75 INFO ==> Stopping MongoDB..
I performed another test, this time using my local cluster (minikube). Everything looks OK there, so it seems the sharded cluster has a problem when run on AKS.
Hi,
This is strange; we test AKS clusters daily and have found no issues. Could it be because of the persistence technology you are using? Does it work without persistence enabled? (Just to find the root cause of the issue.)
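For instance, something like this should deploy without persistent volumes (a sketch, assuming the chart exposes a persistence.enabled toggle per component, as other Bitnami charts do):
# Disable persistence on both the shard data nodes and the config servers
helm install my-release bitnami/mongodb-sharded \
  --set shardsvr.persistence.enabled=false \
  --set configsvr.persistence.enabled=false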
Hi, I am currently experiencing the same problem. I have also tried installing an older version of the chart, with the same result.
The shard is still joining after a few hours:
==> ** Starting MongoDB Sharded setup **
==> Validating settings in MONGODB_* env vars...
==> Initializing MongoDB Sharded...
==> Writing keyfile for replica set authentication...
==> Enabling authentication...
==> Deploying MongoDB Sharded with persisted data...
==> Trying to connect to MongoDB server mongosharding-mongodb-sharded...
==> Found MongoDB server listening at mongosharding-mongodb-sharded:27017 !
==> MongoDB server listening and working at mongosharding-mongodb-sharded:27017 !
==> Joining the shard cluster
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("XXX")
}
shards:
active mongoses:
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
I have absolutely no clue what else I could try.
Edit: Sorry, I forgot to mention I am also using AKS.
Edit 2: Whenever I install the chart there is a restart of the shard pods.
Edit 3: I now got it running. I used the command
helm install mongotest azure-marketplace/mongodb-shared
And now it works
Hi,
Thanks for letting us know. What is the difference between this second attempt and the initial one? Maybe a different chart version?
Having this issue on DOKS too :~(
Nevermind I'm stupid
Hello @darnfish, did you solve this issue?
Yeah I was able to solve it
01:01:32.00 INFO ==> Setting node as primary
mongodb 01:01:32.03
mongodb 01:01:32.03 Welcome to the Bitnami mongodb-sharded container
mongodb 01:01:32.03 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-mongodb-sharded
mongodb 01:01:32.04 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-mongodb-sharded/issues
mongodb 01:01:32.04
mongodb 01:01:32.04 INFO ==> ** Starting MongoDB Sharded setup **
mongodb 01:01:32.07 INFO ==> Validating settings in MONGODB_* env vars...
mongodb 01:01:32.09 INFO ==> Initializing MongoDB Sharded...
mongodb 01:01:32.12 INFO ==> Writing keyfile for replica set authentication...
mongodb 01:01:32.14 INFO ==> Enabling authentication...
mongodb 01:01:32.15 INFO ==> Deploying MongoDB Sharded with persisted data...
mongodb 01:01:32.17 INFO ==> Trying to connect to MongoDB server mongodb-sharded...
mongodb 01:01:32.18 INFO ==> Found MongoDB server listening at mongodb-sharded:27017 !
mongodb 01:01:32.36 INFO ==> MongoDB server listening and working at mongodb-sharded:27017 !
mongodb 01:01:33.86 INFO ==> Joining the shard cluster
mongodb 02:21:23.97 ERROR ==> Unable to join the sharded cluster
mongodb 02:21:24.01 INFO ==> Stopping MongoDB...
NAME READY STATUS RESTARTS AGE
mongodb-sharded-configsvr-0 1/1 Running 0 16d
mongodb-sharded-configsvr-1 1/1 Running 0 16d
mongodb-sharded-mongos-75fbcfb584-gvkzf 1/1 Running 3 16d
mongodb-sharded-mongos-75fbcfb584-x7xqq 1/1 Running 3 16d
mongodb-sharded-shard0-data-0 1/1 Running 306 16d
mongodb-sharded-shard1-data-0 1/1 Running 307 16d
Hi! Could you launch it with BITNAMI_DEBUG=true to see more information about the issue?
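For example, a sketch of passing that variable through the chart (the common.extraEnvVars parameter is an assumption; check the values.yaml of your chart version):
# Hypothetical flags: inject BITNAMI_DEBUG into the containers via extra env vars
helm install my-release bitnami/mongodb-sharded \
  --set 'common.extraEnvVars[0].name=BITNAMI_DEBUG' \
  --set-string 'common.extraEnvVars[0].value=true'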
The issue for me was that the 'shardN' instances were only listening on localhost, therefore the configsvr was unable to connect to them. I think there are cases where the '/db' directory has already been created but the pod somehow gets destroyed, and the rewrite-config step is then skipped, which makes joining the sharded cluster fail.
See https://github.com/bitnami/bitnami-docker-mongodb-sharded/blob/master/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh (currently line 67)
It seems "mongodb_set_listen_all_conf" should always happen before joining the sharded cluster?
Looking at the function, at the end it always sets the listen address to all. Shouldn't that be enough? Not sure if I'm missing something.
The 'listen to all addresses' setting should already be in place when the 'join shard in cluster' step runs. With the current flow it is possible that 'join shard in cluster' happens while the server is only listening on localhost, causing the timeouts.
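A quick way to check this on a running pod, as a sketch (the pod name is from the earlier output; the config path is the one normally used by the Bitnami image):
# If the net section still shows bindIp: 127.0.0.1 after a restart, the node is stuck on localhost
kubectl exec -it mongo1-mongodb-sharded-shard0-data-0 -- grep -A3 'net:' /opt/bitnami/mongodb/conf/mongodb.conf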
This is the fix that worked for me during a test, although proving that something doesn't happen can be difficult.
diff --git a/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh b/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh
index b04ace6..ae9bb00 100644
--- a/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh
+++ b/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh
@@ -78,6 +78,7 @@ mongodb_sharded_mongod_initialize() {
mongodb_set_replicasetmode_conf
fi
+ mongodb_set_listen_all_conf
if [[ "$MONGODB_SHARDING_MODE" = "shardsvr" ]] && [[ "$MONGODB_REPLICA_SET_MODE" = "primary" ]]; then
mongodb_wait_for_node "$MONGODB_MONGOS_HOST" "$MONGODB_MONGOS_PORT_NUMBER" "root" "$MONGODB_ROOT_PASSWORD"
if ! mongodb_sharded_shard_currently_in_cluster "$MONGODB_REPLICA_SET_NAME"; then
@@ -86,7 +87,6 @@ mongodb_sharded_mongod_initialize() {
info "Shard already in cluster"
fi
fi
- mongodb_set_listen_all_conf
}
@basneder could you share the configuration to reproduce the issue? Version of the chart and the values that differ from the default ones. Thanks!
Using chart version 3.4.7.
The changed settings from the default are minimal, just a dev/test setup with 2 shards.
configsvr:
  replicas: 1
mongodbRootPassword: XXX
service:
  nodePort: 30000
  type: NodePort
shards: 2
shardsvr:
  dataNode:
    replicas: 1
    resources:
      limits:
        memory: 512Mi
  persistence:
    size: 25Gi
From examining the logs, the following happens:
1. The shards are started for the first time (with the persistent database setup).
2. The config file is updated to 'bindAll' @ line 67.
3. This takes too long for K8s and a liveness check fails; the pod is reaped (and will be restarted).
4. The pod is restarted.
5. Now the initial steps are skipped (including updating the config with the 'bindAll' directive), because the directory '/db' already exists @ line 53. The crucial skipped line is @ 67. This means the second run only listens on localhost.
6. It now tries to join the shard cluster with only localhost enabled @ line 84, and this will never succeed, because joining the shard cluster requires external connectivity.
(A probe-tuning workaround is sketched right after this list.)
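Until a fixed image is available, one possible workaround (a sketch; the probe parameter paths are an assumption, please check the chart's values.yaml for your version) is to give the first initialization more headroom so the liveness probe does not reap the pod mid-setup:
# Hypothetical values: raise the shard data node liveness probe delay and threshold
helm upgrade my-release bitnami/mongodb-sharded \
  --set shardsvr.dataNode.livenessProbe.initialDelaySeconds=180 \
  --set shardsvr.dataNode.livenessProbe.failureThreshold=6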
Thanks for the detailed explanation, this explains why it is happening from time to time depending on the k8s cluster and/or the values set for the probes. Would you like to send a PR improving the current logic with the fix you implemented?
I have created: https://github.com/bitnami/bitnami-docker-mongodb-sharded/pull/24
This is my first pull request; please let me know if anything is lacking.
Thanks! The team will review it and provide feedback, usually within a business day
Hi, I still have this exact same issue, using chart=mongodb-sharded-4.0.21... Do you have any idea how we could solve this? Thanks for your help!
A chart upgrade seems to solve this issue.
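For anyone landing here later, the upgrade itself is short (the release name is just an example):
# Refresh the repo index and upgrade the existing release to the latest chart version
helm repo update
helm upgrade my-release bitnami/mongodb-sharded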
bitnami/mongodb-sharded:
Description
On creating a basic environment, I am unable to create a new DB. I am able to add collections to the system and config databases. On attempting to create a new DB I get the following error:
Unable to initialize targeter for write op for collection test.temp :: caused by :: Database test not found :: caused by :: No shards found
Steps to reproduce the issue:
1. Create the basic environment using the chart bitnami/mongodb-sharded.
2. Create a user in the admin database and log in via the mongo shell.
3. Attempt to create a DB and insert data; the error above is returned (a hypothetical session is sketched below).
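Roughly, a session like this reproduces it from outside the cluster (pod name, credentials, and the test database are placeholders):
# Write to a brand-new database through mongos; with no shards registered this fails
kubectl exec -it <mongos-pod-name> -- mongo admin -u root -p <root-password> \
  --eval 'db.getSiblingDB("test").temp.insertOne({ hello: "world" })'
# This fails with the "No shards found" error quoted above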
Describe the results you received: I receive the error shown above.
Version of Helm and Kubernetes:
helm version:
kubectl version: