Hi,
Could you execute sh.status() to see if there are shards registered?
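In case it helps, a minimal sketch of how to run it against one of the mongos pods (pod name and password are placeholders):
# Print the sharding status from inside a mongos pod
kubectl exec -it <mongos-pod-name> -- mongo admin -u root -p <root-password> --eval "sh.status()"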
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Well, I have the same issue. Running sh.status() gives me:
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5f84971b718dcf5f1661ee66")
}
shards:
active mongoses:
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
I am running the chart on AKS version 1.17.11.
Hi,
Which parameters did you use to install the chart? We are not getting this error in our daily testing.
Well, I simply set up a new AKS cluster on Azure and then ran:
helm install my-release bitnami/mongodb-sharded
That's all I did.
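For completeness, a minimal sketch of that setup, assuming the Bitnami repo still needs to be added (the release name is just an example):
# Add the Bitnami repo and install the chart with default values
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-release bitnami/mongodb-sharded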
Hi,
Did the logs show anything meaningful? Was there any container restart?
Well,
kubectl logs <pod> doesn't show anything meaningful.
However, when I check the pod statuses, it looks like this:
λ kubectl get pods
NAME READY STATUS RESTARTS AGE
mongo1-mongodb-sharded-configsvr-0 1/1 Running 0 2d
mongo1-mongodb-sharded-mongos-5ccd8fbc96-dpskm 1/1 Running 0 2d
mongo1-mongodb-sharded-shard0-data-0 1/1 Running 36 2d
mongo1-mongodb-sharded-shard1-data-0 1/1 Running 36 2d
Checking the events, the only thing mentioned is a failed liveness probe:
LAST SEEN TYPE REASON OBJECT MESSAGE
12m Normal Pulled pod/mongo1-mongodb-sharded-shard0-data-0 Container image "docker.io/bitnami/mongodb-sharded:4.4.1-debian-10-r12" already present on machine
12m Normal Created pod/mongo1-mongodb-sharded-shard0-data-0 Created container mongodb
12m Normal Started pod/mongo1-mongodb-sharded-shard0-data-0 Started container mongodb
6m29s Normal Pulled pod/mongo1-mongodb-sharded-shard1-data-0 Container image "docker.io/bitnami/mongodb-sharded:4.4.1-debian-10-r12" already present on machine
6m29s Normal Created pod/mongo1-mongodb-sharded-shard1-data-0 Created container mongodb
6m29s Normal Started pod/mongo1-mongodb-sharded-shard1-data-0 Started container mongodb
6m30s Warning Unhealthy pod/mongo1-mongodb-sharded-shard1-data-0 Liveness probe failed:
Hi,
Seeing the pod statuses, there were several restarts. I imagine that the initial issues that prevented the shard from being created were in the logs of the first execution. I think that running kubectl logs with the --previous flag will not work now, but just in case, could you try running it?
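Something along these lines, using one of the shard pods from your kubectl get pods output:
# --previous shows the logs of the last terminated container instance
kubectl logs mongo1-mongodb-sharded-shard0-data-0 --previous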
There's a short message about being unable to join the cluster:
mongodb 12:37:20.07 INFO ==> ** Starting MongoDB Sharded setup **
mongodb 12:37:20.09 INFO ==> Validating settings in MONGODB_* env vars...
mongodb 12:37:20.11 INFO ==> Initializing MongoDB Sharded...
mongodb 12:37:20.13 INFO ==> Writing keyfile for replica set authentication...
mongodb 12:37:20.14 INFO ==> Enabling authentication...
mongodb 12:37:20.15 INFO ==> Deploying MongoDB Sharded with persisted data...
mongodb 12:37:20.16 INFO ==> Trying to connect to MongoDB server mongo1-mongodb-sharded...
mongodb 12:37:20.16 INFO ==> Found MongoDB server listening at mongo1-mongodb-sharded:27017 !
mongodb 12:37:20.27 INFO ==> MongoDB server listening and working at mongo1-mongodb-sharded:27017 !
mongodb 12:37:21.58 INFO ==> Joining the shard cluster
mongodb 13:57:11.75 ERROR ==> Unable to join the sharded cluster
mongodb 13:57:11.75 INFO ==> Stopping MongoDB..
I performed another test, this time using my local cluster (minikube). Everything looks OK there, so it seems the sharded cluster has a problem when run on AKS.
Hi,
This is strange; we test AKS clusters daily and have found no issues. Could it be because of the persistence technology you are using? Does it work without persistence enabled? (Just to find the root cause of the issue.)
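For instance, something like this should deploy without persistent volumes (a sketch, assuming the chart exposes a persistence.enabled toggle per component, as other Bitnami charts do):
# Disable persistence on both the shard data nodes and the config servers
helm install my-release bitnami/mongodb-sharded \
  --set shardsvr.persistence.enabled=false \
  --set configsvr.persistence.enabled=false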
Hi, I am currently experiencing the same problem. I have also tried installing an older version of the chart, with the same result.
The shard is still joining after a few hours:
==> ** Starting MongoDB Sharded setup **
==> Validating settings in MONGODB_* env vars...
==> Initializing MongoDB Sharded...
==> Writing keyfile for replica set authentication...
==> Enabling authentication...
==> Deploying MongoDB Sharded with persisted data...
==> Trying to connect to MongoDB server mongosharding-mongodb-sharded...
==> Found MongoDB server listening at mongosharding-mongodb-sharded:27017 !
==> MongoDB server listening and working at mongosharding-mongodb-sharded:27017 !
==> Joining the shard cluster
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("XXX")
}
shards:
active mongoses:
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
I have absolutely no clue what else I could try.
Edit: Sorry, I forgot to mention I am also using AKS.
Edit 2: Whenever I install the chart there is a restart of the shard pods.
Edit 3: I now got it running. I used the command
helm install mongotest azure-marketplace/mongodb-shared
And now it works
Hi,
Thanks for letting us know. What is the difference between this second attempt and the initial one? Maybe a different chart version?
Having this issue on DOKS too :~(
Nevermind I'm stupid
Hello @darnfish, did you solve this issue?
Yeah I was able to solve it
01:01:32.00 INFO ==> Setting node as primary
mongodb 01:01:32.03
mongodb 01:01:32.03 Welcome to the Bitnami mongodb-sharded container
mongodb 01:01:32.03 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-mongodb-sharded
mongodb 01:01:32.04 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-mongodb-sharded/issues
mongodb 01:01:32.04
mongodb 01:01:32.04 INFO ==> ** Starting MongoDB Sharded setup **
mongodb 01:01:32.07 INFO ==> Validating settings in MONGODB_* env vars...
mongodb 01:01:32.09 INFO ==> Initializing MongoDB Sharded...
mongodb 01:01:32.12 INFO ==> Writing keyfile for replica set authentication...
mongodb 01:01:32.14 INFO ==> Enabling authentication...
mongodb 01:01:32.15 INFO ==> Deploying MongoDB Sharded with persisted data...
mongodb 01:01:32.17 INFO ==> Trying to connect to MongoDB server mongodb-sharded...
mongodb 01:01:32.18 INFO ==> Found MongoDB server listening at mongodb-sharded:27017 !
mongodb 01:01:32.36 INFO ==> MongoDB server listening and working at mongodb-sharded:27017 !
mongodb 01:01:33.86 INFO ==> Joining the shard cluster
mongodb 02:21:23.97 ERROR ==> Unable to join the sharded cluster
mongodb 02:21:24.01 INFO ==> Stopping MongoDB...
NAME READY STATUS RESTARTS AGE
mongodb-sharded-configsvr-0 1/1 Running 0 16d
mongodb-sharded-configsvr-1 1/1 Running 0 16d
mongodb-sharded-mongos-75fbcfb584-gvkzf 1/1 Running 3 16d
mongodb-sharded-mongos-75fbcfb584-x7xqq 1/1 Running 3 16d
mongodb-sharded-shard0-data-0 1/1 Running 306 16d
mongodb-sharded-shard1-data-0 1/1 Running 307 16d
Hi! Could you launch it with BITNAMI_DEBUG=true to see more information about the issue?
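For example, a sketch of passing that variable through the chart (the common.extraEnvVars parameter is an assumption; check the values.yaml of your chart version):
# Hypothetical flags: inject BITNAMI_DEBUG into the containers via extra env vars
helm install my-release bitnami/mongodb-sharded \
  --set 'common.extraEnvVars[0].name=BITNAMI_DEBUG' \
  --set-string 'common.extraEnvVars[0].value=true'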
The issue for me was that the 'shardN' instances were only listening on localhost, therefore the configsvr was unable to connect to them. I think there are cases where the '/db' directory has already been created but the pod somehow gets destroyed, and the rewrite-config step is then skipped, which makes joining the sharded cluster fail.
See https://github.com/bitnami/bitnami-docker-mongodb-sharded/blob/master/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh (currently line 67)
It seems "mongodb_set_listen_all_conf" should always happen before joining the sharded cluster?
Looking at the function, at the end it always sets the listen address to all. Shouldn't that be enough? Not sure if I'm missing something.
The 'listen to all addresses' setting should already be in place when the 'join shard in cluster' step runs. With the current flow it is possible that 'join shard in cluster' happens while the server is only listening on localhost, causing the timeouts.
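A quick way to check this on a running pod, as a sketch (the pod name is from the earlier output; the config path is the one normally used by the Bitnami image):
# If the net section still shows bindIp: 127.0.0.1 after a restart, the node is stuck on localhost
kubectl exec -it mongo1-mongodb-sharded-shard0-data-0 -- grep -A3 'net:' /opt/bitnami/mongodb/conf/mongodb.conf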
This is the fix that worked for me during a test, although proving that something doesn't happen can be difficult.
diff --git a/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh b/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh
index b04ace6..ae9bb00 100644
--- a/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh
+++ b/4.4/debian-10/rootfs/opt/bitnami/scripts/libmongodb-sharded.sh
@@ -78,6 +78,7 @@ mongodb_sharded_mongod_initialize() {
mongodb_set_replicasetmode_conf
fi
+ mongodb_set_listen_all_conf
if [[ "$MONGODB_SHARDING_MODE" = "shardsvr" ]] && [[ "$MONGODB_REPLICA_SET_MODE" = "primary" ]]; then
mongodb_wait_for_node "$MONGODB_MONGOS_HOST" "$MONGODB_MONGOS_PORT_NUMBER" "root" "$MONGODB_ROOT_PASSWORD"
if ! mongodb_sharded_shard_currently_in_cluster "$MONGODB_REPLICA_SET_NAME"; then
@@ -86,7 +87,6 @@ mongodb_sharded_mongod_initialize() {
info "Shard already in cluster"
fi
fi
- mongodb_set_listen_all_conf
}
@basneder could you share the configuration to reproduce the issue? Version of the chart and the values that differ from the default ones. Thanks!
Using chart version 3.4.7.
The changed settings from the default are minimal, just a dev/test setup with 2 shards.
configsvr:
  replicas: 1
mongodbRootPassword: XXX
service:
  nodePort: 30000
  type: NodePort
shards: 2
shardsvr:
  dataNode:
    replicas: 1
    resources:
      limits:
        memory: 512Mi
  persistence:
    size: 25Gi
From examining the logs, the following happens:
1. The shards are started for the first time (with the persistent database setup).
2. The config file is updated to 'bindAll' @ line 67.
3. This takes too long for K8s and a liveness check fails; the pod is reaped (and will be restarted).
4. The pod is restarted.
5. Now the initial steps are skipped (including updating the config with the 'bindAll' directive), because the directory '/db' already exists @ line 53. The crucial skipped line is @ 67. This means the second run only listens on localhost.
6. It now tries to join the shard cluster with only localhost enabled @ line 84, and this will never succeed, because joining the shard cluster requires external connectivity.
(A probe-tuning workaround is sketched right after this list.)
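Until a fixed image is available, one possible workaround (a sketch; the probe parameter paths are an assumption, please check the chart's values.yaml for your version) is to give the first initialization more headroom so the liveness probe does not reap the pod mid-setup:
# Hypothetical values: raise the shard data node liveness probe delay and threshold
helm upgrade my-release bitnami/mongodb-sharded \
  --set shardsvr.dataNode.livenessProbe.initialDelaySeconds=180 \
  --set shardsvr.dataNode.livenessProbe.failureThreshold=6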
Thanks for the detailed explanation, this explains why it is happening from time to time depending on the k8s cluster and/or the values set for the probes. Would you like to send a PR improving the current logic with the fix you implemented?
I have created: https://github.com/bitnami/bitnami-docker-mongodb-sharded/pull/24
This is my first pull request; please let me know if anything is lacking.
Thanks! The team will review it and provide feedback, usually within a business day
Hi, I still have this exact same issue, using chart=mongodb-sharded-4.0.21... Do you have any idea how we could solve this? Thanks for your help!
A chart upgrade seems to solve this issue.
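For anyone landing here later, the upgrade itself is short (the release name is just an example):
# Refresh the repo index and upgrade the existing release to the latest chart version
helm repo update
helm upgrade my-release bitnami/mongodb-sharded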
bitnami/mongodb-sharded:
Description
On creating a basic environment, I am unable to create a new DB. I am able to add collections to the system and config databases. On attempting to create a new DB I get the following error:
Unable to initialize targeter for write op for collection test.temp :: caused by :: Database test not found :: caused by :: No shards found
Steps to reproduce the issue:
1. Create the basic environment using the chart bitnami/mongodb-sharded.
2. Create a user in the admin database and log in via the mongo shell.
3. Attempt to create a DB and insert data; the error above is returned (a hypothetical session is sketched below).
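Roughly, a session like this reproduces it from outside the cluster (pod name, credentials, and the test database are placeholders):
# Write to a brand-new database through mongos; with no shards registered this fails
kubectl exec -it <mongos-pod-name> -- mongo admin -u root -p <root-password> \
  --eval 'db.getSiblingDB("test").temp.insertOne({ hello: "world" })'
# This fails with the "No shards found" error quoted above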
Describe the results you received: I receive the error shown above.
Version of Helm and Kubernetes:
helm version:
kubectl version: