bitnami / charts

Bitnami Helm Charts
https://bitnami.com

[bitnami/mongodb-sharded] Root password creation issue leading to failure to deploy/startup #17042

Closed sethjones closed 8 months ago

sethjones commented 1 year ago

Name and Version

bitnami/mongodb-sharded 6.5.3

What architecture are you using?

amd64

What steps will reproduce the bug?

  1. Install helm chart to new/blank namespace
  2. Set password values
  3. Deploy/Install

Are you using any custom parameters or values?

Set: auth.rootPassword, auth.replicaSetKey (I have also attempted without them)

I have attempted to work around the issues by extending all available timeouts, without success.

I have also attempted enforcing minimum resources of cpu: 8 and memory: 8Gi to ensure that the pods were not being resource-starved. No change.

What is the expected behavior?

Sharded deployment is created, and starts up using provided username and passwords.

What do you see instead?

Upon initial creation, configsvr enters this state:

12:58:21.45 INFO  ==> Setting node as primary
mongodb 12:58:21.47 
mongodb 12:58:21.47 Welcome to the Bitnami mongodb-sharded container
mongodb 12:58:21.48 Subscribe to project updates by watching https://github.com/bitnami/containers
mongodb 12:58:21.48 Submit issues and feature requests at https://github.com/bitnami/containers/issues
mongodb 12:58:21.48 
mongodb 12:58:21.48 INFO  ==> ** Starting MongoDB Sharded setup **
mongodb 12:58:21.51 INFO  ==> Validating settings in MONGODB_* env vars...
mongodb 12:58:21.55 INFO  ==> Initializing MongoDB Sharded...
mongodb 12:58:21.57 INFO  ==> Deploying MongoDB Sharded from scratch...
MongoNetworkError: connect ECONNREFUSED 192.168.146.9:27017
mongodb 12:58:34.28 INFO  ==> Creating users...
mongodb 12:58:34.28 INFO  ==> Creating root user...
Current Mongosh Log ID: 647f2d7a0c6174643106d52c
Connecting to:      mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+1.9.0
Using MongoDB:      6.0.6
Using Mongosh:      1.9.0
For mongosh info see: https://docs.mongodb.com/mongodb-shell/
------
   The server generated these startup warnings when booting
   2023-06-06T12:58:21.635+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
   2023-06-06T12:58:23.672+00:00: vm.max_map_count is too low
------
mongodb-sharded-configsvr [direct: secondary] test> Uncaught 
MongoServerError: not primary
mongodb 12:58:35.59 INFO  ==> Users created
mongodb 12:58:35.59 INFO  ==> Writing keyfile for replica set authentication...
mongodb 12:58:35.61 INFO  ==> Enabling authentication...
mongodb 12:58:35.62 INFO  ==> Configuring MongoDB Sharded replica set...
mongodb 12:58:35.63 INFO  ==> Stopping MongoDB...
mongodb 12:58:45.67 INFO  ==> Configuring MongoDB primary node...: mongodb-sharded-configsvr-0.mongodb-sharded-headless.mongodb-sharded.svc.cluster.local
MongoServerError: Authentication failed.
MongoServerError: Authentication failed.
...

Mongos:

mongodb 13:00:10.01 
mongodb 13:00:10.01 Welcome to the Bitnami mongodb-sharded container
mongodb 13:00:10.02 Subscribe to project updates by watching https://github.com/bitnami/containers
mongodb 13:00:10.02 Submit issues and feature requests at https://github.com/bitnami/containers/issues
mongodb 13:00:10.02 
mongodb 13:00:10.02 INFO  ==> ** Starting MongoDB Sharded setup **
mongodb 13:00:10.05 INFO  ==> Validating settings in MONGODB_* env vars...
mongodb 13:00:10.09 INFO  ==> Initializing Mongos...
mongodb 13:00:10.10 INFO  ==> Writing keyfile for replica set authentication...
mongodb 13:00:10.12 INFO  ==> Trying to connect to MongoDB server mongodb-sharded-configsvr-0.mongodb-sharded-headless.mongodb-sharded.svc.cluster.local...
mongodb 13:00:10.13 INFO  ==> Found MongoDB server listening at mongodb-sharded-configsvr-0.mongodb-sharded-headless.mongodb-sharded.svc.cluster.local:27017 !
MongoServerError: Authentication failed.
MongoServerError: Authentication failed.
...

Shards 0&1

 13:00:22.72 INFO  ==> Setting node as primary
mongodb 13:00:22.74 
mongodb 13:00:22.74 Welcome to the Bitnami mongodb-sharded container
mongodb 13:00:22.74 Subscribe to project updates by watching https://github.com/bitnami/containers
mongodb 13:00:22.74 Submit issues and feature requests at https://github.com/bitnami/containers/issues
mongodb 13:00:22.75 
mongodb 13:00:22.75 INFO  ==> ** Starting MongoDB Sharded setup **
mongodb 13:00:22.77 INFO  ==> Validating settings in MONGODB_* env vars...
mongodb 13:00:22.82 INFO  ==> Initializing MongoDB Sharded...
mongodb 13:00:22.85 INFO  ==> Writing keyfile for replica set authentication...
mongodb 13:00:22.86 INFO  ==> Enabling authentication...
mongodb 13:00:22.87 INFO  ==> Deploying MongoDB Sharded with persisted data...
mongodb 13:00:22.89 INFO  ==> Trying to connect to MongoDB server mongodb-sharded...
timeout reached before the port went into state "inuse"
timeout reached before the port went into state "inuse"
...

If I take the additional step of restarting the configsvr after initial startup, it enters a running state and does not get stuck in the "Authentication failed" loop.

After startup, the following errors are present from the mongos/shard pods:

{"t":{"$date":"2023-06-06T13:06:41.804+00:00"},"s":"I",  "c":"ACCESS",   "id":20251,   "ctx":"conn30","msg":"Supported SASL mechanisms requested for unknown user","attr":{"user":{"user":"root","db":"admin"}}}
{"t":{"$date":"2023-06-06T13:06:41.804+00:00"},"s":"I",  "c":"ACCESS",   "id":20249,   "ctx":"conn30","msg":"Authentication failed","attr":{"mechanism":"SCRAM-SHA-256","speculative":true,"principalName":"root","authenticationDatabase":"admin","remote":"192.168.146.48:54550","extraInfo":{},"error":"UserNotFound: Could not find user \"root\" for db \"admin\""}}
{"t":{"$date":"2023-06-06T13:06:41.807+00:00"},"s":"I",  "c":"ACCESS",   "id":20249,   "ctx":"conn30","msg":"Authentication failed","attr":{"mechanism":"SCRAM-SHA-1","speculative":false,"principalName":"root","authenticationDatabase":"admin","remote":"192.168.146.48:54550","extraInfo":{},"error":"UserNotFound: Could not find user \"root\" for db \"admin\""}}

Additionally, if I attempt to use the mongosh client on the configsvr I am unable to use the root username and defined password.

It appears that the creation/modification of the root user and the setting of its password are not happening.

Additional information

Kubernetes Cluster Information:

Deploying this chart on my desktop via Kind led to a working deployment.

sethjones commented 1 year ago

Likely related: #13364

sethjones commented 1 year ago

I did a bit more troubleshooting tonight.

I attempted to install the chart in another of my clusters, which differs in configuration (but both use rook/ceph).

The same failure occurred.

However, I ran a test in the initial cluster, with persistence off. Everything started up as it should. All pods came up, joined the cluster, and were operating successfully without intervention.

sethjones commented 1 year ago

More troubleshooting: I created a new storage class to allow for an XFS file system, based on the recommendation in the startup warning.

Results are the same.

github-actions[bot] commented 1 year ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

sethjones commented 1 year ago

keep open

FraPazGal commented 1 year ago

Hi @sethjones, apologies it took this long to reply; it seems the issue didn't get flagged on our board. I've had some problems reproducing the error, but there was an internal task related to https://github.com/bitnami/charts/issues/13364 (which had a fix proposed at https://github.com/bitnami/containers/pull/24938) that stalled because users in both the PR and the issue went MIA. I'll reopen the task and prioritise it so this can get a proper solution.

I'll put this ticket on-hold so you can get notified about any progress on our side.

sethjones commented 1 year ago

Was this resolved?

I am able to recreate the issue in release v6.6.6.

FraPazGal commented 1 year ago

Hi @sethjones, the team hasn't been able to work on this yet. I have increased the priority of our internal task and moved it out of the backlog so it can be selected for development.

If you're interested in contributing a solution to expedite this, we welcome you to create a pull request. The Bitnami team would be happy to review your submission and offer feedback. You can find the contributing guidelines here.

github-actions[bot] commented 1 year ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] commented 1 year ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

MrWormsy commented 11 months ago

Hi, I still have the issue. Why is this happening?

To bypass the "MongoServerError: Authentication failed." error, I tried setting the auth.enable value to false, but this only produces a brand new error. Why is a passwordless sharded cluster not supported if the corresponding flag exists?

07:20:25.40 INFO  ==> Setting node as primary
mongodb 07:20:25.42
mongodb 07:20:25.42 Welcome to the Bitnami mongodb-sharded container
mongodb 07:20:25.42 Subscribe to project updates by watching https://github.com/bitnami/containers
mongodb 07:20:25.42 Submit issues and feature requests at https://github.com/bitnami/containers/issues
mongodb 07:20:25.42
mongodb 07:20:25.42 INFO  ==> ** Starting MongoDB Sharded setup **
mongodb 07:20:25.43 INFO  ==> Validating settings in MONGODB_* env vars...
mongodb 07:20:25.43 ERROR ==> The MONGODB_ROOT_PASSWORD environment variable is empty or not set. Set the environment variable ALLOW_EMPTY_PASSWORD=yes to allow the container to be started with blank passwords. This is only recommended for development.

FraPazGal commented 11 months ago

Hello @MrWormsy, we are aware the issue is still present. It seems our automations closed this issue by mistake, but the associated internal task is still on our backlog. Thanks for providing more info; we'll leave this issue open and post any progress on our side.

juan131 commented 10 months ago

Hi @sethjones @MrWormsy

I was unable to reproduce the issue on a GKE cluster. Based on the linked issues, it seems this error can only be reproduced when persistence is enabled and the PV StorageClass uses a "slow" filesystem, which isn't the case in my environment.

As @rafariossaa mentioned in one of the linked issues, there's an environment variable (MONGODB_MAX_TIMEOUT) that can be customized by setting the common.mongodbMaxWaitTimeout parameter (120 seconds by default). This setting can be combined with configsvr.readinessProbe.initialDelaySeconds and shardsvr.dataNode.readinessProbe.initialDelaySeconds to give the initialization logic more time to start Mongo in the background, create users, configure the replica set, etc. You could even disable these probes during the first installation to double-check whether the issue is related to a slow filesystem.
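
A rough sketch of how those three parameters might be passed with --set follows. The helper name helm_command, the release name, the namespace, and the values 600 and 300 are placeholders for illustration only, not recommendations. The command is printed rather than executed so it can be reviewed first.

```shell
# Sketch only: helm_command is a hypothetical helper, and the release name,
# namespace and numeric values passed to it below are placeholders.
helm_command() {
  release="$1"; namespace="$2"; max_wait="$3"; probe_delay="$4"
  # Print the full command on one line instead of running it.
  echo "helm upgrade --install $release bitnami/mongodb-sharded" \
    "--namespace $namespace" \
    "--set common.mongodbMaxWaitTimeout=$max_wait" \
    "--set configsvr.readinessProbe.initialDelaySeconds=$probe_delay" \
    "--set shardsvr.dataNode.readinessProbe.initialDelaySeconds=$probe_delay"
}

helm_command mongodb-sharded mongodb-sharded 600 300
```

Once the printed command looks right, it can be pasted into a shell that has helm and the bitnami repository configured.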

MrWormsy commented 10 months ago

Hi

I've just tried setting the parameters to a value of 3600s but I get the same error. I think the error is indeed due to the fact that my server uses HDD instead of SSD. I have no problem running the chart on my personal computer.

Thank you for your help.

esasidharan commented 8 months ago

Hi all, I think I have found the root cause of this "Authentication failed" issue. It happens mostly on slower systems.

Reason or Bug: In the Bitnami script file libmongodb.sh, in mongodb_is_primary_node_up(), there are the following lines of code that check if the MongoDB instance has turned from secondary to primary.

result=$(
    mongodb_execute_print_output "$user" "$password" "admin" "$host" "$port" <<EOF
db.isMaster().ismaster
EOF
)
grep -q "true" <<<"$result"

The problem is in the line grep -q "true" <<<"$result". As part of the $result output, MongoDB gives the following connection string which also has the string "true".

"Connecting to:      mongodb://127.0.0.1:27017/admin?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.1.1"

Hence, even when the node is still secondary, i.e. the output of db.isMaster().ismaster is false, the grep -q "true" <<<"$result" check passes and the script proceeds to create the root user. Creating the root user while the node is in the secondary state fails. On faster systems the node becomes primary quickly, so this bug in the code doesn't matter.

Changing the grep check to a more specific one, like the following, makes the script wait until the MongoDB instance has actually become primary. The root user is then created successfully and authentication passes.

grep -q "\[direct: primary\] admin> true" <<<"$result"
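
The difference between the two checks can be reproduced without a running MongoDB. The sketch below feeds simulated mongosh output to both. The prompt lines are assumptions modelled on the output quoted in this thread, the function names are made up for the demo, and the strict check uses grep -F (fixed-string matching) as a variant of the pattern above.

```shell
# Simulated mongosh output. The banner is the connection string quoted above;
# the prompt lines are assumptions modelled on the logs in this thread.
banner='Connecting to:      mongodb://127.0.0.1:27017/admin?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.1.1'
secondary_output="$banner
mongodb-sharded-configsvr [direct: secondary] admin> false"
primary_output="$banner
mongodb-sharded-configsvr [direct: primary] admin> true"

# Original check: matches "true" anywhere, including "directConnection=true".
loose_check() { printf '%s\n' "$1" | grep -q "true"; }

# Stricter check: only matches the primary prompt followed by "true".
# -F treats the pattern as a literal string, so the brackets need no escaping.
strict_check() { printf '%s\n' "$1" | grep -qF "[direct: primary] admin> true"; }

# Report how each check classifies a given output.
report() {
  label="$1"; output="$2"
  loose=no; strict=no
  if loose_check "$output"; then loose=yes; fi
  if strict_check "$output"; then strict=yes; fi
  echo "$label: loose=$loose strict=$strict"
}

report secondary "$secondary_output"   # prints "secondary: loose=yes strict=no"
report primary "$primary_output"       # prints "primary: loose=yes strict=yes"
```

The secondary case is the false positive: the loose check reports the node as primary only because of the connection string.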

Bitnami coordinators, please help validate this fix and commit the update. If there are other issue threads similar to this one, please link them to this information.

carrodher commented 8 months ago

Thank you for bringing this issue to our attention. We appreciate your involvement! If you're interested in contributing a solution, we welcome you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Feel free to reach out if you have any questions or need assistance.

esasidharan commented 8 months ago

Hi @carrodher, Thanks for your appreciation.

The bug is not in https://github.com/bitnami/charts, but in https://github.com/bitnami/containers. I have created a pull request (PR) in bitnami/containers. PR : https://github.com/bitnami/containers/pull/55910

Please help to take this further.

sethjones commented 8 months ago

As of today, I have test-deployed chart v7.6.0 with image 7.0.5-debian-12-r2, and the issue has been resolved. Thanks @esasidharan