sclorg / mongodb-container

MongoDB container images based on Red Hat Software Collections and intended for OpenShift and general usage. Users can choose between Red Hat Enterprise Linux, Fedora, and CentOS based images.
https://softwarecollections.org
Apache License 2.0

After a redeploy, MongoDB replica set is lost, pods become independent mongo instances #70

rhcarvalho closed this issue 9 years ago

rhcarvalho commented 9 years ago

The replication example does not survive a redeploy. The current approach, built around a run-once pod, likely has no future if we want to support redeploys.

Steps to reproduce:

  1. On a new project, create cluster from the template:

    $ oc new-app 2.4/examples/replica/mongodb-clustered.json
    services/mongodb
    pods/mongodb-service
    deploymentconfigs/mongodb
    Service "mongodb" created at None with port mappings 27017.
    Run 'oc status' to view your app.
  2. Wait until replica set is deployed and stand-alone pod shuts down:

    $ oc logs mongodb-service -f
    ...
    => Waiting for MongoDB service shutdown ...
    => MongoDB service has stopped
    => Successfully initialized replSet
  3. List pods and connect to one of them as the 'admin' user:

    $ oc get pods
    NAME              READY     STATUS       RESTARTS   AGE
    mongodb-1-0u9qv   1/1       Running      0          2m
    mongodb-1-g0lma   1/1       Running      0          2m
    mongodb-1-rm4bj   1/1       Running      0          2m
    mongodb-service   0/1       ExitCode:0   0          2m
    $ oc exec -it mongodb-1-0u9qv -- bash -c 'mongo $MONGODB_DATABASE -u admin -p $MONGODB_ADMIN_PASSWORD --authenticationDatabase=admin'
    MongoDB shell version: 2.4.9
    connecting to: userdb
    Welcome to the MongoDB shell.
    For interactive help, type "help".
    For more comprehensive documentation, see
       http://docs.mongodb.org/
    Questions? Try the support group
       http://groups.google.com/group/mongodb-user
    rs0:SECONDARY> 
    bye

    Ok, we have a replica set. Now, let's continue...

  4. Redeploy:

    $ oc deploy --latest mongodb                                                            
    Started deployment #2
  5. Again, list pods and try to connect to one of them as the 'admin' user:

    $ oc get pods
    NAME               READY     STATUS       RESTARTS   AGE
    mongodb-2-deploy   1/1       Running      0          12s
    mongodb-2-e9c66    0/1       Running      0          7s
    mongodb-2-govx7    1/1       Running      0          7s
    mongodb-2-imdo4    1/1       Running      0          7s
    mongodb-service    0/1       ExitCode:0   0          4m
    $ oc exec -it mongodb-2-e9c66 -- bash -c 'mongo $MONGODB_DATABASE -u admin -p $MONGODB_ADMIN_PASSWORD --authenticationDatabase=admin'
    MongoDB shell version: 2.4.9
    connecting to: userdb
    Thu Aug 13 13:59:11.983 Error: 18 { code: 18, ok: 0.0, errmsg: "auth fails" } at src/mongo/shell/db.js:228
    exception: login failed

    It failed because there is no data persistence: with the redeploy, all of the data and configuration were gone.

  6. Connect without authentication:

    $ oc exec -it mongodb-2-e9c66 -- bash -c 'mongo'
    MongoDB shell version: 2.4.9
    connecting to: test
    Welcome to the MongoDB shell.
    For interactive help, type "help".
    For more comprehensive documentation, see
       http://docs.mongodb.org/
    Questions? Try the support group
       http://groups.google.com/group/mongodb-user
    > 
    bye

    As we can see, we now have an independent MongoDB instance, running without authentication and without any of the configuration originally applied by the mongodb-service pod (see the rs.status() sketch below).
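
For completeness, a sketch of checking both states with the standard rs.status() shell helper; the exact output varies with the MongoDB version and with how mongod was started:

    # Before the redeploy, a member reports the set name and its three
    # members (admin credentials are needed because auth is enabled):
    $ oc exec -it mongodb-1-0u9qv -- bash -c \
        'mongo admin -u admin -p $MONGODB_ADMIN_PASSWORD --eval "printjson(rs.status())"'

    # After the redeploy, the same check fails or reports an uninitialized
    # set, because the pods are no longer joined into a replica set:
    $ oc exec -it mongodb-2-e9c66 -- bash -c 'mongo --eval "printjson(rs.status())"'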

bparees commented 9 years ago

Is the issue simply that the config data is not on a persistent volume? If it were, would this survive a restart? Or is there something else that needs to be initialized each time the mongo container comes up?
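
For reference, attaching a persistent volume to the deployment config could be tested with something like the sketch below. It assumes a pre-existing claim named mongodb-data and that the image keeps its data under /var/lib/mongodb/data; both are assumptions, not details from this issue:

    $ oc volume dc/mongodb --add --name=mongodb-data \
        --type=persistentVolumeClaim --claim-name=mongodb-data \
        --mount-path=/var/lib/mongodb/data

Even with the data directory persisted, the members would still have to re-form the set after a redeploy, which the next comment addresses.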

rhcarvalho commented 9 years ago

During a redeploy the replica set is essentially undone as pods get killed. Upon restart, they need to be reconnected -- https://github.com/openshift/mongodb/blob/master/2.4/contrib/common.sh#L130-L135
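
Paraphrasing those lines (not verbatim, and with the endpoint discovery elided), each member effectively has to re-register itself with the set through the currently known members:

    # Rough paraphrase of the re-registration step in common.sh:
    # connect through the known endpoints and add this host back to the set.
    mongo admin -u admin -p "$MONGODB_ADMIN_PASSWORD" \
        --host "$MONGODB_REPLICA_NAME/<current-endpoints>" \
        --eval "rs.add('$(hostname -f)');"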

I'll try to make it so that we use a post-deploy hook instead of a run-once pod and see how far that gets us. It should work even with ephemeral storage (the data will be lost, but not the connectivity of the redeployed cluster).
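
For context, a post-deploy hook in the DeploymentConfig would look roughly like the snippet below. The initiate-replset command is a hypothetical stand-in for whatever script ends up doing the rs.initiate()/rs.add() work, not an existing entry point:

    "strategy": {
        "type": "Recreate",
        "recreateParams": {
            "post": {
                "failurePolicy": "Retry",
                "execNewPod": {
                    "containerName": "mongodb",
                    "command": ["initiate-replset"]
                }
            }
        }
    }

The hook pod runs once per deployment, after the new pods are up, which matches the lifecycle the run-once mongodb-service pod was approximating.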

rhcarvalho commented 9 years ago

I think this would be a good scenario for an extended test, or at least one to add as a test case for our QE team.

wzheng1 commented 9 years ago

@rhcarvalho QE has added such a scenario, with "origin_devexp_625" in the case titles: https://tcms-openshift.rhcloud.com/case/4101/ and https://tcms-openshift.rhcloud.com/case/4102/