arunoda / meteor-up-legacy

Production Quality Meteor Deployments
MIT License

mupx - MAJOR bug #761

Open ffxsam opened 8 years ago

ffxsam commented 8 years ago

We've been hitting a very serious bug in production: at random, the Docker link to mongodb just goes away, leaving the app container stuck in an endless "restarting" loop. docker logs shows the following:

=> Starting meteor app on port:80

/bundle/bundle/programs/server/node_modules/fibers/future.js:278
                        throw(ex);
                              ^
Error: failed to connect to [mongodb:27017]
    at Object.Future.wait (/bundle/bundle/programs/server/node_modules/fibers/future.js:398:15)
    at new MongoConnection (packages/mongo/mongo_driver.js:213:1)
    at new MongoInternals.RemoteCollectionDriver (packages/mongo/remote_collection_driver.js:4:1)
    at Object.<anonymous> (packages/mongo/remote_collection_driver.js:38:1)
    at Object.defaultRemoteCollectionDriver (packages/underscore/underscore.js:750:1)
    at new Mongo.Collection (packages/mongo/collection.js:102:1)
    at AccountsServer.AccountsCommon (accounts_common.js:23:18)
    at new AccountsServer (accounts_server.js:16:5)
    at Package (globals_server.js:5:12)
    at /bundle/bundle/programs/server/packages/accounts-base.js:1814:4
    - - - - -
    at [object Object].<anonymous> (/bundle/bundle/programs/server/npm/npm-mongo/node_modules/mongodb/lib/mongodb/connection/server.js:556:74)
    at [object Object].emit (events.js:106:17)
    at [object Object].<anonymous> (/bundle/bundle/programs/server/npm/npm-mongo/node_modules/mongodb/lib/mongodb/connection/connection_pool.js:156:15)
    at [object Object].emit (events.js:98:17)
    at Socket.<anonymous> (/bundle/bundle/programs/server/npm/npm-mongo/node_modules/mongodb/lib/mongodb/connection/connection.js:534:10)
    at Socket.emit (events.js:95:17)
    at net.js:834:16
    at process._tickCallback (node.js:448:13)

It seems the Docker link to the mongodb container breaks for some reason. Why does this keep happening? The only workaround I've found is to remove the container completely and run mupx setup && mupx deploy to build from scratch.
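To tell a broken link apart from a MongoDB process that is simply down, a plain TCP check against the same host and port as in the error can help. This is only a minimal sketch: the helper name can_connect is hypothetical, and it assumes the check runs from inside the app container, where the link alias mongodb is expected to resolve.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and opens the socket in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage, mirroring the address from the stack trace:
#   can_connect("mongodb", 27017)
```

A False result only says the port was unreachable from where the check ran; it does not say why.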

MasterJames commented 8 years ago

I recently put up a PR for Mongo backup and restore. It seems that the Mongo container mounts a volume from /var/lib/mongodb, and in there is a mongod.lock file that triggers the MongoDB container's startup loop. In the PR I added tools to help remove the lock, although you're supposed to do a proper manual check/repair on the database if you do that. It's done that way ultimately to preserve the database, but it does cause confusion.

ffxsam commented 8 years ago

Figured it out! It has to do with running more than one application on a server, e.g.:

CONTAINER ID        IMAGE                      COMMAND                  CREATED             STATUS                          PORTS                        NAMES
af06637addfe        meteorhacks/meteord:base   "/bin/sh -c 'bash $ME"   9 minutes ago       Up 7 minutes                    0.0.0.0:3001->80/tcp         myapp
267a280d9a12        mongo                      "/entrypoint.sh mongo"   9 minutes ago       Up 7 minutes                    127.0.0.1:27017->27017/tcp   mongodb
89eeca62be1e        meteorhacks/meteord:base   "/bin/sh -c 'bash $ME"   20 hours ago        Restarting (8) 27 seconds ago   0.0.0.0:3000->80/tcp         myapp2

Everything was totally fine when it was just a single Meteor app and the mongodb container. Now that I've added a second app, one of them keeps restarting in a loop with the error above. The only workaround I've found is to run (in this example) mupx setup && mupx deploy for myapp2 to re-deploy it.
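For anyone who wants to spot this state automatically, here is a rough sketch that picks the restarting containers out of docker ps text like the listing above. The function name restarting_containers is hypothetical, and the parsing assumes columns are separated by two or more spaces and that the container name is the last column.

```python
import re
from typing import List

# Sample rows shaped like the docker ps output above (spacing condensed).
SAMPLE = """\
CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
af06637addfe  meteorhacks/meteord:base  "/bin/sh -c 'bash $ME"  9 minutes ago  Up 7 minutes  0.0.0.0:3001->80/tcp  myapp
267a280d9a12  mongo  "/entrypoint.sh mongo"  9 minutes ago  Up 7 minutes  127.0.0.1:27017->27017/tcp  mongodb
89eeca62be1e  meteorhacks/meteord:base  "/bin/sh -c 'bash $ME"  20 hours ago  Restarting (8) 27 seconds ago  0.0.0.0:3000->80/tcp  myapp2
"""


def restarting_containers(ps_output: str) -> List[str]:
    """Return the names of containers whose status starts with 'Restarting'."""
    names = []
    # Skip the header row, then split each row on runs of 2+ spaces.
    for line in ps_output.strip().splitlines()[1:]:
        cols = re.split(r"\s{2,}", line.strip())
        if any(col.startswith("Restarting") for col in cols):
            names.append(cols[-1])  # assumption: NAMES is the last column
    return names


print(restarting_containers(SAMPLE))
```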

@arunoda Any idea what's causing this? This seems like a serious issue.

seigenbrode commented 8 years ago

@ffxsam: Running into the same issue. Curious if you ever found a more permanent fix?

ffxsam commented 8 years ago

Nope, never got a fix for this. And I imagine Arunoda has his hands full with other stuff (Mantra, moving stuff from Atmosphere to NPM, etc). I suppose the best workaround would be to have a separate Mongo server, and run multiple Meteor apps on another server, setting MONGO_URL accordingly.
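As an illustration of that workaround, each app would get its own MONGO_URL pointing at the separate Mongo server. The sketch below only builds the URL strings; the host name db.example.internal, the one-database-per-app layout, and the helper name mongo_url are all assumptions for the example, not something confirmed in this thread.

```python
def mongo_url(host: str, db_name: str, port: int = 27017) -> str:
    """Build a basic mongodb:// connection string (no credentials or options)."""
    return f"mongodb://{host}:{port}/{db_name}"


# Hypothetical setup: two apps, each with its own database on one shared server.
apps = ["myapp", "myapp2"]
urls = {app: mongo_url("db.example.internal", app) for app in apps}

for app, url in urls.items():
    print(f"{app}: MONGO_URL={url}")
```

With this layout the apps no longer depend on a Docker link to a local mongodb container at all.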

sungwoncho commented 8 years ago

I had a similar error. For me, the mongodb container was not starting, and docker logs revealed that an old lock file was preventing mongo from starting. On my server, I had to remove /var/lib/mongodb/mongod.lock to get it working again.
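Before deleting anything, it may be worth checking what state the lock file is in. A minimal sketch follows, assuming the path mentioned above; the function name lock_state is hypothetical, and reading a non-empty file as a sign of an unclean shutdown is an assumption based on older MongoDB behavior, not something confirmed in this thread.

```python
import os


def lock_state(path: str) -> str:
    """Classify a lock file as 'absent', 'empty', or 'non-empty'."""
    if not os.path.exists(path):
        return "absent"
    # Assumption: a leftover non-empty lock hints that mongod did not exit cleanly.
    return "empty" if os.path.getsize(path) == 0 else "non-empty"


# Hypothetical usage on the host that owns the mounted volume:
#   lock_state("/var/lib/mongodb/mongod.lock")
```

This only inspects the file. Removing the lock still calls for the manual check/repair mentioned earlier in the thread.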

MasterJames commented 8 years ago

My MongoDB Backup and Restore PR (https://github.com/arunoda/meteor-up/pull/736), which was never merged (along with many others), addressed that issue. I think "mupx setup" had the same effect, but it wipes the database.