EmergingTechnologyInstitute / acmeair-netflixoss-dockerlocal

acmeair-netflixoss-dockerlocal
Apache License 2.0
33 stars 11 forks source link

microscaler controller Dockerfile sometimes times out #1

Open aspyker opened 10 years ago

aspyker commented 10 years ago

I have seen a failure of:

Step 2 : RUN sudo -u mongodb /usr/bin/mongod -f /etc/mongodb.conf & sleep 5 && /usr/bin/mongo admin --eval "db.addUser('admin','pbJ5mjkRqKLtHCXdhC4656kzX38IPLwa');" && /usr/bin/mongo authdb --eval "db.addUser('auser','qRAHxR5myqRQIaxDAofOCax8PGaibWm0');" && /usr/bin/mongo imdb --eval "db.addUser('imuser','DlodJHjMoiOHY4PZlD4wtPPr2DgLBu9j');" && /usr/bin/mongo asgdb --eval "db.addUser('asguser','MuTJ5POijXmzn52ONw9R0kAvp9NV6Qed');" && sleep 5 && pkill mongod && sleep 5 ---> Running in 4dc7e7fa5407 all output going to: /var/log/mongodb/mongodb.log MongoDB shell version: 2.4.9 connecting to: admin Thu Jun 5 00:07:45.443 Error: couldn't connect to server 127.0.0.1:27017 at src/mongo/shell/mongo.js:145 exception: connect failed 2014/06/05 00:07:46 The command [/bin/sh -c sudo -u mongodb /usr/bin/mongod -f /etc/mongodb.conf & sleep 5 && /usr/bin/mongo admin --eval "db.addUser('admin','pbJ5mjkRqKLtHCXdhC4656kzX38IPLwa');" && /usr/bin/mongo authdb --eval "db.addUser('auser','qRAHxR5myqRQIaxDAofOCax8PGaibWm0');" && /usr/bin/mongo imdb --eval "db.addUser('imuser','DlodJHjMoiOHY4PZlD4wtPPr2DgLBu9j');" && /usr/bin/mongo asgdb --eval "db.addUser('asguser','MuTJ5POijXmzn52ONw9R0kAvp9NV6Qed');" && sleep 5 && pkill mongod && sleep 5] returned a non-zero code: 1

I did:

root@vagrant-ubuntu-trusty-64:~/acmeair-netflixoss-dockerlocal/microscaler# docker run -t -i 2dbeb65e88b7 /bin/bash root@75c5c19f2bf0:/# sudo -u mongodb /usr/bin/mongod -f /etc/mongodb.conf & sleep 5 && /usr/bin/mongo admin --eval "db.addUser('admin','vRJVb36giI794UEM1kwtVbn7hdZaFwyn');" && /usr/bin/mongo authdb --eval "db.addUser('auser','6BY2ghEzwq1F2avmuTSgPSqjiBX8sUOb');" && /usr/bin/mongo imdb --eval "db.addUser('imuser','TYd1ZM2WPRHMc2DIr4VpGW1DunI2h04C');" && /usr/bin/mongo asgdb --eval "db.addUser('asguser','bwDkgwe9pkN3CzVoeZui7HhiXI41rGsj');" && sleep 5 && pkill mongod && sleep 5 [1] 8 all output going to: /var/log/mongodb/mongodb.log MongoDB shell version: 2.4.9 connecting to: admin Thu Jun 5 00:52:20.485 Error: couldn't connect to server 127.0.0.1:27017 at src/mongo/shell/mongo.js:145 exception: connect failed root@75c5c19f2bf0:/# sudo -u mongodb /usr/bin/mongod -f /etc/mongodb.conf & sleep 10 && /usr/bin/mongo admin --eval "db.addUser('admin','vRJVb36giI794UEM1kwtVbn7hdZaFwyn');" && /usr/bin/mongo authdb --eval "db.addUser('auser','6BY2ghEzwq1F2avmuTSgPSqjiBX8sUOb');" && /usr/bin/mongo imdb --eval "db.addUser('imuser','TYd1ZM2WPRHMc2DIr4VpGW1DunI2h04C');" && /usr/bin/mongo asgdb --eval "db.addUser('asguser','bwDkgwe9pkN3CzVoeZui7HhiXI41rGsj');" && sleep 5 && pkill mongod && sleep 5 [2] 25 all output going to: /var/log/mongodb/mongodb.log [2]+ Exit 100 sudo -u mongodb /usr/bin/mongod -f /etc/mongodb.conf MongoDB shell version: 2.4.9 connecting to: admin { "user" : "admin", "readOnly" : false, "pwd" : "1458993b8086bc3f88f11124e6fa7214", "_id" : ObjectId("538fbf800020c6f936e38d0c") } MongoDB shell version: 2.4.9 connecting to: authdb { "user" : "auser", "readOnly" : false, "pwd" : "6e3a64a26b885e47ec8d9ffd8dabf617", "_id" : ObjectId("538fbf804bda905886eafe2c") } MongoDB shell version: 2.4.9 connecting to: imdb { "user" : "imuser", "readOnly" : false, "pwd" : "cdfdd67dc872fec4beea8fe1688d05c0", "_id" : ObjectId("538fbf802120449d0d053435") } MongoDB shell version: 2.4.9 connecting to: asgdb { "user" : "asguser", "readOnly" : false, "pwd" : "6c5680cf7f4e91cb67adb98dbf9f0e12", "_id" : ObjectId("538fbf80f5b07d036fe320aa") }

Increasing the timeout fixed the issue, but I worry on slower laptops that having a 10 second timeout might still not be enough. We need to find a way to test and retry the mongo database.

My guess is this could also affect the Dockerfile for Asgard.

pdettori commented 10 years ago

I saw the same issue in my laptop at least once.

aspyker commented 10 years ago

increasing sleep to 10 seconds in Dockerfile.template doesn't help always for me. going to 60 seconds seems to always work. FWIW, here is the log from mongodb.log on a failed 5 second timeout:

Thu Jun 5 02:08:08.672 [initandlisten] MongoDB starting : pid=8 port=27017 dbpath=/var/lib/mongodb 64-bit host=95699cec7d3e Thu Jun 5 02:08:08.672 [initandlisten] db version v2.4.9 Thu Jun 5 02:08:08.672 [initandlisten] git version: nogitversion Thu Jun 5 02:08:08.672 [initandlisten] build info: Linux orlo 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 17:37:58 UTC 2013 x86_64 BOOST_LIB_VERSION=1_54 Thu Jun 5 02:08:08.672 [initandlisten] allocator: tcmalloc Thu Jun 5 02:08:08.672 [initandlisten] options: { bind_ip: "127.0.0.1", config: "/etc/mongodb.conf", dbpath: "/var/lib/mongodb", journal: "true", logappend: "true", logpath: "/var/log/mongodb/mongodb.log" } Thu Jun 5 02:08:08.684 [initandlisten] journal dir=/var/lib/mongodb/journal Thu Jun 5 02:08:08.685 [initandlisten] recover : no journal files present, no recovery needed Thu Jun 5 02:08:08.900 [initandlisten] preallocateIsFaster=true 3.08 Thu Jun 5 02:08:09.107 [initandlisten] preallocateIsFaster=true 2.92 Thu Jun 5 02:08:10.338 [initandlisten] preallocateIsFaster=true 3.02 Thu Jun 5 02:08:10.338 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.0 Thu Jun 5 02:08:13.034 [initandlisten] File Preallocator Progress: 775946240/1073741824 72% Thu Jun 5 02:08:18.625 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.1

Do you can see that it was taking time to allocate the journal file. From what I've seen, sometimes this is fast and sometimes slow (likely Docker file system overlay interactions?)

aspyker commented 10 years ago

and here is the log from one that failed at 10 seconds:

Thu Jun 5 02:11:54.179 [initandlisten] MongoDB starting : pid=9 port=27017 dbpath=/var/lib/mongodb 64-bit host=95699cec7d3e Thu Jun 5 02:11:54.179 [initandlisten] db version v2.4.9 Thu Jun 5 02:11:54.179 [initandlisten] git version: nogitversion Thu Jun 5 02:11:54.179 [initandlisten] build info: Linux orlo 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 17:37:58 UTC 2013 x86_64 BOOST_LIB_VERSION=1_54 Thu Jun 5 02:11:54.180 [initandlisten] allocator: tcmalloc Thu Jun 5 02:11:54.180 [initandlisten] options: { bind_ip: "127.0.0.1", config: "/etc/mongodb.conf", dbpath: "/var/lib/mongodb", journal: "true", logappend: "true", logpath: "/var/log/mongodb/mongodb.log" } Thu Jun 5 02:11:54.190 [initandlisten] journal dir=/var/lib/mongodb/journal Thu Jun 5 02:11:54.191 [initandlisten] recover : no journal files present, no recovery needed Thu Jun 5 02:11:54.407 [initandlisten] preallocateIsFaster=true 3.08 Thu Jun 5 02:11:54.590 [initandlisten] preallocateIsFaster=true 2.7 Thu Jun 5 02:11:55.794 [initandlisten] preallocateIsFaster=true 2.82 Thu Jun 5 02:11:55.794 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.0 Thu Jun 5 02:11:58.018 [initandlisten] File Preallocator Progress: 713031680/1073741824 66% Thu Jun 5 02:12:03.341 [initandlisten] preallocating a journal file /var/lib/mongodb/journal/prealloc.1 Thu Jun 5 02:12:06.088 [initandlisten] File Preallocator Progress: 681574400/1073741824 63%

aspyker commented 10 years ago

and finally a 15 second timeout that shows sometimes "zero" seconds are required. This one worked, but it would have worked with a 5 second timeout if you look at the timestamps.

Thu Jun 5 02:14:17.843 [initandlisten] MongoDB starting : pid=9 port=27017 dbpath=/var/lib/mongodb 64-bit host=95699cec7d3e Thu Jun 5 02:14:17.843 [initandlisten] db version v2.4.9 Thu Jun 5 02:14:17.843 [initandlisten] git version: nogitversion Thu Jun 5 02:14:17.843 [initandlisten] build info: Linux orlo 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 17:37:58 UTC 2013 x86_64 BOOST_LIB_VERSION=1_54 Thu Jun 5 02:14:17.843 [initandlisten] allocator: tcmalloc Thu Jun 5 02:14:17.843 [initandlisten] options: { bind_ip: "127.0.0.1", config: "/etc/mongodb.conf", dbpath: "/var/lib/mongodb", journal: "true", logappend: "true", logpath: "/var/log/mongodb/mongodb.log" } Thu Jun 5 02:14:17.850 [initandlisten] journal dir=/var/lib/mongodb/journal Thu Jun 5 02:14:17.851 [initandlisten] recover : no journal files present, no recovery needed Thu Jun 5 02:14:17.959 [FileAllocator] allocating new datafile /var/lib/mongodb/local.ns, filling with zeroes... Thu Jun 5 02:14:17.959 [FileAllocator] creating directory /var/lib/mongodb/_tmp Thu Jun 5 02:14:17.962 [FileAllocator] done allocating datafile /var/lib/mongodb/local.ns, size: 16MB, took 0.002 secs Thu Jun 5 02:14:17.962 [FileAllocator] allocating new datafile /var/lib/mongodb/local.0, filling with zeroes... Thu Jun 5 02:14:17.963 [FileAllocator] done allocating datafile /var/lib/mongodb/local.0, size: 64MB, took 0 secs Thu Jun 5 02:14:17.964 [websvr] admin web console waiting for connections on port 28017 Thu Jun 5 02:14:17.966 [initandlisten] waiting for connections on port 27017 Thu Jun 5 02:14:32.790 [initandlisten] connection accepted from 127.0.0.1:38680 #1 (1 connection now open)

aspyker commented 10 years ago

ok, so it seems like sometimes mongo decides to pre-allocate and sometimes not and this is why it sometimes fails:

http://grokbase.com/t/gg/mongodb-user/143k60ercx/mongodb-preallocates-journal-files-issue

Not sure of an answer yet, but at least I can spot why it works quickly sometimes and fails slowly other times.

aspyker commented 10 years ago

seems like we might be able to use this option:

https://jira.mongodb.org/browse/SERVER-2733

tkhinaba commented 10 years ago

I added smallfiles = true in mongodb.conf to avoid large size of files are created at the initial data loading. See smallfiles.