Closed IanButterworth closed 8 years ago
I should add that I added Google OAuth for authentication.
If you don't even see the index.html, it means that the engine (tornado) is not reachable from the webserver (nginx). The FQDN must be accessible from everywhere, including from within the docker containers.
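One way to test the reachability described above is to attempt a plain TCP connection to the engine from wherever nginx runs. This is a minimal sketch: `can_connect` is a hypothetical helper, and the host and port are placeholders to replace with the address your webserver actually forwards to.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder address (assumption); substitute the engine's host and port.
    print(can_connect("127.0.0.1", 8888))
```

Running this both on the host and from inside the webserver container should show whether the engine is unreachable everywhere or only from within docker.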
engineinteractive_err.log and the other two logs are giving me:
Traceback (most recent call last):
  File "/jboxengine/src/jbapi.py", line 17, in <module>
    JBoxCfg.read(conf_file, user_conf_file)
  File "/jboxengine/src/juliabox/jbox_util.py", line 186, in read
    with open(arg) as f:
IOError: [Errno 2] No such file or directory: '/jboxengine/conf/tornado.conf'
jbox_configure.sh generates tornado.conf.
Did you perhaps forget to run it?
I did run the script [edit: including as sudo] in the install sequence, and just tried again, and tornado.conf still hasn't been created.
I wondered if the comma at the end of the example Google auth settings might be rogue, so I tried without it, but the file still isn't created.
{
    "numdisksmax" : 30, # max disks (more than sessions to allow for transitions)
    "admin_users" : ['admin@gmail.com'], # administrator email id
    "websocket_protocol" : "ws",
    "interactive": {
        "numlocalmax": 20 # max concurrent users to support
    },
    "plugins": [
        "juliabox.plugins.compute_singlenode",
        "juliabox.plugins.vol_loopback",
        "juliabox.plugins.vol_defpkg",
        "juliabox.plugins.auth_google",
        "juliabox.plugins.db_sqlite3"
    ],
    "google_oauth": {
        "key": "replace with google oauth key",
        "secret": "replace with google oauth secret"
    },
}
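On the trailing-comma question: whether it matters depends on how the file is parsed, which I have not confirmed from jbox_util.py. The `#` comments and single-quoted strings suggest the file is not strict JSON. Assuming it is read as a Python literal (an assumption), a trailing comma is harmless, as this sketch with a trimmed-down, hypothetical config shows:

```python
import ast
import json

# Hypothetical, trimmed-down text with the same shape as the config above:
# '#' comments, a single-quoted string and a trailing comma after the last entry.
CONFIG_TEXT = """{
    "admin_users": ['admin@gmail.com'],  # administrator email id
    "google_oauth": {
        "key": "replace with google oauth key",
        "secret": "replace with google oauth secret"
    },
}"""


def is_valid_json(text):
    """Return True if text parses as strict JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


# Parsed as a Python literal, the trailing comma and comments are accepted.
cfg = ast.literal_eval(CONFIG_TEXT)
print(sorted(cfg))

# A strict JSON parser rejects a trailing comma even in the smallest case.
print(ast.literal_eval('{"a": 1,}'), is_valid_json('{"a": 1,}'))
```

If the file really is parsed this way, the comma is not what stops tornado.conf from being generated.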
Curiously, I tried copying the template tornado.conf file from ~/JuliaBox/engine/conf/tornado.conf to /jboxengine/conf/tornado.conf, and I still get the following error in engineinteractive_err.log:
Traceback (most recent call last):
  File "/jboxengine/src/jbox.py", line 13, in <module>
    JBoxCfg.read(conf_file, user_conf_file)
  File "/jboxengine/src/juliabox/jbox_util.py", line 186, in read
    with open(arg) as f:
IOError: [Errno 2] No such file or directory: '/jboxengine/conf/tornado.conf'
Probing inside jbox_configure.sh, the variable $ENGINE_CONF_DIR (which dictates the save location for tornado.conf) is returning ~/JuliaBox/engine/conf.
There seems to be a location mismatch, but I don't know why the above didn't solve it.
I just tried a fresh install, but modified jbox_configure.sh to save tornado.conf to /jboxengine/conf.
I now get a different error, with the same html result as the original post.
Traceback (most recent call last):
  File "/jboxengine/src/jbox.py", line 16, in <module>
    JBox().run()
  File "/jboxengine/src/juliabox/srvr_jbox.py", line 31, in __init__
    VolMgr.configure()
  File "/jboxengine/src/juliabox/vol/volmgr.py", line 18, in configure
    JBoxVol.configure()
  File "/jboxengine/src/juliabox/vol/jbox_volume.py", line 128, in configure
    plugin.configure()
  File "/jboxengine/src/juliabox/plugins/vol_loopback/loopback.py", line 26, in configure
    JBoxLoopbackVol.refresh_disk_use_status()
  File "/jboxengine/src/juliabox/plugins/vol_loopback/loopback.py", line 58, in refresh_disk_use_status
    container_id_list = [cdesc['Id'] for cdesc in SessContainer.session_containers(allcontainers=True)]
  File "/jboxengine/src/juliabox/jbox_container.py", line 86, in session_containers
    name = c["Names"][0] if (("Names" in c) and (c["Names"] is not None)) else c["Id"][0:12]
IndexError: list index out of range
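The last frame indexes c["Names"][0] after checking only that the key exists and is not None, so an empty Names list would raise exactly this IndexError. Whether Docker actually returned an empty list here is an assumption on my part. A minimal defensive sketch of that expression (`container_name` is a hypothetical helper, not the project's code or an official fix):

```python
def container_name(c):
    """Return the first name Docker reports for a container, else a short id.

    `c` is a container description dict with an "Id" key and an optional
    "Names" key, mirroring the fields used in the traceback above.
    """
    # `or []` covers a missing key, None, and an empty list alike.
    names = c.get("Names") or []
    return names[0] if names else c["Id"][0:12]


# An empty Names list falls back to the first 12 characters of the id.
print(container_name({"Names": [], "Id": "817d14ae30e00b0b1550ec69e1c6f25f"}))
```

This only avoids the crash; it would not explain why a container has no name in the first place.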
@ianshmean All configuration files are packaged into the docker containers and referenced from within them. Only jbox.user is (optionally) loaded from the host.
I think there was no need to modify $ENGINE_CONF_DIR. Instead, repackaging the docker images with img_create.sh jbox would have helped.
I'm definitely not particularly knowledgeable of how things fit together here @tanmaykm, apologies for the wild attempts.
When do you recommend using img_create.sh jbox? I currently do that during step 4 in the installation process.
Yes, that's the step. Run it again to have the webserver and engine configurations packaged (or repackaged) into the docker images.
@ianshmean assuming this is working now. We can continue discussions if not.
I can recreate this problem on AWS and locally.
Upon browsing to a newly installed server, the browser received the data reported by the OP. The webserver/logs/error.log log shows:
2015/12/24 14:40:49 [error] 5#0: *1 connect() failed (111: Connection refused), client: 128.16.114.4, server: , request: "GET / HTTP/1.1", host: "mooncalf.medphys.ucl.ac.uk"
2015/12/24 14:40:50 [error] 5#0: *1 connect() failed (111: Connection refused), client: 128.16.114.4, server: , request: "GET / HTTP/1.1", host: "mooncalf.medphys.ucl.ac.uk"
2015/12/24 14:40:51 [warn] 5#0: *1 [lua] router.lua:223: check_forward_addr(): replacing inaccessible forward address http://127.0.0.1:8888 with http://127.0.0.1:8888, client: 128.16.114.4, server: , request: "GET / HTTP/1.1", host: "mooncalf.medphys.ucl.ac.uk"
2015/12/24 14:40:51 [error] 5#0: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 128.16.114.4, server: , request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:8888/", host: "mooncalf.medphys.ucl.ac.uk"
The connection cannot be made because, as indicated by host/run/supervisord.log, the engineinteractive container is failing to start:
2015-12-24 14:40:35,099 CRIT Supervisor running as root (no user in config file)
2015-12-24 14:40:35,142 INFO RPC interface 'supervisor' initialized
2015-12-24 14:40:35,142 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2015-12-24 14:40:35,143 INFO daemonizing the supervisord process
2015-12-24 14:40:35,143 INFO set current directory: '/home/samuelpowell/JuliaBox/host'
2015-12-24 14:40:35,144 INFO supervisord started with pid 10029
2015-12-24 14:40:35,286 INFO spawned: 'webserver' with pid 10032
2015-12-24 14:40:35,289 INFO spawned: 'engineapi' with pid 10033
2015-12-24 14:40:35,292 INFO spawned: 'enginedaemon' with pid 10034
2015-12-24 14:40:35,295 INFO spawned: 'engineinteractive' with pid 10036
2015-12-24 14:40:35,440 INFO exited: engineinteractive (exit status 1; not expected)
2015-12-24 14:40:37,176 INFO success: webserver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2015-12-24 14:40:37,176 INFO success: engineapi entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2015-12-24 14:40:37,176 INFO success: enginedaemon entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2015-12-24 14:40:37,178 INFO spawned: 'engineinteractive' with pid 10093
2015-12-24 14:40:37,263 INFO exited: engineinteractive (exit status 1; not expected)
2015-12-24 14:40:39,268 INFO spawned: 'engineinteractive' with pid 10097
2015-12-24 14:40:39,355 INFO exited: engineinteractive (exit status 1; not expected)
2015-12-24 14:40:42,362 INFO spawned: 'engineinteractive' with pid 10101
2015-12-24 14:40:42,450 INFO exited: engineinteractive (exit status 1; not expected)
2015-12-24 14:40:43,452 INFO gave up: engineinteractive entered FATAL state, too many start retries too quickly
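For longer logs, the failing program can be picked out mechanically. The sketch below counts unexpected exits and FATAL states per program; the regular expressions are written against the line format shown in the log above, and `summarize` is a hypothetical helper name.

```python
import re

# Patterns based on the supervisord lines quoted above, e.g.
#   "... INFO exited: engineinteractive (exit status 1; not expected)"
#   "... INFO gave up: engineinteractive entered FATAL state, ..."
EXITED = re.compile(r"INFO exited: (\S+) \(exit status (\d+); not expected\)")
GAVE_UP = re.compile(r"INFO gave up: (\S+) entered FATAL state")


def summarize(log_text):
    """Return {program: {"unexpected_exits": int, "fatal": bool}}."""
    summary = {}
    for line in log_text.splitlines():
        exited = EXITED.search(line)
        gave_up = GAVE_UP.search(line)
        match = exited or gave_up
        if not match:
            continue
        entry = summary.setdefault(
            match.group(1), {"unexpected_exits": 0, "fatal": False}
        )
        if exited:
            entry["unexpected_exits"] += 1
        else:
            entry["fatal"] = True
    return summary


SAMPLE = """\
2015-12-24 14:40:35,286 INFO spawned: 'webserver' with pid 10032
2015-12-24 14:40:35,440 INFO exited: engineinteractive (exit status 1; not expected)
2015-12-24 14:40:37,263 INFO exited: engineinteractive (exit status 1; not expected)
2015-12-24 14:40:43,452 INFO gave up: engineinteractive entered FATAL state, too many start retries too quickly
"""
print(summarize(SAMPLE))
```

Programs that start cleanly, such as webserver in the sample, do not appear in the result.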
The root cause is indicated by /engine/logs/engineinteractive.log:
Error response from daemon: Could not find container for entity id 817d14ae30e00b0b1550ec69e1c6f25f28160260d7454532d6031fc8e7713245
Error response from daemon: Could not find container for entity id 817d14ae30e00b0b1550ec69e1c6f25f28160260d7454532d6031fc8e7713245
Error response from daemon: Could not find container for entity id 817d14ae30e00b0b1550ec69e1c6f25f28160260d7454532d6031fc8e7713245
Error response from daemon: Could not find container for entity id 817d14ae30e00b0b1550ec69e1c6f25f28160260d7454532d6031fc8e7713245
I do not know enough about Docker to understand what's going on here - but as noted previously, it can/will result from following our installation instructions.
Any ideas?
Thanks for attempting and detailing @samuelpowell
@ianshmean @tanmaykm, following some unrelated notes on various web-pages, I have managed to overcome the problem locally by running the following:
JuliaBox/scripts/run/stop.sh
sudo service docker stop
sudo mv /var/lib/docker/linkgraph.db linkgraph.old
sudo service docker start
JuliaBox/scripts/run/start.sh
I will check this on AWS shortly. I do not know why this is necessary, nor how (un-)safe it is.
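For anyone repeating the steps above on several machines, they could be wrapped in a small script. This is only a sketch: the commands are copied verbatim from the list above, `run_steps` is a hypothetical helper, and it defaults to a dry run because, as noted, the safety of moving linkgraph.db is unknown.

```python
import subprocess

# Commands copied from the steps above, in the same order.
STEPS = [
    ["JuliaBox/scripts/run/stop.sh"],
    ["sudo", "service", "docker", "stop"],
    ["sudo", "mv", "/var/lib/docker/linkgraph.db", "linkgraph.old"],
    ["sudo", "service", "docker", "start"],
    ["JuliaBox/scripts/run/start.sh"],
]


def run_steps(steps, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    Execution stops at the first failing command because check=True
    raises subprocess.CalledProcessError. Returns the commands handled.
    """
    handled = []
    for cmd in steps:
        print("$ " + " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
        handled.append(cmd)
    return handled


if __name__ == "__main__":
    run_steps(STEPS)  # dry run: prints the commands without executing them
```

Pass dry_run=False only after checking the printed commands; the relative paths assume the script is run from the directory containing JuliaBox.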
Thanks @samuelpowell. Yes, that did look like a corruption of docker images.
I did face that once and fixed it by doing a clean build of all images. That is, I deleted /var/lib/docker after shutting down docker and rebuilt the containers from scratch.
But maybe the linkgraph.db was all that needed replacing. Not sure why that happened though, and I did not face it after the rebuild.
As perhaps expected, this also fixes installation on AWS.
@tanmaykm would you like me to add a FAQ to the installation document, noting this, for example?
Sure. That would be useful for others I think. Thanks.
I've followed the install instructions for JuliaBox on a basic AWS ubuntu instance, but the site doesn't load successfully. The FQDN is accessible (and the juliabox favicon loads) but the html only contains the following: