ml-tooling / ml-hub

🧰 Multi-user development platform for machine learning teams. Simple to setup within minutes.
Apache License 2.0

Using --hostname <domain> argument of docker/docker-compose leads to config error #11

Open herrfeder opened 4 years ago

herrfeder commented 4 years ago

Describe the bug:

Starting the container with the --hostname argument of docker (or the equivalent hostname setting in docker-compose) leads to the following error:

[E 2020-06-16 16:33:50.626 JupyterHub app:2718]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/jupyterhub/app.py", line 2715, in launch_instance_async
    await self.initialize(argv)
  File "/usr/local/lib/python3.6/dist-packages/jupyterhub/app.py", line 2238, in initial
    self.load_config_file(self.config_file)
  File "<decorator-gen-5>", line 2, in load_config_file
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 602, in load_config_file
    raise_config_file_errors=self.raise_config_file_errors,
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 563, in _load_config_files
    config = loader.load_config()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/loader.py", line 457, in load_config
    self._read_file_as_dict()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/loader.py", line 489, in _read_file_as_dict
    py3compat.execfile(conf_filename, namespace)
  File "/usr/local/lib/python3.6/dist-packages/ipython_genutils/py3compat.py", line 198, in execfile
    exec(compiler(f.read(), fname, 'exec'), glob, loc)
  File "/resources/jupyterhub_config.py", line 188, in <module>
    container = docker_client.containers.list(filters={"id": socket.gethostname()})[0]
IndexError: list index out of range

Without setting a hostname (no error):

# docker exec -it mlhub /bin/bash
root@4f1f287080b5:/# python
Python 3.6.8 (default, Jan 14 2019, 11:02:34)
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.gethostname()
'4f1f287080b5' 

With a hostname set (raises the error above):

# docker exec -it mlhub /bin/bash
root@hub:/# python
Python 3.6.8 (default, Jan 14 2019, 11:02:34)
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.gethostname()
'hub.local.domain.de' 

In the first case the hostname is identical to the (short) container ID, so the filter matches and the resulting list contains one element. In the second case the hostname differs from the container ID, so the resulting list is empty and indexing its first element raises the IndexError shown above.
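To make the mismatch concrete, here is a small simulation that does not talk to Docker at all. `filter_by_id` is a hypothetical stand-in that only approximates how an ID filter behaves, and `FULL_ID` is a made-up 64-character ID built from the short ID seen in the session above.

```python
# Made-up full container ID: the short ID from the session above, padded to 64 chars.
FULL_ID = "4f1f287080b5" + "0" * 52


def filter_by_id(container_ids, needle):
    """Rough stand-in for an ID filter: keep the IDs that contain `needle`."""
    return [cid for cid in container_ids if needle in cid]


default_hostname = FULL_ID[:12]          # hostname Docker assigns without --hostname
custom_hostname = "hub.local.domain.de"  # hostname passed via --hostname

matches_default = filter_by_id([FULL_ID], default_hostname)
matches_custom = filter_by_id([FULL_ID], custom_hostname)

print(len(matches_default))  # one match, so [0] works
print(len(matches_custom))   # no match, so [0] raises IndexError
```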

I guess this lookup is meant as a generic way of identifying the current JupyterHub container?! I don't fully understand its purpose.

Expected behaviour:

I would expect the JupyterHub config to load successfully regardless of whether an additional hostname is set.
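One way this expectation could be met is a lookup that tries several candidate identifiers and returns `None` instead of indexing an empty list. This is only a sketch, not the project's code: `find_own_container` is a hypothetical helper, and it assumes a client object that exposes `containers.list(filters=...)` the way the config's `docker_client` does.

```python
import socket


def find_own_container(docker_client, candidate_ids):
    """Return the first container matching any candidate ID, or None.

    Hypothetical helper: never indexes into an empty result list.
    """
    for candidate in candidate_ids:
        if not candidate:
            continue  # skip empty or missing candidates
        matches = docker_client.containers.list(filters={"id": candidate})
        if matches:
            return matches[0]
    return None


# Possible usage inside the config (assumes `docker_client` already exists):
# container = find_own_container(docker_client, [socket.gethostname()])
# if container is None:
#     ...  # skip the rename step or fall back to another ID source
```

With this shape, a custom hostname would lead to a `None` result that the config can handle explicitly, rather than an uncaught `IndexError`.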

Reproduce the Bug:

Start the ml-hub container with the --hostname option:

docker run --rm --network rp_backend --hostname hub.local.domain.de -v /var/run/docker.sock:/var/run/docker.sock --name mlhub  ml_hub:latest 

or set hostname: field in docker-compose:

Technical details:

Possible Fix:

  1. Potential Option: Ignore the block completely
docker_client = utils.init_docker_client(client_kwargs, tls_config)
try:
    pass
    # container = docker_client.containers.list(filters={"id": socket.gethostname()})[0]
    # container_name = socket.gethostname()
    # if container_name.lower() != ENV_HUB_NAME:
    #     container_name.rename(ENV_HUB_NAME.lower())
except docker.errors.APIError as e:
    logger.error("Could not correctly start MLHub container. " + str(e))
    os.kill(os.getpid(), signal.SIGTERM)
  2. Potential Option: Use another method of getting the container ID from inside the container, e.g. by reading and grepping the /proc/... info
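The second option could look roughly like the sketch below. It rests on assumptions not stated in this issue: that the file to read is `/proc/self/cgroup`, and that it contains the 64-character hex container ID (which tends to hold on cgroup v1 hosts but often not on cgroup v2). `parse_container_id` and `read_container_id` are hypothetical helper names.

```python
import re
from pathlib import Path
from typing import Optional

# Docker container IDs are 64 lowercase hex characters.
_CONTAINER_ID_RE = re.compile(r"[0-9a-f]{64}")


def parse_container_id(cgroup_text: str) -> Optional[str]:
    """Return the first 64-char hex ID found in cgroup-style text, or None."""
    match = _CONTAINER_ID_RE.search(cgroup_text)
    return match.group(0) if match else None


def read_container_id(path: str = "/proc/self/cgroup") -> Optional[str]:
    """Read `path` (an assumed location) and extract a container ID, or None."""
    try:
        return parse_container_id(Path(path).read_text())
    except OSError:
        # File missing or unreadable, e.g. when not running on Linux.
        return None
```

Because the result can be `None`, the config would still need a fallback (for example the hostname) when no ID is found.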