cassinyio / SwarmSpawner

This repo is deprecated. A spawner for JupyterHub
BSD 3-Clause "New" or "Revised" License

Docker Service not found #11

Closed wakonp closed 7 years ago

wakonp commented 7 years ago

Hi, I tried out your SwarmSpawner, extending the Deploy Docker Repo and I quickly ran into some problems.

First of all, your example jupyter_config.py file needs to be updated on line 35, where you still use string args. As for my main problem: after installing SwarmSpawner and launching JupyterHub as a service in Docker Swarm, everything seems to be okay. But when I log in, I get some weird errors that look like this:

JupyterHub is now running at http://127.0.0.1:443/
Getting Docker service 'jupyter-3b4bd7fa0504fe5d059e1b0f2655c56b-1'
Docker service 'jupyter-3b4bd7fa0504fe5d059e1b0f2655c56b-1' is gone
Docker service not found
user_options:
Getting Docker service 'jupyter-3b4bd7fa0504fe5d059e1b0f2655c56b-1'
Docker service 'jupyter-3b4bd7fa0504fe5d059e1b0f2655c56b-1' is gone
Unhandled error starting wakonp's server: 500 Server Error: Internal Server Error ("rpc error: code = 3 desc = invalid cpu value 1e-06: Must be at least 0.001")
Getting Docker service 'jupyter-3b4bd7fa0504fe5d059e1b0f2655c56b-1'
Docker service 'jupyter-3b4bd7fa0504fe5d059e1b0f2655c56b-1' is gone
Docker service not found
Uncaught exception POST /hub/login?next= (10.255.0.2)

This error tells me two things: the JupyterHub service can't find the generated notebook service, and my server has an internal server error because of an invalid cpu value. Let's check out my jupyterhub_config.py file.

import os
import subprocess
import errno
import stat

c = get_config()
pwd = os.path.dirname(__file__)

c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.port = 443
c.JupyterHub.hub_ip = '0.0.0.0'

c.JupyterHub.spawner_class = 'cassinyspawner.SwarmSpawner'
c.JupyterHub.cleanup_servers = False

c.SwarmSpawner.start_timeout = 60 * 5
c.SwarmSpawner.jupyterhub_service_name = 'jupyterhub_jupyterhub'

c.SwarmSpawner.use_user_options = False
c.SwarmSpawner.networks = ["jupyterhub-network"]
notebook_dir = os.environ.get('NOTEBOOK_DIR') or '/home/jovyan/work'
c.SwarmSpawner.notebook_dir = notebook_dir
mounts = [{'type' : 'volume',
           'source' : 'jupyterhub-user-{username}',
           'target' : notebook_dir}]       
c.SwarmSpawner.container_spec = {
                  'args' : ['start-singleuser.sh'],
                  'Image' :'jupyter/scipy-notebook:bb222f49222e',
                  'mounts' : mounts
          }
c.SwarmSpawner.resource_spec = {
                'cpu_limit' : 1000, # (int) – CPU limit in units of 10^9 CPU shares.
                'mem_limit' : int(512 * 1e6), # (int) – Memory limit in Bytes.
                'cpu_reservation' : 1000, # (int) – CPU reservation in units of 10^9 CPU shares.
                'mem_reservation' : int(512 * 1e6), # (int) – Memory reservation in Bytes
                }

As you can see, I use your config for c.SwarmSpawner.resource_spec and I explicitly set c.SwarmSpawner.use_user_options to False. So I don't understand where the error Internal Server Error ("rpc error: code = 3 desc = invalid cpu value 1e-06: Must be at least 0.001") comes from.

Some other interesting files: Dockerfile.Jupyterhub

# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
FROM jupyterhub/jupyterhub-onbuild:0.7.2

# Install dockerspawner and its dependencies
#    oauthenticator==0.5.* \
# dockerspawner==0.7.* \
RUN /opt/conda/bin/pip install \
    jupyterhub-ldapauthenticator

RUN git clone https://github.com/cassinyio/SwarmSpawner
WORKDIR ./SwarmSpawner
RUN pip install -r requirements.txt
RUN python setup.py install
WORKDIR ..

# install docker on the jupyterhub container
RUN wget https://get.docker.com -q -O /tmp/getdocker && \
    chmod +x /tmp/getdocker && \
    sh /tmp/getdocker

# Copy TLS certificate and key
ENV SSL_CERT /srv/jupyterhub/secrets/jupyterhub.crt
ENV SSL_KEY /srv/jupyterhub/secrets/jupyterhub.key
COPY ./secrets/*.crt $SSL_CERT
COPY ./secrets/*.key $SSL_KEY
RUN chmod 700 /srv/jupyterhub/secrets && \
    chmod 600 /srv/jupyterhub/secrets/*

COPY ./userlist /srv/jupyterhub/userlist

EXPOSE 8080
EXPOSE 8081

Docker-compose.yml (I run it with docker stack deploy -c docker-compose.yml jupyterhub, so the service name of Jupyterhub will be jupyterhub_jupyterhub)

version: "3.2"

services:
  jupyterhub:
    build:
      context: .
      dockerfile: Dockerfile.jupyterhub
    image: walki12/jupyterhub
    container_name: jupyterhub
    volumes:
      - type: bind
        source: /var/run/docker.sock
        target: /var/run/docker.sock
      - type: volume
        source: data
        target: /data
        volume: 
          nocopy: true
    ports:
      - "443:443"
    env_file:
      - .env
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: continue
        monitor: 60s
        max_failure_ratio: 0.3
    command: >
      jupyterhub -f /srv/jupyterhub/jupyterhub_config.py

volumes:
  data:
    external:
      name: jupyterhub-data

networks:
  default:
    external: 
      name: jupyterhub-network

The Volume jupyterhub-data is available and the jupyterhub-network is an attachable overlay network.

Docker version 17.06.0-ce, build 02c1d87

I don't understand why this configuration does not work correctly, and I would really appreciate an explanation for this scenario.

wakonp commented 7 years ago

Short update: I completely removed the c.SwarmSpawner.resource_spec object from jupyterhub_config.py, and a new error appears.

Getting Docker service 'notebook-3b4bd7fa0504fe5d059e1b0f2655c56b-1'
Docker service 'notebook-3b4bd7fa0504fe5d059e1b0f2655c56b-1' is gone
Docker service not found
user_options:
Getting Docker service 'notebook-3b4bd7fa0504fe5d059e1b0f2655c56b-1'
Docker service 'notebook-3b4bd7fa0504fe5d059e1b0f2655c56b-1' is gone
Unhandled error starting wakonp's server: module 'docker.auth' has no attribute 'get_config_header'
Getting Docker service 'notebook-3b4bd7fa0504fe5d059e1b0f2655c56b-1'
Docker service 'notebook-3b4bd7fa0504fe5d059e1b0f2655c56b-1' is gone
Docker service not found
Uncaught exception POST /hub/login?next= (::ffff:10.255.0.2)
    HTTPServerRequest(protocol='https', host='10.15.202.10', method='POST', uri='/hub/login?next=', version='HTTP/1.1', remote_ip='::ffff:10.255.0.2', headers={'X-Forwarded-For': '::ffff:10.255.0.2', 'X-Forwarded-Port': '443', 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 (KHTML, like Gecko) Version/10.1.1 Safari/603.2.4', 'Dnt': '1', 'X-Forwarded-Host': '10.15.202.10', 'Content-Type': 'application/x-www-form-urlencoded', 'Accept-Encoding': 'gzip, deflate', 'Referer': 'https://10.15.202.10/hub/login', 'Connection': 'close', 'X-Forwarded-Proto': 'https', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'Content-Length': '35', 'Accept-Language': 'de-at', 'Origin': 'https://10.15.202.10', 'Host': '10.15.202.10'})
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.5/site-packages/tornado/web.py", line 1469, in _execute
        result = yield result
      File "/opt/conda/lib/python3.5/site-packages/jupyterhub/handlers/login.py", line 84, in post
        yield self.spawn_single_user(user)
      File "/opt/conda/lib/python3.5/site-packages/jupyterhub/handlers/base.py", line 328, in spawn_single_user
        yield gen.with_timeout(timedelta(seconds=self.slow_spawn_timeout), f)
      File "/opt/conda/lib/python3.5/site-packages/jupyterhub/user.py", line 261, in spawn
        raise e
      File "/opt/conda/lib/python3.5/site-packages/jupyterhub/user.py", line 229, in spawn
        ip_port = yield gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
      File "/opt/conda/lib/python3.5/site-packages/swarmspawner-0.1-py3.5.egg/cassinyspawner/swarmspawner.py", line 309, in start
        networks=networks)
      File "/opt/conda/lib/python3.5/concurrent/futures/_base.py", line 398, in result
        return self.__get_result()
      File "/opt/conda/lib/python3.5/concurrent/futures/_base.py", line 357, in __get_result
        raise self._exception
      File "/opt/conda/lib/python3.5/concurrent/futures/thread.py", line 55, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/opt/conda/lib/python3.5/site-packages/swarmspawner-0.1-py3.5.egg/cassinyspawner/swarmspawner.py", line 180, in _docker
        return m(*args, **kwargs)
      File "/opt/conda/lib/python3.5/site-packages/docker/utils/decorators.py", line 34, in wrapper
        return f(self, *args, **kwargs)
      File "/opt/conda/lib/python3.5/site-packages/docker/api/service.py", line 101, in create_service
        auth_header = auth.get_config_header(self, registry)
    AttributeError: module 'docker.auth' has no attribute 'get_config_header'

So I assume this config entry is needed :D

wakonp commented 7 years ago

Another Update:

I fixed the problem by uninstalling docker-py and reinstalling docker with pip in my Dockerfile.jupyterhub.
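A conflict like this can be confirmed by checking which distributions are installed. Both docker-py (the old name) and docker install a top-level `docker` package, so having both present can leave a mix of old and new files, which is the likely cause of the missing `docker.auth.get_config_header`. The following is a minimal sketch, assuming Python 3.8+ for `importlib.metadata` (the hub image above ships Python 3.5, where `pip list` would be the equivalent check). `installed_versions` is a hypothetical helper name, not part of any library.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(names=("docker", "docker-py")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    found = installed_versions()
    print(found)
    if found["docker"] and found["docker-py"]:
        # Both distributions write into the same 'docker' package directory.
        print("Both 'docker' and 'docker-py' are installed; remove both and reinstall 'docker'.")
```

If both entries come back with a version, uninstalling both and reinstalling only docker, as the Dockerfile below does, gives a clean package.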

Dockerfile Jupyterhub

# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
FROM jupyterhub/jupyterhub-onbuild:0.7.2

# Install dockerspawner and its dependencies
#    oauthenticator==0.5.* \
RUN /opt/conda/bin/pip install \
    dockerspawner==0.7.* \
    jupyterhub-ldapauthenticator

RUN git clone https://github.com/cassinyio/SwarmSpawner
WORKDIR ./SwarmSpawner
RUN pip install -r requirements.txt
RUN python setup.py install
WORKDIR ..
#For Higher Version

# install docker on the jupyterhub container
RUN wget https://get.docker.com -q -O /tmp/getdocker && \
    chmod +x /tmp/getdocker && \
    sh /tmp/getdocker

# Copy TLS certificate and key
ENV SSL_CERT /srv/jupyterhub/secrets/jupyterhub.crt
ENV SSL_KEY /srv/jupyterhub/secrets/jupyterhub.key
COPY ./secrets/*.crt $SSL_CERT
COPY ./secrets/*.key $SSL_KEY
RUN chmod 700 /srv/jupyterhub/secrets && \
    chmod 600 /srv/jupyterhub/secrets/*

COPY ./userlist /srv/jupyterhub/userlist
RUN pip uninstall --yes docker docker-py ; pip install docker
EXPOSE 8080
EXPOSE 8081

jupyterhub_config.py

# Configuration file for JupyterHub
import os
import subprocess
import errno
import stat

c = get_config()
pwd = os.path.dirname(__file__)

# TLS config
c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.port = 443
c.JupyterHub.hub_ip = '0.0.0.0'

c.JupyterHub.spawner_class = 'cassinyspawner.SwarmSpawner'
c.JupyterHub.cleanup_servers = False
c.SwarmSpawner.start_timeout = 60 * 10
c.SwarmSpawner.jupyterhub_service_name = 'jupyterhub_jupyterhub'
c.SwarmSpawner.service_prefix = "jupyterhub"
c.SwarmSpawner.networks = ["jupyterhub-network"]
notebook_dir = os.environ.get('NOTEBOOK_DIR') or '/home/jovyan/work'
c.SwarmSpawner.notebook_dir = notebook_dir  
c.SwarmSpawner.container_spec = {
            'args' : ['start-singleuser.sh'],
            'Image' :'jupyter/scipy-notebook:bb222f49222e',
            'mounts' : [{'type' : 'volume',
            'source' : 'jupyterhub-user-{username}',
            'target' : '/home/jovyan/work'}]
          }

c.SwarmSpawner.resource_spec = {}
#c.SwarmSpawner.resource_spec = {
#                'cpu_limit' : 1000, 
#                'mem_limit' : int(512 * 1e6),
#                'cpu_reservation' : 1000, 
#                'mem_reservation' : int(512 * 1e6)
#                }

But I still cannot define the resource_spec.

barrachri commented 7 years ago

> So I dont understand where the error Internal Server Error ("rpc error: code = 3 desc = invalid cpu value 1e-06: Must be at least 0.001") comes from.

The problem is the cpu limit you set inside the resource_spec. That should be at least 0.001.

wakonp commented 7 years ago

But my c.SwarmSpawner.resource_spec was set to (like your example config tells me to do):

c.SwarmSpawner.resource_spec = {
                'cpu_limit' : 1000, 
                'mem_limit' : int(512 * 1e6),
                'cpu_reservation' : 1000, 
                'mem_reservation' : int(512 * 1e6)
                }

And I think 1000 is at least 0.001.

barrachri commented 7 years ago

> And I think 1000 is at least 0.001.

No, it is not, because the value is interpreted as 1000 / 1e9 = 1e-06 CPUs, which is the value shown in the error message.

I need to update the docs. There should be an issue related to this, but PRs are welcome :)
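To make the arithmetic concrete: the config comments earlier in the thread describe cpu_limit as being in units of 10^9 per CPU, and the error message gives a minimum of 0.001 CPUs, so the smallest accepted value would be 0.001 * 1e9 = 1,000,000. Here is a minimal sketch of a conversion helper. `cpus_to_units`, `UNITS_PER_CPU` and `MIN_CPUS` are hypothetical names introduced for illustration, not part of SwarmSpawner or docker-py, and the choice of one full CPU per notebook is an assumption about the intended limit.

```python
# Hypothetical helper, not part of SwarmSpawner or docker-py.
UNITS_PER_CPU = 10**9   # resource_spec CPU values are in units of 1e-9 CPUs
MIN_CPUS = 0.001        # minimum taken from the Swarm error message


def cpus_to_units(cpus):
    """Convert a number of CPUs (e.g. 0.5) to the integer used in resource_spec."""
    if cpus < MIN_CPUS:
        raise ValueError("cpu value %r is below the minimum of %r" % (cpus, MIN_CPUS))
    return int(round(cpus * UNITS_PER_CPU))


# The value from the original config corresponds to 1e-06 CPUs, hence the error.
original_cpus = 1000 / UNITS_PER_CPU

# Assumed intent: one full CPU and 512 MB per notebook service.
resource_spec = {
    'cpu_limit': cpus_to_units(1),
    'mem_limit': int(512 * 1e6),
    'cpu_reservation': cpus_to_units(1),
    'mem_reservation': int(512 * 1e6),
}
```

In jupyterhub_config.py the resulting dictionary would be assigned to c.SwarmSpawner.resource_spec in place of the values of 1000.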

barrachri commented 7 years ago

I think this is resolved :)