Open buhl opened 4 years ago
This looks promising. Would you like to contribute a pull request for this?
I would recommend just going after the ID directly, rather than name first, and we'd probably want protection from the file not existing in the first place.
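One reading of the suggestion above, sketched in Python: take the numeric group ID straight from the socket file and return nothing if the file is absent. The function names and the default path are assumptions for illustration, not code from the project.

```python
import os


def socket_group_id(path="/var/run/docker.sock"):
    """Return the numeric GID that owns `path`, or None if it does not exist."""
    try:
        return os.stat(path).st_gid
    except FileNotFoundError:
        # Protection for hosts where the socket is not mounted at all.
        return None


def process_has_group(gid):
    """True if `gid` is the effective or a supplementary group of this process."""
    if gid is None:
        return False
    return gid == os.getegid() or gid in os.getgroups()


if __name__ == "__main__":
    gid = socket_group_id()
    print("socket gid:", gid, "| accessible via group:", process_has_group(gid))
```

Working from the numeric ID avoids depending on a group name existing inside the container.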
Thank you for opening this issue!
Hi @kevin-bates Sorry for the late reply, I just returned from a vacation. I will try to get some time to fit this solution into a working example for you to look at.
Right on - no worries. Welcome back!
Hi @kevin-bates
I made some changes to
https://github.com/buhl/enterprise_gateway/blob/master/etc/docker/enterprise-gateway/Dockerfile
and
https://github.com/buhl/enterprise_gateway/blob/master/etc/docker/enterprise-gateway/start-enterprise-gateway.sh
So now the jovyan user is added to the docker group. I have spent most of two evenings trying to get the enterprise gateway to build and run, and I am not all the way there yet.
I can now start an enterprise gateway with docker-compose up, but I can't seem to get the enterprise gateway to work (I get {"reason": "Not Found", "message": ""} on all requests).
However the jovyan user can talk to the docker daemon as demonstrated below:
enterprise_gateway/etc/docker $ docker-compose exec enterprise-gateway /bin/bash
root@ab7012645bbe:/usr/local/bin# su - jovyan
jovyan@ab7012645bbe:~$ id
uid=1000(jovyan) gid=100(users) groups=100(users),999(docker)
jovyan@ab7012645bbe:~$ ps wwwfaux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
[...]
root 1 0.0 0.0 4520 792 ? Ss 20:42 0:00 tini -g -- /usr/local/bin/start-enterprise-gateway.sh
root 6 0.0 0.0 11588 3132 ? S 20:42 0:00 /bin/bash /usr/local/bin/start-enterprise-gateway.sh
root 18 0.0 0.0 50940 3452 ? S 20:42 0:00 \_ su --preserve-environment jovyan -c /opt/conda/bin/jupyter enterprisegateway .--log-level=DEBUG .--MappingKernelManager.cull_idle_timeout=3600 .--MappingKernelManager.cull_interval=60 .--MappingKernelManager.cull_connected=False
jovyan 19 0.1 0.2 73988 48756 ? Ss 20:42 0:01 \_ /opt/conda/bin/python /opt/conda/bin/jupyter-enterprisegateway --log-level=DEBUG --MappingKernelManager.cull_idle_timeout=3600 --MappingKernelManager.cull_interval=60 --MappingKernelManager.cull_connected=False
jovyan@ab7012645bbe:~$ curl --unix-socket /var/run/docker.sock http://localhost/containers/json
[{"Id":"ab7012645bbee25a3dfa432050b4b96e1d7e0db80619b6cee7715e7909abd596","Names":["/docker_enterprise-gateway_1"],"Image":"elyra/enterprise-gateway:dev","ImageID":"sha256:4d5551596ca09de98fa6372f613a0ed30b9201f1b2c42f3bd8f8de1c348aa8af","Command":"tini -g -- /usr/local/bin/start-enterprise-gateway.sh","Created":1581972148,"Ports":[{"IP":"0.0.0.0","PrivatePort":8888,"PublicPort":8888,"Type":"tcp"}],"Labels":{"app":"enterprise-gateway","com.docker.compose.config-hash":"82055d9f2565b2df47f89c910fefe89c84321473a215b6b755b92b4ed638108f","com.docker.compose.container-number":"1","com.docker.compose.oneoff":"False","com.docker.compose.project":"docker","com.docker.compose.service":"enterprise-gateway","com.docker.compose.version":"1.24.1","component":"enterprise-gateway","maintainer":"Jupyter Project <jupyter@googlegroups.com>"},"State":"running","Status":"Up 16 minutes","HostConfig":{"NetworkMode":"docker_enterprise-gateway"},"NetworkSettings":{"Networks":{"docker_enterprise-gateway":{"IPAMConfig":null,"Links":null,"Aliases":null,"NetworkID":"3b4760bd7fd203154baeca0fcfdd71ea048f1b6a9c5fe870a31842ac09188e67","EndpointID":"f6fad93642d0bf5ff39812d90a3234f3693c70df5be4ae6b52d575746bff0620","Gateway":"172.20.0.1","IPAddress":"172.20.0.2","IPPrefixLen":16,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0,"MacAddress":"02:42:ac:14:00:02","DriverOpts":null}}},"Mounts":[{"Type":"bind","Source":"/var/run/docker.sock","Destination":"/var/run/docker.sock","Mode":"rw","RW":true,"Propagation":"rprivate"}]}]
jovyan@ab7012645bbe:~$
I feel I have to say that running the service as a non-root user in the container, but giving it access to docker.sock, is effectively like giving the jovyan user root on the host machine :) So if this exercise is only about dropping root for its own sake, it might not be super important.
Well, I don't really know what to do next?
Hi @buhl, sorry for the frustration. I agree, EG has a non-trivial build.
You do not need to worry about building the demo-base or enterprise-gateway-demo images. Those are purely for demo and YARN integration tests.
The make targets you'll need to invoke are: clean dist enterprise-gateway (the last of which builds the EG image). Target kernel-images builds the various kernel-related images, but those shouldn't have to change for this.
I agree with what you say about root and docker.sock. The requirement for operating in docker environments is that EG be able to start images, query running containers based on labels, etc. (discovery) and stop containers via the docker API. My understanding is that "docker in docker" requires docker.sock and mounting docker.sock requires root. So this may just need to be the way things are. Perhaps we just change the FIXME to a nasty WARNING message. :smile:
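As a rough illustration of the label-based discovery mentioned above, the sketch below builds the request path for the Docker Engine API's container list endpoint with a label filter. The helper name is hypothetical, and the label app=enterprise-gateway is simply the one visible in the container listing earlier in this thread.

```python
import json
from urllib.parse import urlencode


def containers_by_label_path(label, include_stopped=False):
    """Build the /containers/json request path filtered by one label.

    The Engine API expects `filters` to be a JSON-encoded map of
    filter name to a list of values.
    """
    query = {"filters": json.dumps({"label": [label]})}
    if include_stopped:
        # Without all=true only running containers are listed.
        query["all"] = "true"
    return "/containers/json?" + urlencode(query)


if __name__ == "__main__":
    print(containers_by_label_path("app=enterprise-gateway"))
```

The resulting path could be appended to http://localhost in a request over the Unix socket, as in the curl call shown earlier.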
At any rate, I'm hoping we can get your build working so you're free to check things out and make contributions.
Regarding runtime experiences...
What command are you issuing to produce {"reason": "Not Found", "message": ""}? I usually hit /api/kernelspecs as my litmus test that EG is able to service requests. You should get the JSON for each of the found kernelspecs returned.
If you're going through Notebook, then things could be tied up with an incorrect socket in your --gateway-url value or something like that. Does the EG log show anything on each request attempt?
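A minimal sketch of the /api/kernelspecs litmus test, for anyone who wants to script it. The base URL (port 8888, as published in the container listing above) and the helper names are assumptions. The response is assumed to follow the usual Jupyter shape, with a top-level "kernelspecs" object keyed by kernel name.

```python
import json
from urllib.request import urlopen


def kernelspecs_url(base_url):
    """Join the gateway base URL with the kernelspecs endpoint."""
    return base_url.rstrip("/") + "/api/kernelspecs"


def kernelspec_names(payload):
    """Return the sorted kernelspec names found in a response body."""
    return sorted(json.loads(payload).get("kernelspecs", {}))


def fetch_kernelspec_names(base_url, timeout=5):
    """Query a running gateway; raises on HTTP errors such as 404."""
    with urlopen(kernelspecs_url(base_url), timeout=timeout) as resp:
        return kernelspec_names(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed address; adjust to wherever the gateway is listening.
    print(kernelspecs_url("http://localhost:8888"))
```

Calling fetch_kernelspec_names("http://localhost:8888") against a healthy gateway should list the installed kernelspecs; an HTTP error here points at the gateway rather than at Notebook.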
Hi @kevin-bates
Great, thanks! I will try with the make targets later this week. I actually got enterprise-gateway to build and run. The /api/kernelspecs endpoint you suggested also seems to work, but I have yet to try and start a notebook.
I will try to clean up my branch, revert the unnecessary changes and attempt to submit a pull request.
I had a problem I didn't know how to solve, so I had to remove the --KernelSpecManager.whitelist=${EG_KERNEL_WHITELIST} option from the jupyter enterprisegateway initialization because I got the error traitlets.traitlets.TraitError: The 'whitelist' trait of a KernelSpecManager instance must be a set, but a value of class 'str' (i.e. '[r_docker,python_docker,python_tf_docker,scala_docker,spark_r_docker,spark_python_docker,spark_scala_docker]') was specified.
ok - yeah, set-based traitlets can be difficult to configure correctly. Looking at the appropriate files, and comparing them to other systems, I believe each of the items must be single-quoted, with the whole list enclosed in square brackets. Here are a couple of examples that should work:
https://github.com/jupyter/enterprise_gateway/blob/master/etc/kubernetes/enterprise-gateway.yaml#L141 https://github.com/jupyter/enterprise_gateway/blob/master/etc/docker/enterprise-gateway/start-enterprise-gateway.sh#L28
Were you trying to set up EG_KERNEL_WHITELIST with your own set of values? Or are things getting modified before their actual use?
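To illustrate why the quoting matters, here is a small sketch that assumes the command-line value is evaluated as a Python literal and kept as a plain string when that fails. This is an approximation of how the value appears to be handled, not the traitlets implementation, and parse_whitelist is a hypothetical helper.

```python
import ast


def parse_whitelist(raw):
    """Return a set of names if `raw` is a list/tuple/set literal, else `raw`."""
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Bare names such as r_docker are not literals, so the text stays a str.
        return raw
    if isinstance(value, (list, tuple, set)):
        return set(value)
    return raw


if __name__ == "__main__":
    unquoted = "[r_docker,python_docker]"
    quoted = "['r_docker','python_docker']"
    print(type(parse_whitelist(unquoted)).__name__)
    print(type(parse_whitelist(quoted)).__name__)
```

Under that assumption the unquoted form stays a string, which matches the "value of class 'str'" in the error, while the quoted form becomes a set.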
https://github.com/jupyter/enterprise_gateway/blob/02a7e0a1e59821b521f72b2f5ac56f21619a6cee/etc/docker/docker-compose.yml#L7 This problem might be solved by creating an entrypoint script like this one I use for the same problem on an alpine linux image
That gives the user access to the docker socket
Here's my test Dockerfile
If I misunderstood the problem or in any other way missed something, I do apologize.