Closed oderwat closed 2 weeks ago
Hey Hans, this sounds like a reasonable tool to add to the Dockerfile -- gonb is usually used together with something else, and ssh is often the tool to connect to many types of "something else".
Just to clarify:
- You mean adding the openssh-client package, right ? (and not adding an ssh server to the docker)
- The alternative would be your own Dockerfile (with RUN apt install openssh-client), plus your own free account on Docker Hub to upload your version, with the extra maintenance cost of having to re-run it every time there is a new GoNB release. That is not ideal for you, right ?
With ssh available one could even write a helper that sets up a private repository from a notebook. And you could set up port forwarding or fetch files. I think it would also be beneficial if rsync, curl and maybe also wget were available.
Another possibility would be to add jovyan to the sudoers (without password). But this would promote the user to satanic levels :)
Actually, wget is already included. And it makes sense to add ssh, curl and rsync -- anything that helps one to use the docker as is.
Now about adding jovyan to /etc/sudoers: I'm not against it, since the docker always runs in a container, which is presumably sandboxed. I mean, escalating from jovyan to root doesn't gain an attacker much in terms of how they can influence the world outside the container. Or am I wrong ? I have the feeling that not giving jovyan sudo powers is just an annoyance to legitimate users, and doesn't hinder attackers in any way. WDYT ?
Let me put together a PR.
I also think that you can add jovyan, but then (at least) a user can destroy the image if not careful. But I don't care, as it can easily be rebuilt.
I was thinking that when using Google Colab (colab.research.google) I was always able to !apt install ... stuff -- and I recall that was important.
I just checked in Colab and it runs as root by default.
So I'll follow suit and add jovyan to /etc/sudoers.
Yes, the original image is not destroyed in Docker: when it runs the image in a container, it forks it for the container (I think).
Sry, I think I won't have the time to rebuild/test/deploy new docker tonight. But first thing tomorrow.
What do you think of PR #140 ?
I compromised by giving sudo privileges only to apt update and apt install *, to allow users to install arbitrary official packages (they are not able to change the apt sources presumably).
Something I was considering: if someone wants to include a library that uses CGO, they will need to install gcc in the docker. But I didn't want to install it by default.
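The restricted sudo rule described above could look roughly like the following. This is a minimal sketch, not the rule actually merged in PR #140: the helper name `write_apt_sudoers`, the drop-in file name and the `/usr/bin/apt` path are assumptions for illustration.

```shell
#!/bin/bash
# Sketch only: the real rule in the GoNB Dockerfile may differ.
# write_apt_sudoers USER FILE (hypothetical helper): writes a sudoers drop-in
# that lets USER run only "apt update" and "apt install <args>" as root,
# without a password. Assumes apt lives at /usr/bin/apt.
write_apt_sudoers() {
  local user="$1" file="$2"
  printf '%s ALL=(root) NOPASSWD: /usr/bin/apt update, /usr/bin/apt install *\n' \
    "$user" > "$file"
  # sudo expects drop-in files to be read-only
  chmod 0440 "$file"
}

# Example (as root, e.g. from a Dockerfile RUN step; file name is made up):
#   write_apt_sudoers jovyan /etc/sudoers.d/jovyan-apt
#   visudo -cf /etc/sudoers.d/jovyan-apt   # syntax check before relying on it
```

With such a rule in place, `sudo apt update` and `sudo apt install gcc` would work from a notebook, while other commands run through sudo would still be refused.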
LGTM
One more thing: What if you add the possibility to execute a script when the container starts?
I think this could be done by adding an entrypoint.sh script that checks for a file like /autostart.sh and, if it is available, runs it before the actual tini call.
This way one could add more stuff to the container on startup (I would install some of our internal tooling for example). This could also be used to adjust the UID/GID of the user or do the private repo setup and other stuff.
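The proposed startup hook can be sketched as follows. This is only an illustration of the idea, not the entrypoint that ended up in the image: the function name `run_autostart` and the `AUTOSTART` override variable are made up, and the final `exec` stands in for whatever the image really starts (the tini call mentioned above).

```shell
#!/bin/bash
# Hypothetical entrypoint sketch, not the actual GoNB entrypoint.
# AUTOSTART is an assumed override; it defaults to the proposed /autostart.sh.
AUTOSTART="${AUTOSTART:-/autostart.sh}"

# run_autostart FILE: run FILE with bash if it exists, otherwise do nothing.
run_autostart() {
  local script="$1"
  if [ -f "$script" ]; then
    echo "Running ${script}..."
    # Invoke through bash so a mounted file does not need the executable bit.
    bash "$script" || echo "Warning: ${script} exited with status $?" >&2
  fi
}

run_autostart "$AUTOSTART"

# Hand over to the container's main command, if one was given.
if [ "$#" -gt 0 ]; then
  exec "$@"
fi
```

A failing autostart script only prints a warning here, so a typo in the mounted file would not prevent the container from starting; aborting instead would be an equally defensible choice.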
And the script would be mounted from the host mounted directory ? Should it be run as root ?
Yes, that is what I imagine.
Took a little fiddling around but pls check it out:
- The README.md file.
- janpfeifer/gonb_jupyterlab:v0.10.5-20241015 with the autostart.sh support.
Would you double check it works for you ? I tested here, and it seems to be working ... but let me make sure it works for your use case.
cheers
I will check that out asap.
I just tried the new docker image. At first, I ran into problems because I had the container run with user: 1000:100 to have the correct access rights for the volumes.
I think that is not really needed though. But one needs to watch out that all shared directories and files which docker (or the startup script) create have working access rights for jovyan (see below).
The second difficulty arises when using autostart.sh with go install .... One will run into a failure when later trying to import some stuff from inside the notebook. Go complains about access violations because GOMODCACHE is partially created by root (with umask 022) in that case.
There are different workarounds I considered:
- setting umask 000 in the autostart.sh
- running the installation as jovyan (after using su -l jovyan)
- updating the access rights afterwards
I tested all three and ended up updating the access rights (the other approaches failed for me).
This is my latest autostart.sh. It took some time to get the locale stuff working.
echo "Configuring system..."
apt-get update
# I want vim
apt-get install -y vim
# set German timezone (so time.Now() returns German time)
apt-get install -y tzdata
ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime
# some locale magic to make "date" answer with German format
echo 'de_DE.UTF-8 UTF-8' >> /etc/locale.gen
locale-gen
echo 'LC_ALL="de_DE.utf8"' > /etc/default/locale
export LC_ALL="de_DE.UTF-8"
dpkg-reconfigure locales
# check if it worked
date
# installing Go tools
go install github.com/nats-io/natscli/nats@latest
chown -R jovyan:users /opt/go
Yes, the user that will run Jupyter is configured as $NB_USER (the variable is exported) == "jovyan" -- this is part of the JupyterLab docker (jupyter/base-notebook) on which this one is based.
Hmm, that situation in Go is wrong. What happens is that I set the $GOPATH in the Dockerfile. The root user should have its own $GOPATH -- so what is installed by root is owned by root, and what is installed by jovyan is owned by jovyan. I'll fix that.
Another problem is that I have to manually export stuff when running as "jovyan". The line used to run JupyterLab is:
su --preserve-environment $NB_USER -c "export PATH=${PATH} ; jupyter lab"
I'll try to create a .profile (or .bashrc) for "jovyan" that sets all the variables, so the standard su -l jovyan will do the trick.
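One way the profile idea could be sketched is below. It is a guess at the approach, not what the image actually does: the helper name `write_profile` and the variable list in the example are hypothetical, and it assumes the user's login shell is bash (the `%q` quoting is bash-specific).

```shell
#!/bin/bash
# write_profile FILE VAR... (hypothetical helper): append one
# "export VAR=value" line per named variable, using its current value,
# so that a login shell started by "su -l" picks the values up again.
write_profile() {
  local file="$1"
  shift
  local name
  for name in "$@"; do
    # ${!name} is bash indirect expansion; %q quotes the value for re-reading.
    printf 'export %s=%q\n' "$name" "${!name}" >> "$file"
  done
}

# Example (run as root at image build or container start; list is assumed):
#   write_profile "/home/${NB_USER}/.profile" PATH GOPATH
#   su -l "$NB_USER" -c "jupyter lab"
```

Because the values are captured when the file is written, later changes to root's environment would not reach the user unless the file is regenerated.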
Ok, I think I got this working.
Here is how your updated autostart.sh should look -- I hope you don't mind, I added it as an example in the documentation:
#!/bin/bash
echo "Configuring system..."
#apt-get update
# I want vim
#apt-get install -y vim
# set German timezone (so time.Now() returns German time)
apt-get install -y tzdata
ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime
# some locale magic to make "date" answer with German format
echo 'de_DE.UTF-8 UTF-8' >> /etc/locale.gen
locale-gen
echo 'LC_ALL="de_DE.utf8"' > /etc/default/locale
export LC_ALL="de_DE.UTF-8"
dpkg-reconfigure locales
# check if it worked
date
# Installing Go tools for $NB_USER
su -l "$NB_USER" -c "go install github.com/nats-io/natscli/nats@latest"
This is the result in the notebook:
Latest version uploaded again to janpfeifer/gonb_jupyterlab:v0.10.5-20241015, if you want to try it out.
Btw, thanks for checking it.
I tried it and it works excellently!
Nice, closing this one. After the other features are in I'll cut a new release.
A quick question: Wouldn't it be better to add tzdata in the image already? Go uses the timezone data of the system afaik:
time.Local, err = time.LoadLocation("US/Pacific") // <- this only works with tzdata installed
if err != nil {
	fmt.Printf("Error: %v\n", err)
}
now = time.Now()
Oh, I was not aware of it. Yes, let me add it to the Dockerfile.
Done in #142
Included in the v0.10.6 release. The Docker image is also already available.
Hi @janpfeifer,
TL;DR: Would you consider adding OpenSSH to the Docker image to simplify the setup for private repositories?
Full Story:
I've been using the original gonb Docker image to create a 'gonb'-based hub with shared code and services for experimenting, sharing, and explaining code among colleagues.
This setup is essentially a Docker Compose configuration with multiple gonb containers (one for each user) that share some directories containing notebooks. It runs on one of our high-performance servers and also provides several services (like MariaDB, PostgreSQL, MSSQL, NATS, and Clickhouse) which we use for development.
I even wrote a crude REST API mirroring tool. Using this, I can share API services from any machine to another using NATS. In this case, I use it to share AI services from my Windows machine (running Ollama, SD-WebUI, Coqui TTS) with the gonb hub. To accomplish this, I run a server (written in Go) inside WSL 2 on the Windows RTX 3090 Ti machine and a Docker container for the API endpoint in the hub. This works surprisingly well given the minimal effort I've invested so far. It even runs Jupyter, although it's missing web sockets and currently doesn't support requests larger than the NATS message limit (2 MB in this case).
To integrate our private repositories, I created a shared OpenSSH token and added it to a special gonb user in Gitea. I then used shared paths to add the necessary SSH and Git configuration files for private repo access into the original container. However, I encountered a problem: there is no ssh in the gonb container, and adding ssh from the host doesn't work due to glibc incompatibilities. I found a statically linked OpenSSH binary and currently add the ssh command from there into the container.
I would prefer if the original gonb image had OpenSSH installed. Perhaps it could even set up private repository access when provided with certain environment variables. It could create the necessary files and run a key scan for the specified hosts.
Thank you for considering this suggestion!