pangeo-data / helm-chart

Pangeo helm charts
https://pangeo-data.github.io/helm-chart/

WIP: add docker push to deploy script #40

Closed rabernat closed 6 years ago

rabernat commented 6 years ago

The latest helm chart built by chartpress (pangeo-v0.1.1-0234fc7) is unusable because the docker image was never pushed to docker hub. That's because our deploy.sh script does not include the --push flag.

This PR will not work in its current form because we have to add docker authentication. For an example, look at the zero-to-jupyterhub-k8s deploy script: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/master/ci/deploy.sh

In order to make this work, we need to

We already have some encrypted secrets in this repo in order to publish the chart to github pages. I am out of my depth in terms of all the travis magic required to make this work. Help needed!
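As a rough sketch of what the missing authentication step might look like (not the actual deploy.sh), the function below logs in to Docker Hub and then runs chartpress with `--push`. The function name and the `DOCKER_USERNAME` variable are assumptions for illustration; `DOCKER_PASSWORD` is assumed to arrive as an encrypted CI variable, and `--publish-chart` is assumed to be how the chart already gets published.

```shell
# Hypothetical deploy step: authenticate, then let chartpress push images.
# Assumes DOCKER_USERNAME and DOCKER_PASSWORD are injected by the CI system.
docker_login_and_push() {
    # Abort with a clear message if either credential is missing.
    : "${DOCKER_USERNAME:?DOCKER_USERNAME is not set}"
    : "${DOCKER_PASSWORD:?DOCKER_PASSWORD is not set}"

    # Feed the password on stdin so it never appears in the process list.
    printf '%s' "$DOCKER_PASSWORD" |
        docker login --username "$DOCKER_USERNAME" --password-stdin || return 1

    # --push uploads the freshly built images; --publish-chart publishes the chart.
    chartpress --push --publish-chart
}
```

Calling `docker_login_and_push` from the deploy script would replace a bare chartpress invocation; without credentials it stops before touching docker at all.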

cc @jacobtomlinson, @martindurant

(The reason I am working on this on Saturday night is that I need to fix #39 asap so I can get back to the actual science work I am trying to do on pangeo.pydata.org.)

rabernat commented 6 years ago

For now I have manually built and pushed that notebook image: https://hub.docker.com/r/pangeo/notebook/tags/

rabernat commented 6 years ago

I helm upgraded pangeo.pydata.org with the latest helm chart (pangeo-v0.1.1-85dc5c9). In order to make this work, I had to manually build the notebook docker image and push it to dockerhub with the proper commit tag that chartpress expects.
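For reference, the manual workaround can be sketched roughly as below. The helper name, the default image name, the build context, and the idea that the tag is just the short commit hash are all assumptions for illustration, not a record of the exact commands that were run.

```shell
# Hypothetical helper: build an image and push it under a commit-hash tag.
# IMAGE and CONTEXT_DIR defaults are assumptions, not values from this repo.
build_and_push() {
    tag=$1
    image=${IMAGE:-pangeo/notebook}
    context=${CONTEXT_DIR:-.}
    ref="$image:$tag"

    # DOCKER can be overridden (for example with "echo") to preview the commands.
    ${DOCKER:-docker} build --tag "$ref" "$context" || return 1
    ${DOCKER:-docker} push "$ref"
}
```

Running `DOCKER=echo build_and_push 85dc5c9` prints the two docker commands instead of executing them, which is a cheap way to check the tag before pushing.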

rabernat commented 6 years ago

@jacobtomlinson, I got an error message in the travis build:

$ export DOCKER_PASSWORD=[secure]
The previous command failed, possibly due to a malformed secure environment variable.
Please be sure to escape special characters such as ' ' and '$'.
For more information, see https://docs.travis-ci.com/user/encryption-keys.

Any ideas?

jacobtomlinson commented 6 years ago

That's unexpected. I've updated the password to escape the special characters in it. Let's try again now.
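To illustrate the kind of escaping involved: assuming the decrypted value is pasted into an `export NAME=value` line and parsed by the shell, as the log above suggests, characters such as spaces, `$` and quotes need protecting. The helper below is a hypothetical sketch that wraps a value in single quotes so a shell reads it back unchanged; it is not part of Travis or of this repo.

```shell
# Hypothetical helper: quote a string so that a POSIX shell parses it back
# to exactly the same value (trailing newlines are not preserved).
shell_quote() {
    # Turn every ' into '\'' and wrap the whole value in single quotes.
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Round-trip check: mimic a shell re-parsing an assignment built from the value.
secret='p@ss w$rd'\''x'
eval "RESTORED=$(shell_quote "$secret")"
```

After the `eval`, `RESTORED` holds the same string as `secret`, whereas an unquoted assignment would split on the space and try to expand `$rd`.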

rabernat commented 6 years ago

Yes, it worked! I will merge now and then try this out with a new update to the docker image.

We need to re-fix https://github.com/pangeo-data/pangeo/issues/257, so this will provide a good test case.

jacobtomlinson commented 6 years ago

Awesome! Are you happy that it worked ok?

rabernat commented 6 years ago

The docker tag was successfully pushed after merging #47! https://hub.docker.com/r/pangeo/notebook/tags/

And this tag was placed automatically in the latest chart (2bd2369).

I will now try to deploy that chart.

rabernat commented 6 years ago

I ran

helm repo update
helm upgrade --force --recreate-pods jupyter pangeo/pangeo --version=0.1.1-2bd2369 -f actual-secret-config.yaml  -f jupyter-config.yaml

Now I am getting a "404: Not Found" on my notebook.

It's never easy! 😭

rabernat commented 6 years ago

So it looks like jupyter notebook is working (user/rabernat/tree) but not jupyterlab (user/rabernat/lab). I feel like we have already dealt with a similar issue once before, but I can't find it.

rabernat commented 6 years ago

I rolled back the cluster.

Here is the log from my notebook in the broken deployment (2bd2369)

$ kubectl logs jupyter-rabernat -n pangeo
+ echo 'Copy files from pre-load directory into home'
+ cp --update -r -v /pre-home/. /home/jovyan
Copy files from pre-load directory into home
'/pre-home/./config.yaml' -> '/home/jovyan/./config.yaml'
'/pre-home/./worker-template.yaml' -> '/home/jovyan/./worker-template.yaml'
+ '[' -z '' ']'
+ export EXAMPLES_GIT_URL=https://github.com/pangeo-data/pangeo-example-notebooks
+ EXAMPLES_GIT_URL=https://github.com/pangeo-data/pangeo-example-notebooks
+ '[' '!' -d examples ']'
+ cd examples
+ git remote set-url origin https://github.com/pangeo-data/pangeo-example-notebooks
fatal: not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
+ git fetch origin
fatal: not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
+ git reset --hard origin/master
fatal: not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
+ git merge --strategy-option=theirs origin/master
fatal: not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
+ '[' '!' -f DONT_SAVE_ANYTHING_HERE.md ']'
+ echo 'Files in this directory should be treated as read-only'
+ cd ..
+ mkdir -p work
+ '[' -e /opt/app/environment.yml ']'
+ echo 'no environment.yml'
+ '[' '' ']'
+ '[' '' ']'
+ '[' pangeo-data ']'
no environment.yml
+ echo 'Mounting pangeo-data to /gcs'
+ /opt/conda/bin/gcsfuse pangeo-data /gcs --background
Mounting pangeo-data to /gcs
/usr/bin/prepare.sh: line 44: /opt/conda/bin/gcsfuse: Permission denied
+ start-singleuser.sh '--ip="0.0.0.0"' --port=8888 '--NotebookApp.default_url="/lab"'
/usr/local/bin/start-singleuser.sh: ignoring /usr/local/bin/start-notebook.d/*

Container must be run with group "root" to update passwd file
Executing the command: jupyterhub-singleuser --ip="0.0.0.0" --port=8888 --NotebookApp.default_url="/lab"
[W 2018-07-18 11:04:40.912 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`.  Did you mean `browser`?
[I 2018-07-18 11:04:42.598 SingleUserNotebookApp manager:40] [nb_conda_kernels] enabled, 3 kernels found
[I 2018-07-18 11:04:42.830 SingleUserNotebookApp singleuser:365] Starting jupyterhub-singleuser server version 0.8.1
[I 2018-07-18 11:04:43.463 SingleUserNotebookApp log:122] 302 GET /user/rabernat/ → /user/rabernat/lab? (@10.23.154.8) 0.82ms
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1619] Serving notebooks from local directory: /home/jovyan
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1619] 0 active kernels
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1619] The Jupyter Notebook is running at:
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1619] http://jupyter-rabernat:8888/user/rabernat/
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1620] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2018-07-18 11:04:46.976 SingleUserNotebookApp log:122] 302 GET /user/rabernat/?redirects=1 → /user/rabernat/lab?redirects=1 (@10.128.0.3) 0.87ms
[W 2018-07-18 11:04:47.323 SingleUserNotebookApp log:122] 404 GET /user/rabernat/lab?redirects=1 (@10.128.0.3) 32.67ms
[I 2018-07-18 11:05:00.215 SingleUserNotebookApp log:122] 302 GET /user/rabernat/ → /user/rabernat/lab? (@10.128.0.3) 0.81ms
[W 2018-07-18 11:05:00.371 SingleUserNotebookApp log:122] 404 GET /user/rabernat/lab? (@10.128.0.3) 1.96ms

Here is the same log from the prior working image (85dc5c9)


Copy files from pre-load directory into home
no environment.yml
Mounting pangeo-data to /gcs
+ echo 'Copy files from pre-load directory into home'
+ cp --update -r -v /pre-home/. /home/jovyan
+ '[' -e /opt/app/environment.yml ']'
+ echo 'no environment.yml'
+ '[' '' ']'
+ '[' '' ']'
+ '[' pangeo-data ']'
+ echo 'Mounting pangeo-data to /gcs'
+ /opt/conda/bin/gcsfuse pangeo-data /gcs --background
+ start-singleuser.sh '--ip="0.0.0.0"' --port=8888 '--NotebookApp.default_url="/lab"'
/usr/local/bin/start-singleuser.sh: ignoring /usr/local/bin/start-notebook.d/*

Container must be run with group "root" to update passwd file
Executing the command: jupyterhub-singleuser --ip="0.0.0.0" --port=8888 --NotebookApp.default_url="/lab"
[W 2018-07-18 11:26:06.828 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`.  Did you mean `browser`?
[I 2018-07-18 11:26:08.511 SingleUserNotebookApp manager:40] [nb_conda_kernels] enabled, 3 kernels found
[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:53] JupyterLab beta preview extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:54] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2018-07-18 11:26:10.660 SingleUserNotebookApp handlers:73] [nb_anacondacloud] enabled
[I 2018-07-18 11:26:10.664 SingleUserNotebookApp handlers:292] [nb_conda] enabled
[I 2018-07-18 11:26:10.708 SingleUserNotebookApp __init__:35] ✓ nbpresent HTML export ENABLED
[W 2018-07-18 11:26:10.709 SingleUserNotebookApp __init__:43] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'
[I 2018-07-18 11:26:10.712 SingleUserNotebookApp singleuser:365] Starting jupyterhub-singleuser server version 0.8.1
[I 2018-07-18 11:26:10.717 SingleUserNotebookApp log:122] 302 GET /user/rabernat/ → /user/rabernat/lab? (@10.23.154.11) 1.07ms
[I 2018-07-18 11:26:10.724 SingleUserNotebookApp notebookapp:1619] Serving notebooks from local directory: /home/jovyan
[I 2018-07-18 11:26:10.724 SingleUserNotebookApp notebookapp:1619] 0 active kernels
[I 2018-07-18 11:26:10.724 SingleUserNotebookApp notebookapp:1619] The Jupyter Notebook is running at:
[I 2018-07-18 11:26:10.724 SingleUserNotebookApp notebookapp:1619] http://jupyter-rabernat:8888/user/rabernat/
[I 2018-07-18 11:26:10.725 SingleUserNotebookApp notebookapp:1620] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2018-07-18 11:26:10.883 SingleUserNotebookApp log:122] 302 GET /user/rabernat/?redirects=1 → /user/rabernat/lab?redirects=1 (@10.128.0.3) 0.74ms
[I 2018-07-18 11:26:11.144 SingleUserNotebookApp log:122] 302 GET /user/rabernat/lab?redirects=1 → /hub/api/oauth2/authorize?client_id=user-rabernat&redirect_uri=%2Fuser%2Frabernat%2Foauth_callback&response_type=code&state=eyJ1dWlkIjogIjhlMDJhZDU3Y2I3OTQ1MjFhYWM4ZDhiMGI5YzZmZjBhIiwgIm5leHRfdXJsIjogIi91c2VyL3JhYmVybmF0L2xhYj9yZWRpcmVjdHM9MSJ9 (@10.128.0.6) 2.80ms
[I 2018-07-18 11:26:11.689 SingleUserNotebookApp auth:818] Logged-in user {'kind': 'user', 'name': 'rabernat', 'admin': True, 'groups': [], 'server': '/user/rabernat/', 'pending': None, 'last_activity': '2018-07-18T11:26:08.944813'}
[I 2018-07-18 11:26:11.690 SingleUserNotebookApp log:122] 302 GET /user/rabernat/oauth_callback?code=cab09865-86f2-47d7-8607-a40ba1b46ac0&state=eyJ1dWlkIjogIjhlMDJhZDU3Y2I3OTQ1MjFhYWM4ZDhiMGI5YzZmZjBhIiwgIm5leHRfdXJsIjogIi91c2VyL3JhYmVybmF0L2xhYj9yZWRpcmVjdHM9MSJ9 → /user/rabernat/lab?redirects=1 (@10.128.0.6) 81.17ms
[I 2018-07-18 11:26:12.079 SingleUserNotebookApp log:122] 200 GET /user/rabernat/lab?redirects=1 (rabernat@10.128.0.6) 20.55ms
[I 2018-07-18 11:26:13.827 SingleUserNotebookApp log:122] 200 GET /user/rabernat/api/kernelspecs?1531913173740 (rabernat@10.128.0.7) 2.74ms
[I 2018-07-18 11:26:13.966 SingleUserNotebookApp log:122] 200 GET /user/rabernat/api/terminals?1531913173741 (rabernat@10.23.154.1) 1.07ms

rabernat commented 6 years ago

The key difference is that these lines are present in the working image but missing from the more recent (broken) image:

[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:53] JupyterLab beta preview extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:54] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2018-07-18 11:26:10.660 SingleUserNotebookApp handlers:73] [nb_anacondacloud] enabled
[I 2018-07-18 11:26:10.664 SingleUserNotebookApp handlers:292] [nb_conda] enabled
[I 2018-07-18 11:26:10.708 SingleUserNotebookApp __init__:35] ✓ nbpresent HTML export ENABLED
[W 2018-07-18 11:26:10.709 SingleUserNotebookApp __init__:43] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'

I am having serious deja-vu. We have already fixed this once.
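A comparison like this can be made less manual. The sketch below assumes each pod's log has been saved to a file; the function names are hypothetical. It strips the date and time from each line so the two runs become comparable, then prints the lines that appear only in the working log.

```shell
# Drop the date and time from "[I 2018-07-18 11:26:08.763 App ...]" prefixes
# so that the same message from two different runs compares equal.
normalize_log() {
    sed -E 's/^\[([A-Z]) [0-9-]+ [0-9:.]+ /[\1 /' "$1"
}

# Print normalized lines present in the working log but absent from the broken one.
missing_from_broken() {
    working_log=$1
    broken_log=$2
    tmp=$(mktemp)
    normalize_log "$broken_log" > "$tmp"
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
    normalize_log "$working_log" | grep -Fvx -f "$tmp" || true
    rm -f "$tmp"
}
```

Request lines that differ only in client IP or response time will also show up in the output, so the startup messages near the top are the ones worth reading.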