pangeo-data / helm-chart

Pangeo helm charts
https://pangeo-data.github.io/helm-chart/

jupyterlab extension doesn't load in latest docker image #48

Closed rabernat closed 6 years ago

rabernat commented 6 years ago

In our latest docker image, jupyterlab doesn't work. We discovered this after merging #47 and deploying the latest helm chart.

The key difference is that these lines are present in the working image (85dc5c9) but missing from the more recent image (2bd2369):

[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:53] JupyterLab beta preview extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:54] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2018-07-18 11:26:10.660 SingleUserNotebookApp handlers:73] [nb_anacondacloud] enabled
[I 2018-07-18 11:26:10.664 SingleUserNotebookApp handlers:292] [nb_conda] enabled
[I 2018-07-18 11:26:10.708 SingleUserNotebookApp __init__:35] ✓ nbpresent HTML export ENABLED
[W 2018-07-18 11:26:10.709 SingleUserNotebookApp __init__:43] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'

I am having serious deja-vu. We dealt with a very similar issue in https://github.com/pangeo-data/pangeo/pull/261#issuecomment-390083609. There are a couple of different issues mixed together there, but ultimately we fixed it by pinning jupyterlab_launcher=0.10.5

What is frustrating here is that we didn't change anything jupyter-related in the latest docker image, yet it has broken again in the same way.

cc @yuvipanda, @mrocklin, @jhamman

mrocklin commented 6 years ago

Did we have jupyter-related packages pinned in our Dockerfile? Perhaps something changed upstream?

jgerardsimcock commented 6 years ago

@rabernat on line 76. The jupyterlab/hub-extension is commented-out. I don't know if this matters.

Also, is there a staging cluster where you can launch a deploy, run some basic test and then deploy to pangeo.pydata.org?

rabernat commented 6 years ago

> The jupyterlab/hub-extension is commented-out. I don't know if this matters.

@jgerardsimcock - good catch...it sure seems like this should be relevant. However, as I understand it, this was commented out already in previous images and it seemed to work.

> Also, is there a staging cluster where you can launch a deploy, run some basic test and then deploy to pangeo.pydata.org?

Yes there is...I should have deployed to the test cluster, but I was just being lazy.

rabernat commented 6 years ago

> Did we have jupyter-related packages pinned in our Dockerfile?

Yes, for the most part.

> Perhaps something changed upstream?

I'm going to first try @jgerardsimcock's suggestion.

rabernat commented 6 years ago

In https://github.com/pangeo-data/helm-chart/pull/49 I tried uncommenting line 76:

RUN jupyter labextension install @jupyter-widgets/jupyterlab-manager \
                                   @jupyterlab/hub-extension

I deployed this to a test cluster...no luck 😞 (@mrocklin - git blame tells me you were the one who originally commented this line out. Do you have any recollection of why?)

I guess the next step would be to investigate an upstream change.

I really hate this part of my job.

mrocklin commented 6 years ago

@rabernat can you verify that you have nodejs installed in the image?

rabernat commented 6 years ago

I will try to verify that. The changes from the earlier, working notebook images have been minimal. Is there a specific hunch you have in mind here?

In the meantime, you can see the whole docker build log here: https://travis-ci.org/pangeo-data/helm-chart/builds/405574851

mrocklin commented 6 years ago

I noticed that extensions stopped building in one of my environments recently. It was solved by installing nodejs, which I guess had been dropped somehow from an upstream dependency. Just a hunch though.

rabernat commented 6 years ago

The build script has this line:

> node /opt/conda/lib/python3.6/site-packages/jupyterlab/staging/yarn.js run build:prod

Does that mean node is installed?

There are no errors in the build log that suggest the extensions are not installing properly.

rabernat commented 6 years ago

There are these warnings...maybe relevant?

warning "@jupyterlab/json-extension > react-json-tree@0.10.9" has incorrect peer dependency "react@^15.0.0".
warning "@jupyterlab/json-extension > react-json-tree@0.10.9" has incorrect peer dependency "react-dom@^15.0.0".
warning "@jupyterlab/vdom-extension > @nteract/transform-vdom@1.1.1" has incorrect peer dependency "react@^15.6.1".

mrocklin commented 6 years ago

I don't know. I recommend asking upstream.

@jasongrout sorry to bother you with this. Long term, how should the Pangeo community triage situations like this? Is there someone we should ping within JLab here, should we raise a separate JLab issue, or should we raise a Stack Overflow question or something similar?

rabernat commented 6 years ago

I will provide more details in case some jupyter people stop by to help us out.

Here is the log from my notebook pod in a recent broken image (2bd2369)

$ kubectl logs jupyter-rabernat -n pangeo
+ echo 'Copy files from pre-load directory into home'
+ cp --update -r -v /pre-home/. /home/jovyan
Copy files from pre-load directory into home
'/pre-home/./config.yaml' -> '/home/jovyan/./config.yaml'
'/pre-home/./worker-template.yaml' -> '/home/jovyan/./worker-template.yaml'
+ '[' -z '' ']'
+ export EXAMPLES_GIT_URL=https://github.com/pangeo-data/pangeo-example-notebooks
+ EXAMPLES_GIT_URL=https://github.com/pangeo-data/pangeo-example-notebooks
+ '[' '!' -d examples ']'
+ cd examples
+ git remote set-url origin https://github.com/pangeo-data/pangeo-example-notebooks
fatal: not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
+ git fetch origin
fatal: not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
+ git reset --hard origin/master
fatal: not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
+ git merge --strategy-option=theirs origin/master
fatal: not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
+ '[' '!' -f DONT_SAVE_ANYTHING_HERE.md ']'
+ echo 'Files in this directory should be treated as read-only'
+ cd ..
+ mkdir -p work
+ '[' -e /opt/app/environment.yml ']'
+ echo 'no environment.yml'
+ '[' '' ']'
+ '[' '' ']'
+ '[' pangeo-data ']'
no environment.yml
+ echo 'Mounting pangeo-data to /gcs'
+ /opt/conda/bin/gcsfuse pangeo-data /gcs --background
Mounting pangeo-data to /gcs
/usr/bin/prepare.sh: line 44: /opt/conda/bin/gcsfuse: Permission denied
+ start-singleuser.sh '--ip="0.0.0.0"' --port=8888 '--NotebookApp.default_url="/lab"'
/usr/local/bin/start-singleuser.sh: ignoring /usr/local/bin/start-notebook.d/*

Container must be run with group "root" to update passwd file
Executing the command: jupyterhub-singleuser --ip="0.0.0.0" --port=8888 --NotebookApp.default_url="/lab"
[W 2018-07-18 11:04:40.912 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`.  Did you mean `browser`?
[I 2018-07-18 11:04:42.598 SingleUserNotebookApp manager:40] [nb_conda_kernels] enabled, 3 kernels found
[I 2018-07-18 11:04:42.830 SingleUserNotebookApp singleuser:365] Starting jupyterhub-singleuser server version 0.8.1
[I 2018-07-18 11:04:43.463 SingleUserNotebookApp log:122] 302 GET /user/rabernat/ → /user/rabernat/lab? (@10.23.154.8) 0.82ms
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1619] Serving notebooks from local directory: /home/jovyan
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1619] 0 active kernels
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1619] The Jupyter Notebook is running at:
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1619] http://jupyter-rabernat:8888/user/rabernat/
[I 2018-07-18 11:04:43.465 SingleUserNotebookApp notebookapp:1620] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2018-07-18 11:04:46.976 SingleUserNotebookApp log:122] 302 GET /user/rabernat/?redirects=1 → /user/rabernat/lab?redirects=1 (@10.128.0.3) 0.87ms
[W 2018-07-18 11:04:47.323 SingleUserNotebookApp log:122] 404 GET /user/rabernat/lab?redirects=1 (@10.128.0.3) 32.67ms
[I 2018-07-18 11:05:00.215 SingleUserNotebookApp log:122] 302 GET /user/rabernat/ → /user/rabernat/lab? (@10.128.0.3) 0.81ms
[W 2018-07-18 11:05:00.371 SingleUserNotebookApp log:122] 404 GET /user/rabernat/lab? (@10.128.0.3) 1.96ms

Here is the same log from the prior working image (85dc5c9)

Copy files from pre-load directory into home
no environment.yml
Mounting pangeo-data to /gcs
+ echo 'Copy files from pre-load directory into home'
+ cp --update -r -v /pre-home/. /home/jovyan
+ '[' -e /opt/app/environment.yml ']'
+ echo 'no environment.yml'
+ '[' '' ']'
+ '[' '' ']'
+ '[' pangeo-data ']'
+ echo 'Mounting pangeo-data to /gcs'
+ /opt/conda/bin/gcsfuse pangeo-data /gcs --background
+ start-singleuser.sh '--ip="0.0.0.0"' --port=8888 '--NotebookApp.default_url="/lab"'
/usr/local/bin/start-singleuser.sh: ignoring /usr/local/bin/start-notebook.d/*

Container must be run with group "root" to update passwd file
Executing the command: jupyterhub-singleuser --ip="0.0.0.0" --port=8888 --NotebookApp.default_url="/lab"
[W 2018-07-18 11:26:06.828 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`.  Did you mean `browser`?
[I 2018-07-18 11:26:08.511 SingleUserNotebookApp manager:40] [nb_conda_kernels] enabled, 3 kernels found
[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:53] JupyterLab beta preview extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:54] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2018-07-18 11:26:10.660 SingleUserNotebookApp handlers:73] [nb_anacondacloud] enabled
[I 2018-07-18 11:26:10.664 SingleUserNotebookApp handlers:292] [nb_conda] enabled
[I 2018-07-18 11:26:10.708 SingleUserNotebookApp __init__:35] ✓ nbpresent HTML export ENABLED
[W 2018-07-18 11:26:10.709 SingleUserNotebookApp __init__:43] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'
[I 2018-07-18 11:26:10.712 SingleUserNotebookApp singleuser:365] Starting jupyterhub-singleuser server version 0.8.1
[I 2018-07-18 11:26:10.717 SingleUserNotebookApp log:122] 302 GET /user/rabernat/ → /user/rabernat/lab? (@10.23.154.11) 1.07ms
[I 2018-07-18 11:26:10.724 SingleUserNotebookApp notebookapp:1619] Serving notebooks from local directory: /home/jovyan
[I 2018-07-18 11:26:10.724 SingleUserNotebookApp notebookapp:1619] 0 active kernels
[I 2018-07-18 11:26:10.724 SingleUserNotebookApp notebookapp:1619] The Jupyter Notebook is running at:
[I 2018-07-18 11:26:10.724 SingleUserNotebookApp notebookapp:1619] http://jupyter-rabernat:8888/user/rabernat/
[I 2018-07-18 11:26:10.725 SingleUserNotebookApp notebookapp:1620] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2018-07-18 11:26:10.883 SingleUserNotebookApp log:122] 302 GET /user/rabernat/?redirects=1 → /user/rabernat/lab?redirects=1 (@10.128.0.3) 0.74ms
[I 2018-07-18 11:26:11.144 SingleUserNotebookApp log:122] 302 GET /user/rabernat/lab?redirects=1 → /hub/api/oauth2/authorize?client_id=user-rabernat&redirect_uri=%2Fuser%2Frabernat%2Foauth_callback&response_type=code&state=eyJ1dWlkIjogIjhlMDJhZDU3Y2I3OTQ1MjFhYWM4ZDhiMGI5YzZmZjBhIiwgIm5leHRfdXJsIjogIi91c2VyL3JhYmVybmF0L2xhYj9yZWRpcmVjdHM9MSJ9 (@10.128.0.6) 2.80ms
[I 2018-07-18 11:26:11.689 SingleUserNotebookApp auth:818] Logged-in user {'kind': 'user', 'name': 'rabernat', 'admin': True, 'groups': [], 'server': '/user/rabernat/', 'pending': None, 'last_activity': '2018-07-18T11:26:08.944813'}
[I 2018-07-18 11:26:11.690 SingleUserNotebookApp log:122] 302 GET /user/rabernat/oauth_callback?code=cab09865-86f2-47d7-8607-a40ba1b46ac0&state=eyJ1dWlkIjogIjhlMDJhZDU3Y2I3OTQ1MjFhYWM4ZDhiMGI5YzZmZjBhIiwgIm5leHRfdXJsIjogIi91c2VyL3JhYmVybmF0L2xhYj9yZWRpcmVjdHM9MSJ9 → /user/rabernat/lab?redirects=1 (@10.128.0.6) 81.17ms
[I 2018-07-18 11:26:12.079 SingleUserNotebookApp log:122] 200 GET /user/rabernat/lab?redirects=1 (rabernat@10.128.0.6) 20.55ms
[I 2018-07-18 11:26:13.827 SingleUserNotebookApp log:122] 200 GET /user/rabernat/api/kernelspecs?1531913173740 (rabernat@10.128.0.7) 2.74ms
[I 2018-07-18 11:26:13.966 SingleUserNotebookApp log:122] 200 GET /user/rabernat/api/terminals?1531913173741 (rabernat@10.23.154.1) 1.07ms

I am curious to understand what is happening under the hood which makes this line

[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:53] JupyterLab beta preview extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab

appear in one but not the other. How does the startup script discover which extensions are installed?
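
The comparison of the two startup logs can also be done mechanically. What follows is a minimal Python sketch, not something that was run in this thread: it strips the timestamp from each log line and reports messages that occur in the working log but not in the broken one. The names `normalize` and `missing_lines` are hypothetical, and the regular expression assumes the bracketed prefix format seen in the logs above. Request lines that contain IP addresses or timings will also show up, since those differ between any two runs.

```python
import re

# Bracketed prefix of a notebook server log line, for example
# "[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:53] "
LOG_PREFIX = re.compile(
    r"^\[([A-Z]) \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ (\S+) (\S+)\] "
)


def normalize(line):
    """Drop the timestamp so identical messages compare equal across runs."""
    match = LOG_PREFIX.match(line)
    if match is None:
        # Not a server log line (shell trace, blank line, ...): keep as-is.
        return line.strip()
    level, app, location = match.groups()
    return f"[{level} {app} {location}] {line[match.end():].strip()}"


def missing_lines(working_log, broken_log):
    """Return normalized lines found in working_log but not in broken_log."""
    seen = {normalize(line) for line in broken_log.splitlines()}
    result = []
    for line in working_log.splitlines():
        key = normalize(line)
        if key and key not in seen and key not in result:
            result.append(key)
    return result


# Two-line excerpts taken from the logs in this thread.
working = (
    "[I 2018-07-18 11:26:08.511 SingleUserNotebookApp manager:40] "
    "[nb_conda_kernels] enabled, 3 kernels found\n"
    "[I 2018-07-18 11:26:08.763 SingleUserNotebookApp extension:53] "
    "JupyterLab beta preview extension loaded from "
    "/opt/conda/lib/python3.6/site-packages/jupyterlab"
)
broken = (
    "[I 2018-07-18 11:04:42.598 SingleUserNotebookApp manager:40] "
    "[nb_conda_kernels] enabled, 3 kernels found"
)

for message in missing_lines(working, broken):
    print(message)
```

In practice the two inputs would be the saved output of `kubectl logs` for a pod on each image.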

rabernat commented 6 years ago

Just some background about why I care about this.

At some point in the past month, cftime got dropped from pangeo.pydata.org (see https://github.com/pangeo-data/pangeo/issues/257). I need this for my science. I made a one-line change to the notebook dockerfile to get it back (#47). Now, due to undetermined causes, jupyterlab is not working any more.

I wish I could just be patient and let this sort itself out, but I had plans for work I wanted to do today.

jasongrout commented 6 years ago

From the missing lines, it seems like the jupyterlab server extension isn't enabled. Is there a way to see what jupyter serverextension list gives? Is JupyterLab installed and enabled?

You're also welcome to open an issue in the JLab repo to get more eyes than mine, or ask on the jupyterlab gitter.

jacobtomlinson commented 6 years ago

> I need this for my science.

This definitely highlights what I think is one of the biggest issues in science today. You want to do some science, but instead you end up doing computer science/software engineering. I think it would be useful to record these situations somewhere.

rabernat commented 6 years ago

> Is there a way to see what jupyter serverextension list gives?

I ran this from inside the docker image:

$ jupyter serverextension list
config dir: /opt/conda/etc/jupyter
    jupyterlab disabled
    - Validating...
      jupyterlab 0.32.1 OK
    nbserverproxy  enabled 
    - Validating...
      nbserverproxy  OK

Looks like jupyterlab is disabled. But why? How do we enable it?

Anyone else can examine this image by running

docker pull pangeo/notebook:2dd9b30
docker run -it pangeo/notebook:2dd9b30 /bin/bash

jasongrout commented 6 years ago

> How do we enable it?

jupyter serverextension enable jupyterlab

Not sure how it got disabled, though. Perhaps somehow there's a config file being propagated (in that /opt/conda/etc/jupyter directory)?
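
One way to look for such a config file is to read the JSON configs directly. The following is a rough Python sketch under stated assumptions: it only reads `jupyter_notebook_config.json` (not `.py` configs or `jupyter_notebook_config.d` fragments), it assumes the classic notebook server layout where extension states live under `NotebookApp.nbserver_extensions`, and the helper names `server_extension_states` and `disabled_in` are made up for illustration.

```python
import json
from pathlib import Path


def server_extension_states(config_dirs, filename="jupyter_notebook_config.json"):
    """Map each config file found to its NotebookApp.nbserver_extensions dict."""
    states = {}
    for directory in config_dirs:
        path = Path(directory) / filename
        if not path.is_file():
            # Nothing configured in this directory.
            continue
        with path.open() as handle:
            config = json.load(handle)
        extensions = config.get("NotebookApp", {}).get("nbserver_extensions", {})
        states[str(path)] = dict(extensions)
    return states


def disabled_in(states, name):
    """List the config files that explicitly set extension `name` to false."""
    return [path for path, exts in states.items() if exts.get(name) is False]
```

Calling `server_extension_states` with the directories printed by `jupyter --paths` (for example the user's `.jupyter` directory and `/opt/conda/etc/jupyter`) and then `disabled_in(states, "jupyterlab")` would show which file, if any, turns the extension off.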

rabernat commented 6 years ago

Ok I think we are getting somewhere:

jovyan@e74be473e3fd:~$ jupyter serverextension enable jupyterlab
Enabling: jupyterlab
- Writing config: /home/jovyan/.jupyter
    - Validating...
      jupyterlab 0.32.1 OK
jovyan@e74be473e3fd:~$ jupyter serverextension list
config dir: /home/jovyan/.jupyter
    jupyterlab  enabled 
    - Validating...
      jupyterlab 0.32.1 OK
config dir: /opt/conda/etc/jupyter
    jupyterlab disabled
    - Validating...
      jupyterlab 0.32.1 OK
    nbserverproxy  enabled 
    - Validating...
      nbserverproxy  OK

There appear to be multiple config directories involved. And jupyterlab is explicitly disabled in one of them.

$ cat /opt/conda/etc/jupyter/jupyter_notebook_config.json 
{
  "NotebookApp": {
    "nbserver_extensions": {
      "jupyterlab": false,
      "nbserverproxy": true
    },
    "kernel_spec_manager_class": "nb_conda_kernels.CondaKernelSpecManager"
  }
}

rabernat commented 6 years ago

This is currently what we have in the Dockerfile to enable extensions:

https://github.com/pangeo-data/helm-chart/blob/7809325b330449782a3988471ea0591840c108ac/docker-images/notebook/Dockerfile#L75-L79

Should we add the following?

jupyter serverextension enable jupyterlab

jasongrout commented 6 years ago

I think we should find out why the serverextension is getting disabled. That shouldn't be happening.

When JupyterLab is uninstalled via conda-forge, it automatically disables itself in sys-prefix: https://github.com/conda-forge/jupyterlab-feedstock/blob/master/recipe/pre-unlink.sh

Is there somewhere jupyterlab is being uninstalled?

rabernat commented 6 years ago

This was fixed by #50, which is now live on pangeo.pydata.org.

> I think we should find out why the serverextension is getting disabled. That shouldn't be happening.

I would be happy to try to sort out why it was getting disabled. Perhaps someone knowledgeable could examine the Dockerfile and build log to try to figure this out.

But in the meantime, I am happy with my fix of explicitly enabling the extension.

rabernat commented 6 years ago

p.s. @jasongrout thanks so much for your help, which was crucial for resolving the issue!

jasongrout commented 6 years ago

Note that we've outlined a plan for not disabling jlab on uninstall at https://github.com/conda-forge/jupyterlab-feedstock/issues/111, so perhaps this problem might go away in the future.

jasongrout commented 6 years ago

I notice that you are installing from both the conda defaults and conda-forge channels. I think both channels have a jupyterlab package. Perhaps there are some conflicts between the two?

I usually only install from the conda-forge channel these days to avoid any issues with mismatched C libraries, etc.

jasongrout commented 6 years ago

(I guess this issue can be closed too...)

jasongrout commented 6 years ago

Your build log does show that jupyterlab is getting removed. Here's my theory about what's happening:

  1. The conda-forge jupyterlab 0.32.1 package is getting installed
  2. In your install step, it's now finding the jupyterlab 0.32.1 package on the defaults channel (this is what changed, the defaults channel now has a 0.32.1 version)
  3. It prefers the defaults channel version for jupyterlab 0.32.1, so it uninstalls the conda-forge package, which explicitly disables the server extension in a pre-unlink script like I mentioned above.
  4. The defaults channel jupyterlab does not enable the server extension. One of the restrictions in the anaconda defaults channel is no link/unlink scripts. This means that the disabling in the previous step still is in effect.

With the transition plan outlined in https://github.com/conda-forge/jupyterlab-feedstock/issues/111, the next conda-forge version of jupyterlab should not disable on uninstall, so if I'm right, the issue here goes away.

jasongrout commented 6 years ago

(and in general, to avoid surprises like this, I would pin not just the version number of packages, but also the channel they came from)
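
A build could also check after the fact which channel each package came from. Here is a minimal Python sketch, assuming the records come from `conda list --json` (a list of objects with `name`, `version` and `channel` keys). The function name `packages_off_channel` is hypothetical, and the exact channel strings (for example `defaults` versus `pkgs/main`) depend on the conda version, so treat the values in the sample as assumptions.

```python
import json


def packages_off_channel(records, expected_channel, names=None):
    """Return (name, version, channel) for packages not from expected_channel.

    `records` is the parsed output of `conda list --json`. If `names` is
    given, only those packages are checked.
    """
    flagged = []
    for record in records:
        if names is not None and record["name"] not in names:
            continue
        channel = record.get("channel", "")
        if channel != expected_channel:
            flagged.append((record["name"], record["version"], channel))
    return flagged


# Hand-written sample in the assumed shape; not real output from the image.
sample = json.loads(
    """
    [
      {"name": "jupyterlab", "version": "0.32.1", "channel": "defaults"},
      {"name": "nbserverproxy", "version": "0.8.3", "channel": "conda-forge"}
    ]
    """
)

for name, version, channel in packages_off_channel(sample, "conda-forge"):
    print(f"{name} {version} came from {channel}")
```

For the pin itself, conda accepts a channel-qualified spec of the form `conda-forge::jupyterlab=0.32.1`.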

mrocklin commented 6 years ago

@jasongrout for context we've chosen to prefer the defaults channel because packages there tend to be a bit slimmer (which becomes important when your docker images get to be a few gigabytes). I've opened a semi-related issue here: https://github.com/jupyterlab/jupyterlab/issues/4930

(this isn't actually related much to the issue at hand here, but was just a good opportunity to bring this up)

jasongrout commented 6 years ago

Good point. I still suggest pinning the channel for packages where possible to avoid surprises of one channel suddenly taking precedence over another channel, though this really only works if you pin every single dependency as well.

jasongrout commented 6 years ago

I've also observed that sometimes the defaults channel packages are pretty heavy. For example, I think the matplotlib package requires qt, or at least used to.

mrocklin commented 6 years ago

Yeah, I'm sure that there are anecdotal differences both ways. The default compilers or compiler flags generally produced smaller environments overall though. In particular, the software stack used in these images was something like 30%-50% bigger (if memory serves). There has been a lot of work to unify things since I last checked though, so this probably deserves to be reassessed.

tjcrone commented 6 years ago

One solution to this would be to run conda env export on the base/root env in the docker image, and then maintain this text file as the "pangeo" conda environment in git, and during image builds do a conda env create from the file. The only tricky thing would be that docker would likely need to wget/curl the file from GitHub as I don't know how easy it would be for the container to pull from the host. There is probably a way. Anyway, the env yaml file describes the environment fully, including pip installs, channels, versions, etc., and it would be unlikely that this sort of confusion would arise. I do not think Jupyterlab extensions would be included, but these can be pinned in the Dockerfile if need be.

Update: The COPY command can be used in the Dockerfile to transfer the env file into the container.
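
If the exported file is later edited by hand, a small check that every entry is still fully pinned might be useful. Below is a sketch in Python, assuming the `name=version=build` form that `conda env export` writes for conda packages. The function name `unpinned_dependencies` is hypothetical, and nested entries such as the `pip:` sub-list are skipped rather than checked.

```python
def unpinned_dependencies(dependencies):
    """Return conda dependency specs that lack an exact version and build.

    `dependencies` is the list found under the `dependencies:` key of an
    exported environment file, already parsed into Python objects.
    """
    loose = []
    for spec in dependencies:
        if not isinstance(spec, str):
            # Nested entries such as {"pip": [...]} are skipped in this sketch.
            continue
        # Ignore an optional "channel::" prefix before splitting on "=".
        parts = spec.split("::")[-1].split("=")
        if len(parts) != 3 or not all(parts):
            loose.append(spec)
    return loose


# Illustrative entries only; the build strings are invented.
example = [
    "jupyterlab=0.32.1=py36_0",
    "conda-forge::cftime=1.0.0=py36_0",
    "xarray",
    "dask=0.18",
    {"pip": ["some-package==1.0"]},
]

for spec in unpinned_dependencies(example):
    print("not fully pinned:", spec)
```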

rabernat commented 6 years ago

Thanks for this valuable discussion on how to streamline and optimize the environments in our docker containers.

This task, like many others, is just waiting for an eager volunteer (see #38).

rsignell-usgs commented 6 years ago

I think it's worth pointing out that we have had none of these problems over on pangeo.esipfed.org.

We rely on conda-forge, and make sure we update miniconda to use the latest conda-forge packages before installing the packages customized for our pangeo users:

https://github.com/rsignell-usgs/helm-chart/blob/conda-forge/docker-images/notebook/Dockerfile