Closed — klindsay28 closed this issue 4 years ago
@andersy005, can you please help!
What's the bokeh version? It should not be 2.0.0. 2.1.1 is working fine for me right now.
In my environment, bokeh version = 2.2.0.
> In my environment, bokeh version = 2.2.0.
What versions of dask, distributed, and dask-jobqueue are you running?
output from `conda list` includes

```
dask           2.24.0  py_0            conda-forge
dask-jobqueue  0.7.1   py_0            conda-forge
distributed    2.24.0  py37hc8dfbb8_0  conda-forge
```
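Since jupyter may be serving the kernel from a different environment than the one `conda list` was run in, an illustrative stdlib-only cross-check (not from the thread; requires Python >= 3.8 for `importlib.metadata`) is to print the versions the running kernel actually sees:

```python
# Illustrative snippet (not from the thread): report the versions the running
# kernel resolves, which can differ from `conda list` output if jupyter was
# launched from another environment (e.g. base).
from importlib import metadata

for pkg in ("dask", "distributed", "dask-jobqueue", "bokeh"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed in this environment")
```

If any of these disagree with the `conda list` output above, the kernel is importing from a different environment.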
The versions look okay... Try running this code snippet and let me know what you get:
```python
import subprocess

# `client` is the existing dask.distributed Client connected to the cluster
dashboard_port = str(client.scheduler_info()['services']['dashboard'])
p = subprocess.Popen('ss -nlput'.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout = p.communicate()[0].decode()
[each for each in stdout.splitlines() if dashboard_port in each]
```
```
['tcp LISTEN 0 128 *:8787 *:* users:(("python",pid=81685,fd=47))',
 'tcp LISTEN 0 128 [::]:8787 [::]:* users:(("python",pid=81685,fd=51))']
```
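As a cross-check that doesn't depend on `ss`, the dashboard port can also be probed directly from Python. This is a stdlib-only sketch; the `port_is_listening` helper is hypothetical, not part of dask:

```python
import socket

def port_is_listening(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper (not part of dask): a stdlib alternative to
    grepping `ss` output for the dashboard port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0
```

With the scheduler above running, `port_is_listening(8787)` should return True on the same host.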
> What's the bokeh version? It should not be 2.0.0. 2.1.1 is working fine for me right now.
@dcherian, are you using the JupyterHub? I am asking because I am using the latest versions of dask/distributed/dask-jobqueue/bokeh and everything works fine when I am not using the JupyterHub, but the moment I run the notebook on the Hub, I get the same error @klindsay28 is getting.
Ah, no. I don't use the hub.
@klindsay28, my speculation is that the Hub is the culprit here. The issue stems from the fact that there's a version mismatch between your environment (`hires-marbl`) and the environment the Jupyter server is coming from (the Jupyter server is owned by the Hub).
The long-term solution is to have the JupyterHub environment upgraded/updated (cc @jbaksta). The short-term solution is to pin the dask and distributed versions to 2.14 and bokeh to 1.4.
I'm getting the same error outside of jupyterhub
> I'm getting the same error outside of jupyterhub
Is the jupyter lab running from your `base` environment or your `hires-marbl` environment?
I'm a bit confused; I'm not sure whether something is going through jupyterhub.
I'm using a modified version of `jlab-dav` to run jupyterlab in a SLURM job on casper, and am using ssh port forwarding to view jupyterlab through localhost in my browser.
However, after instantiating the cluster, the value of `cluster.dashboard_link` is https://jupyterhub.ucar.edu/dav/user/klindsay/proxy/43104/status. The presence of `jupyterhub` in this makes me think I'm somehow going through jupyterhub, but I don't understand why that would be.
I had not activated any conda environment. I'm going to try again after activating `base`.
> However, after instantiating the cluster, the value of `cluster.dashboard_link` is https://jupyterhub.ucar.edu/dav/user/klindsay/proxy/43104/status. The presence of `jupyterhub` in this makes me think I'm somehow going through jupyterhub. But I don't understand why that would be.
Ooooh... This is an `ncar-jobqueue` issue. Under the hood, ncar-jobqueue tries to determine which machine you are running on and sets the dashboard URL accordingly (assuming you are running on the JupyterHub). When you are not using the JupyterHub, you need to modify the dashboard link:
```python
import dask
import ncar_jobqueue
from dask.distributed import Client

cluster = ncar_jobqueue.NCARCluster()
# override the Hub-style dashboard link with a plain proxy path
dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})
client = Client(cluster)
client
```
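For reference, `distributed` fills in the `distributed.dashboard.link` value with ordinary `str.format` substitution, supplying fields such as `port`; a minimal sketch of how the config value above becomes the link:

```python
# Minimal sketch of how the dashboard link template is expanded:
# distributed substitutes fields like {port} into the configured template
# using ordinary str.format.
template = "/proxy/{port}/status"
link = template.format(port=43104)
print(link)  # -> /proxy/43104/status
```

This is why the Hub-aware template set by ncar-jobqueue produces a `jupyterhub.ucar.edu/...` URL even when you are not on the Hub: the template, not the runtime environment, determines the link.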
> I had not activated any conda environment. I'm going to try again after activating `base`.
I recommend activating the `hires-marbl` environment and launching jupyter lab from there.
I still get the error when I run `jlab-dav` after activating the `hires-marbl` environment (also after activating the `base` environment).
I applied PR #25, including running `environments/postBuild`, and I still get the error. FYI, it updated `dask-distributed` to 2.25.0.
So strange.... Can you point me to the location (on GLADE) of the notebook you are running?
/glade/work/klindsay/analysis/HiRes-CESM-analysis/notebooks/Untitled2.ipynb
> /glade/work/klindsay/analysis/HiRes-CESM-analysis/notebooks/Untitled2.ipynb
Thanks!
I just launched jupyter lab using your environment:
```
$ conda activate /glade/work/klindsay/miniconda3/envs/hires-marbl
(hires-marbl) abanihi at casper26 in ~
$ which python
/glade/work/klindsay/miniconda3/envs/hires-marbl/bin/python
(hires-marbl) abanihi at casper26 in ~
$ jlab-casper
ssh -N -L 8777:casper26:8777 abanihi@casper26.ucar.edu
[I 16:16:45.392 LabApp] [jupyter_nbextensions_configurator] enabled 0.4.1
[I 16:16:45.961 LabApp] JupyterLab extension loaded from /glade/work/klindsay/miniconda3/envs/hires-marbl/lib/python3.7/site-packages/jupyterlab
[I 16:16:45.961 LabApp] JupyterLab application directory is /glade/work/klindsay/miniconda3/envs/hires-marbl/share/jupyter/lab
[I 16:16:45.965 LabApp] Serving notebooks from local directory: /glade/u/home/abanihi
[I 16:16:45.965 LabApp] Jupyter Notebook 6.1.3 is running at:
[I 16:16:45.965 LabApp] http://casper26:8777/
```
and everything seems to be working fine on my end:
> I still get the error when I run `jlab-dav` after activating the `hires-marbl` environment (also after activating the `base` environment).
What's the content of the `jlab-dav` script?
`jlab-dav` is `/glade/u/home/klindsay/bin/jlab-dav`.
Please note that I get the error message after clicking on an element of the dashboard, such as showing workers or graph.
> Please note that I get the error message after clicking on an element of the dashboard, such as showing workers or graph.
I can confirm that the dashboard works when I open the widgets or click on an element:
It appears that jupyter lab is being launched from `base` in the `jlab-dav` script regardless of which environment has been activated:

```
# 4. open browser: http://localhost:8888
conda activate base
```
So, I recommend updating your base environment (it has somewhat outdated packages, which could be the culprit)

```
$ conda activate base
$ conda update --all -c conda-forge
```

and re-running `jlab-dav`.
I have a working dashboard via JHub with

```
dask         2.3.0
distributed  2.3.2
bokeh        1.4.0
```
@andersy005, do you recommend pinning any of these versions? It is critical that we resolve this ASAP to get the codes running again. We could revisit later with more comprehensive testing.
> @andersy005, do you recommend pinning any of these versions? It is critical that we resolve this ASAP to get the codes running again. We could revisit later with more comprehensive testing.
Yeah... Let's pin the versions for the time being. If the user is not using the Hub, they should launch jupyter lab from the `hires-marbl` environment instead of `base`; otherwise the version pinning is likely to break due to version mismatches of some packages in `base` and `hires-marbl`.
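If the pins go in, a small guard cell could fail fast when the kernel's versions drift from them. This is a hypothetical sketch (not from the thread); the `find_pin_mismatches` helper and the exact pin strings are assumptions based on the versions discussed above, and it requires Python >= 3.8 for `importlib.metadata`:

```python
# Hypothetical guard (not from the thread): compare the kernel's installed
# versions against the pins discussed above (dask/distributed 2.14, bokeh 1.4).
from importlib import metadata

PINS = {"dask": "2.14", "distributed": "2.14", "bokeh": "1.4"}

def find_pin_mismatches(pins):
    """Return {package: installed_version_or_None} for packages off their pin."""
    mismatches = {}
    for pkg, wanted in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or not installed.startswith(wanted):
            mismatches[pkg] = installed
    return mismatches

print(find_pin_mismatches(PINS))  # empty dict means all pins are satisfied
```

Running this at the top of a notebook would surface the base-vs-`hires-marbl` mismatch immediately instead of via a broken dashboard.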
I will pin the versions in #25
@matt-long, note that codes do run, I just can't visualize how dask is operating.
So there are 2 environments at play, `base` and `hires-marbl`. Matt previously advised me to launch jupyterlab from a pared-down base environment, and select the more complete environment for the notebook. I don't recall why he advised that, or if that advice still holds.
How critical is it to update the jupyterhub instance at this point? I plan on doing it in the near future, but I'll probably attempt to coordinate with a systems outage.
fixed by #25
I'm adding dask via an ncar-jobqueue cluster in PR #21. I'd like to visualize the worker's activity, so that I can tell if what I'm adding is helpful to computational performance of the diagnostics. However, when I attempt to use the dask dashboard, I get the following error
I've found similar messages in github issues, like https://github.com/jupyterhub/jupyter-server-proxy/issues/179, but this is purported to be fixed.
I'm inferring that the conda environment has pulled together versions of packages that don't play well together. This portion of the software stack is a mystery to me, so I don't see how to proceed towards fixing this.