Closed tlvu closed 2 years ago
Houston, I have notebook failures that are most likely due to the shapely (1.7.1 --> 1.8.0) upgrade:
19:22:46 _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-5Visualization.ipynb::Cell 2 _
19:22:46 Notebook cell execution failed
19:22:46 Cell 2: Cell execution caused an exception
19:22:46
19:22:46 Input:
19:22:46 import geopandas as gpd
19:22:46 import hvplot.pandas
19:22:46 gdf = gpd.GeoDataFrame.from_file('/notebook_dir/pavics-homepage/tutorial_data/gaspesie_mrc.geojson')
19:22:46 gdf = gdf.dissolve(by='MUS_NM_MRC')
19:22:46 gdf['region_name'] = gdf.index
19:22:46
19:22:46 # TODO replace with clisops average.average_shape() once it can do a 'skipna'
19:22:46 # mask of valid (non-nan) data cells
19:22:46 data_mask = ds_ens.tx_mean.isel(rcp=0, realization=0).mean(dim=['year','season']).notnull()
19:22:46 # spatial weights of gridcells interesecting each polygon
19:22:46 weight_masks = subset.create_weight_masks(ds_ens, poly=gdf)
19:22:46 def clean_masks(data_mask, masks):
19:22:46 #remove weight values of gridcells that are nan in the actual data. Rescale so total == 1
19:22:46 return (masks * data_mask) / (masks * data_mask).sum(dim=['lat', 'lon'])
19:22:46
19:22:46 weight_masks = clean_masks(data_mask, weight_masks)
19:22:46
19:22:46 # Calculate weighted average for each region
19:22:46 with xr.set_options(keep_attrs=True):
19:22:46 reg_ts_sims = (ds_ens * weight_masks).sum(dim=['lat','lon'])
19:22:46 reg_ts = xens.ensemble_percentiles(reg_ts_sims)
19:22:46 reg_ts.load()
19:22:46
19:22:46 # get only tx_mean percentile variables for this plot
19:22:46 vars1 = [v for v in reg_ts if 'tx_mean' in v]
19:22:46 # plot a simple map of the sub-regions
19:22:46 display(gdf.hvplot(geo=True, color='region_name',tiles='EsriImagery', legend=False, frame_width=400))
19:22:46 # Interative time-series plot of regional means
19:22:46 reg_ts[vars1].hvplot.line(x='year', title='time series of regional mean conditions')\
19:22:46 .opts(legend_position='top_left', frame_width=500)
(...)
19:22:46 /opt/conda/envs/birdy/lib/python3.7/site-packages/shapely/geometry/base.py in array_interface_base(self)
19:22:46 324 "removed in Shapely 2.0.",
19:22:46 325 ShapelyDeprecationWarning, stacklevel=2)
19:22:46 --> 326 return self._array_interface_base()
19:22:46 327
19:22:46 328 @property
19:22:46
19:22:46 TypeError: 'dict' object is not callable
There are also extra warnings from climex.ipynb that fail Jenkins
(should I find a way to silence those warnings, or can someone fix the notebook code to avoid them?):
19:22:46 _________ pavics-sdi-master/docs/source/notebooks/climex.ipynb::Cell 7 _________
19:22:46 Notebook cell execution failed
19:22:46 Cell 7: Cell outputs differ
19:22:46
19:22:46 Input:
19:22:46 fig = plt.figure(figsize=(8, 4))
19:22:46
19:22:46 ax = plt.subplot(1, 1, 1, projection=rotp)
19:22:46 ax.coastlines()
19:22:46 ax.gridlines()
19:22:46 m = ax.pcolormesh(out.rlon, out.rlat, out.mean(dim="realization").isel(time=0))
19:22:46 plt.colorbar(m, orientation='horizontal', label=sdii.long_name, fraction=0.046, pad=0.04)
19:22:46 ax.set_title("Ensemble mean")
19:22:46
19:22:46 Traceback:
19:22:46 Unexpected output fields from running code: {'stderr'}
/opt/conda/envs/birdy/lib/python3.7/site-packages/cartopy/crs.py:825: ShapelyDeprecationWarning: __len__ for multi-part geometries is deprecated and will be removed in Shapely 2.0. Check the length of the `geoms` property instead to get the number of parts of a multi-part geometry.
if len(multi_line_string) > 1:
/opt/conda/envs/birdy/lib/python3.7/site-packages/cartopy/crs.py:877: ShapelyDeprecationWarning: Iteration over multi-part geometries is deprecated and will be removed in Shapely 2.0. Use the `geoms` property to access the constituent parts of a multi-part geometry.
for line in multi_line_string:
/opt/conda/envs/birdy/lib/python3.7/site-packages/cartopy/crs.py:944: ShapelyDeprecationWarning: __len__ for multi-part geometries is deprecated and will be removed in Shapely 2.0. Check the length of the `geoms` property instead to get the number of parts of a multi-part geometry.
if len(p_mline) > 0:
/opt/conda/envs/birdy/lib/python3.7/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_coastline.zip
warnings.warn(f'Downloading: {url}', DownloadWarning)
New Jupyter env is deployed to https://medus.ouranos.ca/jupyter/ for testing/fixing those notebooks.
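On the question of silencing the warnings: a minimal sketch of one option, assuming it is acceptable to filter on the message text rather than fix the calling code. The regex is taken from the wording of the ShapelyDeprecationWarning messages in the log above; the helper names are hypothetical. It only covers the Shapely messages, so the cartopy DownloadWarning would still reach stderr.

```python
import warnings

# Pattern matched against the start of the warning text (case-insensitive).
# It is derived from the messages in the Jenkins log, not from Shapely docs.
SHAPELY_2_REMOVAL = r"(?s).*will be removed in Shapely 2\.0"


def ignore_shapely_2_removal_warnings():
    """Install a filter that drops warnings announcing a Shapely 2.0 removal."""
    warnings.filterwarnings("ignore", message=SHAPELY_2_REMOVAL)


def count_emitted(messages):
    """Emit each message as a UserWarning and count how many get through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        ignore_shapely_2_removal_warnings()
        for text in messages:
            warnings.warn(text, UserWarning)
    return len(caught)
```

In a notebook the filter would have to be installed in the kernel before the plotting cell runs, for example by calling `ignore_shapely_2_removal_warnings()` in an early cell.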
@tlvu Is the jupyter-conda plugin still working on your side with the image pavics/workflow-tests:211221? I couldn't test it on https://medus.ouranos.ca/jupyter/ since I don't have access, but with a local birdhouse stack, I get an error with the plugin where it fails to retrieve available packages. It does find the list of installed packages but fails to find the packages in the "Not Installed" section found in the extension's tab (Settings -> Conda Packages Manager).
I tested with the preceding image tagged 211123 too, and I did not have a problem there.
@ChaamC this is very odd. I confirmed I reproduced your behavior with this new build but weirdly the version of mamba_gator is still the same 5.1.2 between the previous build and this new build so unless the switch to mamba installer did this, I am not sure why.
@tlvu I am not sure either of the exact cause of this behaviour. I saw, by looking at my browser's developer tools, some info on the response from the request that seems to fail :
command: "/opt/conda/condabin/mamba repoquery search * --json"
conda_info: {GID: 1000, UID: 1000, active_prefix: "/opt/conda/envs/birdy", active_prefix_name: "birdy",…}
error: "RuntimeError('LockFile error. Aborting.')"
exception_name: "RuntimeError"
exception_type: "<class 'RuntimeError'>"
traceback: "Traceback (most recent call last):
File \"/opt/conda/lib/python3.9/site-packages/conda/exceptions.py\", line 1080, in __call__
return func(*args, **kwargs)
File \"/opt/conda/lib/python3.9/site-packages/mamba/mamba.py\", line 917, in exception_converter
raise e
File \"/opt/conda/lib/python3.9/site-packages/mamba/mamba.py\", line 911, in exception_converter
exit_code = _wrapped_main(*args, **kwargs)
File \"/opt/conda/lib/python3.9/site-packages/mamba/mamba.py\", line 869, in _wrapped_main
result = do_call(args, p)
File \"/opt/conda/lib/python3.9/site-packages/mamba/mamba.py\", line 744, in do_call
exit_code = repoquery(args, parser)
File \"/opt/conda/lib/python3.9/site-packages/mamba/mamba.py\", line 686, in repoquery
pool = repoquery_api.create_pool(channels, platform, use_installed)
File \"/opt/conda/lib/python3.9/site-packages/mamba/repoquery.py\", line 47, in create_pool
load_channels(
File \"/opt/conda/lib/python3.9/site-packages/mamba/utils.py\", line 122, in load_channels
index = get_index(
File \"/opt/conda/lib/python3.9/site-packages/mamba/utils.py\", line 103, in get_index
is_downloaded = dlist.download(True)
RuntimeError: LockFile error. Aborting.
"
Not sure exactly how it happens. I wonder if it's related to file permissions. I had some trouble with that when I added the conda extension to this repo. But looking at the PR's code, permissions still seem to be handled properly...
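To check whether the LockFile error also occurs outside the browser, the same command could be run directly inside the container. This is a minimal sketch: the arguments and the default path are copied from the `command` field of the failing response above, while the function name and the return convention are hypothetical.

```python
import shutil
import subprocess

# Same arguments as in the "command" field of the failing response.
# Passing "*" as a list element (no shell) keeps it from being glob-expanded.
REPOQUERY_ARGS = ["repoquery", "search", "*", "--json"]


def run_repoquery(mamba_path="/opt/conda/condabin/mamba"):
    """Run the repoquery and return (returncode, stdout, stderr).

    Returns None when the mamba executable is not found at mamba_path.
    """
    exe = shutil.which(mamba_path)
    if exe is None:
        return None
    proc = subprocess.run(
        [exe, *REPOQUERY_ARGS], capture_output=True, text=True
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Running it as the same user as the Jupyter server (UID 1000 in the response above) would help tell a permission problem apart from a problem in the extension itself.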
Oh wow, how did you manage to get this output? Is this from this PAVICS Jupyter image?
I used a local birdhouse-deploy stack, and started the image pavics/workflow-tests:211221 with jupyterhub. Then when loading the conda extension tab, which lists the different available packages, I was able to check the requests happening in Chrome Developer Tools.
I checked the different responses by looking at the info of each request. The birdy requests are those that get the packages installed in my environment, which gave a successful response with a list of 500+ packages.
The packages request gave the output I posted earlier with the error.
When I tried the image tagged 211123, which has a working conda extension, I could see the 9000+ available packages returned by that request.
@huard @tlogan2000 I downgraded shapely from 1.8.0 back to 1.7.1 and both the climex.ipynb and the homepage notebook 5 Jenkins errors mentioned in comment https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/pull/95#issuecomment-999216646 are gone.
Can you help fix those 2 notebooks? You can use the beta env on PAVICS to fix those notebooks.
FYI @ChaamC I am at the end of the road with this jupyter-conda plugin problem, so I've opened an issue on their side to get more help: https://github.com/mamba-org/gator/issues/170
@tlogan2000
I pinned shapely to the old 1.7.1 version and there is a new failure with the homepage nb 5: http://jenkins.ouranos.ca/job/PAVICS-e2e-workflow-tests/job/prevent-manual-pin-of-dependencies/14/consoleFull
12:36:36 File /opt/conda/envs/birdy/lib/python3.9/site-packages/holoviews/core/data/xarray.py:224, in XArrayInterface.init(cls, eltype, data, kdims, vdims)
12:36:36 220 undeclared = [
12:36:36 221 c for c in da.coords if c not in kdims and len(da[c].shape) == 1 and
12:36:36 222 da[c].shape[0] > 1]
12:36:36 223 if undeclared:
12:36:36 --> 224 raise DataError(
12:36:36 225 'The coordinates on the %r DataArray do not match the '
12:36:36 226 'provided key dimensions (kdims). The following coords '
12:36:36 227 'were left unspecified: %r. If you are requesting a '
12:36:36 228 'lower dimensional view such as a histogram cast '
12:36:36 229 'the xarray to a columnar format using the .to_dataframe '
12:36:36 230 'or .to_dask_dataframe methods before providing it to '
12:36:36 231 'HoloViews.' % (vdim.name, undeclared))
12:36:36 232 return data, {'kdims': kdims, 'vdims': vdims}, {}
12:36:36
12:36:36 DataError: The coordinates on the 'tx_mean' DataArray do not match the provided key dimensions (kdims). The following coords were left unspecified: ['horizon']. If you are requesting a lower dimensional view such as a histogram cast the xarray to a columnar format using the .to_dataframe or .to_dask_dataframe methods before providing it to HoloViews.
The good news is that the bokeh/holoviews performance problem seems to be fixed in the new build.
This new build is deployed as "beta" image in prod.
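For the `DataError` above, one possible workaround is to drop coordinates that are not dimensions before handing the dataset to hvplot. This is only a sketch: it assumes `horizon` is an auxiliary coordinate along `year`, which the log does not confirm, and the helper name and the `demo` dataset are hypothetical stand-ins for `reg_ts`.

```python
import numpy as np
import xarray as xr


def drop_non_dimension_coords(ds):
    """Return a copy of ds without coordinates that are not dimensions."""
    extra = [name for name in ds.coords if name not in ds.sizes]
    return ds.drop_vars(extra)


# Hypothetical stand-in for reg_ts: one variable along "year", plus an
# auxiliary "horizon" coordinate that shares the "year" dimension.
years = np.arange(2000, 2004)
demo = xr.Dataset(
    {"tx_mean_p50": ("year", np.linspace(10.0, 13.0, years.size))},
    coords={"year": years, "horizon": ("year", ["2000s"] * years.size)},
)
cleaned = drop_non_dimension_coords(demo)
```

If `horizon` turns out to be a real dimension of the data, selecting a single value along it before plotting would be needed instead.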
I'll take a look.
@ChaamC FYI the jupyter-conda plugin is fixed by uninstalling mamba from the image !
Overview
Previously, when xclim and ravenpy were pinning their own dependencies, the pins were ignored and we had to manually repeat the same pins again. See comment https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/pull/94#issuecomment-996841873.
This PR allows xclim and ravenpy to manage their own dependencies pinning transparently during this Jupyter env rebuild.
Also fixed a long-standing build performance problem along the way. Build time went from 50 mins to 25 mins and builds on DockerHub work again (fixes https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/issues/51).
Deployed as "beta" image on https://pavics.ouranos.ca/jupyter for testing.
Changes
Switched to using mamba instead of conda since the mamba dependency solver is faster. The mamba solver is faster at the expense of less precision, so we had to pin the latest xclim and ravenpy to avoid a random downgrade in the 2nd build phase.
Both solvers' performance seems to drop exponentially when fewer packages are specified directly, leading to more work for the solver to discover them. Fewer packages are specified directly because we removed all direct dependencies of xclim and ravenpy from the environment.yml file.
Switched to using a 2-stage conda env build as another performance work-around. One single conda env create -f /environment.yml was taking many days! mamba was not much better in one single stage build.
With the 2-stage build, a build using the conda solver takes 5 hours while the mamba solver takes 25 minutes!
Reduced the number of "build layers" by merging several of them, for another small build performance gain.
The jupyterlab-topbar-text and jupyterlab-theme-toggle jupyterlab extensions were removed due to a javascript build problem. The topbar text was pretty useless. Hopefully the theme toggle is not so widely used.
Had to hardcode the commit of the https://github.com/jupyter/docker-stacks repo where we get the startup script from because the latest versions of those scripts are breaking us. This will have to be solved later.
Removed the vcs library from the cdat channel in order to move to python 3.9. Otherwise we are stuck on 3.7 and xarray will drop 3.7 soon. I've opened an issue on the CDAT side: https://github.com/CDAT/vcs/issues/457.
The vcs library was needed to run the ESGF notebooks at https://github.com/ESGF/esgf-compute-api/tree/devel/examples
Related Issue / Discussion
Related issues https://github.com/jupyterlab/jupyterlab/issues/11726
Notebook fix needed https://github.com/Ouranosinc/PAVICS-landing/pull/42
Matching PR to deploy this new Jupyter env to PAVICS https://github.com/bird-house/birdhouse-deploy/pull/234
Additional Information
Screenshot of UI change showing the jupyterlab-topbar-text and jupyterlab-theme-toggle jupyterlab extensions removed:
Relevant changes:
< - ravenpy=0.7.5=pyhff6ddc9_0
< - python=3.7.12=hb7a2778_100_cpython
removed
< - vcs=8.2.1=pyh9f0ad1d_0
< - numpy=1.21.4=py37h31617e3_0
< - xarray=0.20.1=pyhd8ed1ab_0
< - rioxarray=0.8.0=pyhd8ed1ab_0
< - cf_xarray=0.6.1=pyh6c4a22f_0
< - gdal=3.3.2=py37hd5a0ba4_2
< - rasterio=1.2.6=py37hc20819c_2
< - climpred=2.1.6=pyhd8ed1ab_1
< - clisops=0.7.0=pyh6c4a22f_0
< - xesmf=0.6.0=pyhd8ed1ab_0
< - birdy=v0.8.0=pyh6c4a22f_1
< - cartopy=0.20.0=py37hbe109c4_0
< - dask=2021.11.2=pyhd8ed1ab_0
< - numba=0.53.1=py37hb11d6e1_1
< - pandas=1.3.4=py37he8f5f7f_1