Open richlysakowski opened 1 year ago
3-5-exercise-normalize-data-predict-missing-values-WITH-XARRAY-UPDATE-HACK.zip
Here is a conda environment HACK (in the notebook) to fix the environment and get the notebook running.
The sandbox build script for the Microsoft environment needs to be updated to reflect recent updates in plotly and xarray.
Hey @richlysakowski, thanks a lot for taking the time to figure this out. For those that might be here looking for a quick and concise solution, I would just add that based on your solution above, all I had to do to get around this issue is add the line
!pip install -U xarray!='2022.6.*'
at the top of the first cell of any notebook that contains the error related to the graphing.py import.
That is not working anymore unfortunately.
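Whether the pip line above helps depends on which package versions the sandbox actually ends up with. As a rough sketch for checking this from a notebook cell, using only the standard library: the helper names `installed_version` and `in_excluded_series` are made up for this illustration, and the year/month comparison is a simplification of what the `!=2022.6.*` specifier matches, not a full version parser.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def in_excluded_series(version: str, year: int = 2022, month: int = 6) -> bool:
    """Rough check for whether `version` falls under the 2022.6.* pattern."""
    parts = version.split(".")
    try:
        return int(parts[0]) == year and int(parts[1]) == month
    except (IndexError, ValueError):
        # Unexpected version format: treat it as not matching.
        return False


# Packages that appear in the traceback for the graphing.py import.
for name in ("plotly", "xarray", "dask"):
    version = installed_version(name)
    if version is None:
        print(f"{name}: not installed")
    elif name == "xarray" and in_excluded_series(version):
        print(f"{name}: {version} (in the excluded 2022.6.* series)")
    else:
        print(f"{name}: {version}")
```

Running this before and after the pip line shows whether the install changed anything in the environment the notebook kernel is using.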
Eighteen months later, this is still an issue.
One of the training modules where the problem exists:
https://learn.microsoft.com/en-us/training/modules/introduction-to-data-for-machine-learning/3-exercise-detect-visualize-missing-data
Please fix the sandbox creation script, test the training course notebooks and post updated code.
The installation script for the Microsoft VM sandbox needs to be updated to reflect updates in plotly and xarray. BELOW is a HACK to fix the notebook and get it running.
Microsoft's custom module 'graphing.py' extracts code snippets from Plotly, Dash, and XArray. These packages have since been updated and throw errors. It took a bit of work to find a fix and test it in the environment.
I had this problem with another training module in the same course and lost 30+ minutes fixing it the first time. I don't have time to debug and fix it again right now, but I will post updated notebooks with VM changes that show how to fix the notebook and the environment.
Here is the cell that throws the error:
[7]
import graphing

# 'graphing' is custom code we use to make graphs quickly.
# If you would like to read it in detail, it can be found
# in our GitHub repository
graphing.histogram(dataset, 'Pclass', title='Ticket Class (All Passengers)', show=True)
graphing.histogram(unknown_age_and_cabin, 'Pclass', title='Ticket Class (Passengers Missing Cabin and Age Information)')

AttributeError: module 'dask.array' has no attribute 'lib'

AttributeError Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 import graphing
3 # 'graphing' is custom code we use to make graphs quickly.
4 # If you would like to read it in detail, it can be found
5 # in our GitHub repository
6 graphing.histogram(dataset, 'Pclass', title='Ticket Class (All Passengers)', show=True)
File /learn/graphing.py:9, in
7 from numpy.core.fromnumeric import repeat, shape
8 import pandas
----> 9 import plotly.express as px
10 import plotly.io as pio
11 import plotly.graph_objects as graph_objects
File /anaconda/envs/py38_default/lib/python3.8/site-packages/plotly/express/__init__.py:15, in
9 if pd is None:
10 raise ImportError(
11 """\
12 Plotly express requires pandas to be installed."""
13 )
---> 15 from ._imshow import imshow
16 from ._chart_types import ( # noqa: F401
17 scatter,
18 scatter_3d,
(...)
51 density_mapbox,
52 )
55 from ._core import ( # noqa: F401
56 set_mapbox_access_token,
57 defaults,
58 get_trendline_results,
59 NO_COLOR,
60 )
File /anaconda/envs/py38_default/lib/python3.8/site-packages/plotly/express/_imshow.py:11, in
8 from plotly.utils import image_array_to_data_uri
10 try:
---> 11 import xarray
13 xarray_imported = True
14 except ImportError:
File /anaconda/envs/py38_default/lib/python3.8/site-packages/xarray/__init__.py:1, in
----> 1 from . import testing, tutorial
2 from .backends.api import (
3 load_dataarray,
4 load_dataset,
(...)
8 savemfdataset,
9 )
10 from .backends.rasterio import open_rasterio
File /anaconda/envs/py38_default/lib/python3.8/site-packages/xarray/testing.py:9, in
6 import numpy as np
7 import pandas as pd
----> 9 from xarray.core import duck_array_ops, formatting, utils
10 from xarray.core.dataarray import DataArray
11 from xarray.core.dataset import Dataset
File /anaconda/envs/py38_default/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:26, in
23 from numpy import take, tensordot, transpose, unravel_index # noqa
24 from numpy import where as _where
---> 26 from . import dask_array_compat, dask_array_ops, dtypes, npcompat, nputils
27 from .nputils import nanfirst, nanlast
28 from .pycompat import cupy_array_type, dask_array_type, is_duck_dask_array
File /anaconda/envs/py38_default/lib/python3.8/site-packages/xarray/core/dask_array_compat.py:60, in
56 return padded
59 if da is not None:
---> 60 sliding_window_view = da.lib.stride_tricks.sliding_window_view
61 else:
62 sliding_window_view = None
AttributeError: module 'dask.array' has no attribute 'lib'
Azure_Intro_ML_JupyterNBs_with-Graphing_Module-Errors.zip
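The last frame of the traceback shows xarray reading `da.lib.stride_tricks.sliding_window_view` whenever dask is importable, which fails when the installed `dask.array` has no `lib` attribute. The sketch below only illustrates that failure mode and a defensive way to look up such a dotted path. It is not xarray's actual fix: `resolve_attr_chain` and the two stand-in objects are hypothetical, and are used so the example runs without dask installed.

```python
from types import SimpleNamespace


def resolve_attr_chain(obj, dotted_path, default=None):
    """Follow a dotted attribute path, returning `default` if any link is missing."""
    missing = object()  # sentinel, so that a real None attribute is not confused with "absent"
    current = obj
    for name in dotted_path.split("."):
        current = getattr(current, name, missing)
        if current is missing:
            return default
    return current


# Stand-in for a dask.array module that lacks `lib` (hypothetical object).
fake_da_without_lib = SimpleNamespace()

# Stand-in for a module that does expose the full path (hypothetical object).
fake_da_with_lib = SimpleNamespace(
    lib=SimpleNamespace(
        stride_tricks=SimpleNamespace(sliding_window_view=lambda array: array)
    )
)

path = "lib.stride_tricks.sliding_window_view"

# Direct attribute access fails the same way the traceback does.
try:
    fake_da_without_lib.lib.stride_tricks.sliding_window_view
except AttributeError as exc:
    print("direct access failed:", type(exc).__name__)

# The defensive lookup falls back to None instead of raising.
print(resolve_attr_chain(fake_da_without_lib, path))
print(resolve_attr_chain(fake_da_with_lib, path) is not None)
```

In the real sandbox, the equivalent quick check would be `hasattr(dask.array, "lib")` after `import dask.array`, which tells you whether the installed dask and xarray versions are compatible on this point.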