Closed mina-marmpena closed 2 years ago
Hi @mina-marmpena, thanks for reporting the issue. Could you try downgrading scipy to 1.5.4?
I had the same problem. As suggested, I downgraded scipy to 1.5.4, but then a different error appeared:
data[col]["stats"], data[col]["len_stats"], data[col]["letter_stats"]
KeyError: 'stats'
Hi @mapsiva, I think this is a data-related error. Could you share a dataset that reproduces it?
Describe the bug I get the following error when I try to run the example of creating a report:
error happended in column:PassengerId
Traceback (most recent call last):
  File "", line 1, in
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/dataprep/eda/create_report/__init__.py", line 68, in create_report
    "components": format_report(df, cfg, mode, progress),
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/dataprep/eda/create_report/formatter.py", line 76, in format_report
    comps = format_basic(edaframe, cfg)
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/dataprep/eda/create_report/formatter.py", line 274, in format_basic
    data, completions = basic_computations(df, cfg)
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/dataprep/eda/create_report/formatter.py", line 383, in basic_computations
    variables_data = _compute_variables(df, cfg)
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/dataprep/eda/create_report/formatter.py", line 318, in _compute_variables
    data[col] = cont_comps(df.frame[col], cfg)
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/dataprep/eda/distribution/compute/univariate.py", line 200, in cont_comps
    data["chisq"] = chisquare(data["hist"][0])
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/dask/array/stats.py", line 136, in chisquare
    return power_divergence(f_obs, f_exp=f_exp, ddof=ddof, axis=axis, lambda_="pearson")
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/dask/array/stats.py", line 144, in power_divergence
    if lambda_ not in scipy.stats.stats._power_div_lambda_names:
  File "/home/user/anaconda3/envs/test-data-prep/lib/python3.8/site-packages/scipy/stats/stats.py", line 54, in __getattr__
    raise AttributeError(
AttributeError: scipy.stats.stats is deprecated and has no attribute _power_div_lambda_names. Try looking in scipy.stats instead.
To Reproduce
from dataprep.datasets import load_dataset
df = load_dataset("titanic")
from dataprep.eda import create_report
report = create_report(df)
Expected behavior To get the EDA report.
Desktop (please complete the following information):
- OS: Ubuntu 20.04.4 LTS
- Platform [Python script]
- Platform Version [PyCharm 2021.3.2 (Community Edition)]
- Python Version [3.8.12]
- Dataprep Version [0.4.2]
Additional context I have tested in a fresh conda env with pip install dataprep. Here are the packages installed:
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 4.5 1_gnu
aiohttp 3.8.1 pypi_0 pypi
aiosignal 1.2.0 pypi_0 pypi
argon2-cffi 21.3.0 pypi_0 pypi
argon2-cffi-bindings 21.2.0 pypi_0 pypi
asttokens 2.0.5 pypi_0 pypi
async-timeout 4.0.2 pypi_0 pypi
attrs 21.4.0 pypi_0 pypi
backcall 0.2.0 pypi_0 pypi
bleach 4.1.0 pypi_0 pypi
bokeh 2.4.2 pypi_0 pypi
ca-certificates 2021.10.26 h06a4308_2
certifi 2021.10.8 py38h06a4308_2
cffi 1.15.0 pypi_0 pypi
charset-normalizer 2.0.12 pypi_0 pypi
click 8.0.4 pypi_0 pypi
cloudpickle 2.0.0 pypi_0 pypi
cycler 0.11.0 pypi_0 pypi
dask 2021.12.0 pypi_0 pypi
dataprep 0.4.2 pypi_0 pypi
debugpy 1.5.1 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
defusedxml 0.7.1 pypi_0 pypi
entrypoints 0.4 pypi_0 pypi
executing 0.8.2 pypi_0 pypi
flask 2.0.3 pypi_0 pypi
flask-cors 3.0.10 pypi_0 pypi
fonttools 4.29.1 pypi_0 pypi
frozenlist 1.3.0 pypi_0 pypi
fsspec 2022.2.0 pypi_0 pypi
idna 3.3 pypi_0 pypi
importlib-resources 5.4.0 pypi_0 pypi
ipykernel 6.9.1 pypi_0 pypi
ipython 8.1.0 pypi_0 pypi
ipython-genutils 0.2.0 pypi_0 pypi
ipywidgets 7.6.5 pypi_0 pypi
itsdangerous 2.1.0 pypi_0 pypi
jedi 0.18.1 pypi_0 pypi
jinja2 3.0.3 pypi_0 pypi
joblib 1.1.0 pypi_0 pypi
jsonpath-ng 1.5.3 pypi_0 pypi
jsonschema 4.4.0 pypi_0 pypi
jupyter-client 7.1.2 pypi_0 pypi
jupyter-core 4.9.2 pypi_0 pypi
jupyterlab-pygments 0.1.2 pypi_0 pypi
jupyterlab-widgets 1.0.2 pypi_0 pypi
kiwisolver 1.3.2 pypi_0 pypi
ld_impl_linux-64 2.35.1 h7274673_9
levenshtein 0.16.0 pypi_0 pypi
libffi 3.3 he6710b0_2
libgcc-ng 9.3.0 h5101ec6_17
libgomp 9.3.0 h5101ec6_17
libstdcxx-ng 9.3.0 hd4cf53a_17
locket 0.2.1 pypi_0 pypi
markupsafe 2.1.0 pypi_0 pypi
matplotlib 3.5.1 pypi_0 pypi
matplotlib-inline 0.1.3 pypi_0 pypi
metaphone 0.6 pypi_0 pypi
mistune 0.8.4 pypi_0 pypi
multidict 6.0.2 pypi_0 pypi
nbclient 0.5.11 pypi_0 pypi
nbconvert 6.4.2 pypi_0 pypi
nbformat 5.1.3 pypi_0 pypi
ncurses 6.3 h7f8727e_2
nest-asyncio 1.5.4 pypi_0 pypi
nltk 3.7 pypi_0 pypi
notebook 6.4.8 pypi_0 pypi
numpy 1.22.2 pypi_0 pypi
openssl 1.1.1m h7f8727e_0
packaging 21.3 pypi_0 pypi
pandas 1.4.1 pypi_0 pypi
pandocfilters 1.5.0 pypi_0 pypi
parso 0.8.3 pypi_0 pypi
partd 1.2.0 pypi_0 pypi
pexpect 4.8.0 pypi_0 pypi
pickleshare 0.7.5 pypi_0 pypi
pillow 9.0.1 pypi_0 pypi
pip 21.2.4 py38h06a4308_0
ply 3.11 pypi_0 pypi
prometheus-client 0.13.1 pypi_0 pypi
prompt-toolkit 3.0.28 pypi_0 pypi
ptyprocess 0.7.0 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
pycparser 2.21 pypi_0 pypi
pydantic 1.9.0 pypi_0 pypi
pygments 2.11.2 pypi_0 pypi
pyparsing 3.0.7 pypi_0 pypi
pyrsistent 0.18.1 pypi_0 pypi
python 3.8.12 h12debd9_0
python-crfsuite 0.9.7 pypi_0 pypi
python-dateutil 2.8.2 pypi_0 pypi
python-stdnum 1.17 pypi_0 pypi
pytz 2021.3 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
pyzmq 22.3.0 pypi_0 pypi
rapidfuzz 1.8.3 pypi_0 pypi
readline 8.1.2 h7f8727e_1
regex 2021.11.10 pypi_0 pypi
scipy 1.8.0 pypi_0 pypi
send2trash 1.8.0 pypi_0 pypi
setuptools 58.0.4 py38h06a4308_0
six 1.16.0 pypi_0 pypi
sqlite 3.37.2 hc218d9a_0
stack-data 0.2.0 pypi_0 pypi
terminado 0.13.1 pypi_0 pypi
testpath 0.6.0 pypi_0 pypi
tk 8.6.11 h1ccaba5_0
toolz 0.11.2 pypi_0 pypi
tornado 6.1 pypi_0 pypi
tqdm 4.62.3 pypi_0 pypi
traitlets 5.1.1 pypi_0 pypi
typing-extensions 4.1.1 pypi_0 pypi
varname 0.8.1 pypi_0 pypi
wcwidth 0.2.5 pypi_0 pypi
webencodings 0.5.1 pypi_0 pypi
werkzeug 2.0.3 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
widgetsnbextension 3.5.2 pypi_0 pypi
wordcloud 1.8.1 pypi_0 pypi
xz 5.2.5 h7b6447c_0
yarl 1.7.2 pypi_0 pypi
zipp 3.7.0 pypi_0 pypi
zlib 1.2.11 h7f8727e_4
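Note on the traceback: the failure comes from dask's power_divergence reading the private name scipy.stats.stats._power_div_lambda_names, which scipy 1.8.0 (the version in the list above) does not expose through that module. A minimal sketch for checking whether the scipy in the current environment still exposes that name; the helper name exposes_power_div_lambda_names is hypothetical and not part of dataprep, dask, or scipy:

```python
import importlib
import warnings


def exposes_power_div_lambda_names():
    """Return True if scipy.stats.stats still exposes the private name dask reads.

    Hypothetical helper for diagnosis only. Returns False when scipy (or the
    legacy scipy.stats.stats module) cannot be imported at all.
    """
    try:
        with warnings.catch_warnings():
            # Newer scipy releases may warn when the legacy module is touched.
            warnings.simplefilter("ignore", DeprecationWarning)
            legacy = importlib.import_module("scipy.stats.stats")
            # hasattr() turns the AttributeError seen in the traceback into False.
            return hasattr(legacy, "_power_div_lambda_names")
    except ImportError:
        return False


if __name__ == "__main__":
    print("private name available:", exposes_power_div_lambda_names())
```

If this prints False in the environment where create_report fails, the installed scipy is likely the cause rather than the data.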
I am also facing this issue now!!!
plot(claim_df)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
C:\Users\CON822~1.TML\AppData\Local\Temp/ipykernel_27972/2569825133.py in <module>
1 from dataprep.eda import plot
----> 2 plot(claim_df)
~\anaconda3\envs\medicalclaim_env\lib\site-packages\dataprep\eda\distribution\__init__.py in plot(df, col1, col2, col3, config, display, dtype, progress)
99
100 with ProgressBar(minimum=1, disable=not progress):
--> 101 itmdt = compute(df, col1, col2, col3, cfg=cfg, dtype=dtype)
102
103 to_render = render(itmdt, cfg)
~\anaconda3\envs\medicalclaim_env\lib\site-packages\dataprep\eda\distribution\compute\__init__.py in compute(df, col1, col2, col3, cfg, display, dtype)
70
71 if not any([x, y, z]):
---> 72 return compute_overview(df, cfg, dtype)
73
74 if sum(v is None for v in (x, y, z)) == 2:
~\anaconda3\envs\medicalclaim_env\lib\site-packages\dataprep\eda\distribution\compute\overview.py in compute_overview(df, cfg, dtype)
55 col_dtype = frame.get_eda_dtype(col)
56 if isinstance(col_dtype, Continuous) and (cfg.hist.enable or cfg.insight.enable):
---> 57 data.append((col, col_dtype, _cont_calcs(frame.frame[col].dropna(), cfg)))
58 elif isinstance(col_dtype, (Nominal, GeoGraphy, GeoPoint, SmallCardNum)) and (
59 cfg.bar.enable or cfg.insight.enable
~\anaconda3\envs\medicalclaim_env\lib\site-packages\dataprep\eda\distribution\compute\overview.py in _cont_calcs(srs, cfg)
123
124 if cfg.insight.enable:
--> 125 data["chisq"] = chisquare(data["hist"][0])
126 data["norm"] = normaltest(data["hist"][0])
127 data["skew"] = skewtest(data["hist"][0])
~\anaconda3\envs\medicalclaim_env\lib\site-packages\dask\array\stats.py in chisquare(f_obs, f_exp, ddof, axis)
134 @derived_from(scipy.stats)
135 def chisquare(f_obs, f_exp=None, ddof=0, axis=0):
--> 136 return power_divergence(f_obs, f_exp=f_exp, ddof=ddof, axis=axis, lambda_="pearson")
137
138
~\anaconda3\envs\medicalclaim_env\lib\site-packages\dask\array\stats.py in power_divergence(f_obs, f_exp, ddof, axis, lambda_)
142 if isinstance(lambda_, str):
143 # TODO: public api
--> 144 if lambda_ not in scipy.stats.stats._power_div_lambda_names:
145 names = repr(list(scipy.stats.stats._power_div_lambda_names.keys()))[1:-1]
146 raise ValueError(
~\AppData\Roaming\Python\Python38\site-packages\scipy\stats\stats.py in __getattr__(name)
52 def __getattr__(name):
53 if name not in __all__:
---> 54 raise AttributeError(
55 "scipy.stats.stats is deprecated and has no attribute "
56 f"{name}. Try looking in scipy.stats instead.")
AttributeError: scipy.stats.stats is deprecated and has no attribute _power_div_lambda_names. Try looking in scipy.stats instead.
Hi @mingjun1120, it looks like the scipy version in your environment is 1.8.0. Could you try downgrading it to 1.5.4?
I have tried that, but I still have the same problem. Another issue is that some of the packages I use require a higher version of scipy.
@jinglinpeng I tried scipy==1.7.1 and it works.
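Possibly relevant: in the traceback above, scipy is loaded from ~\AppData\Roaming\Python\Python38\site-packages while dataprep and dask come from the conda env, so a downgrade inside the env may not change the copy that actually gets imported. A small sketch to confirm which installation Python picks up; module_origin is a hypothetical helper name:

```python
import importlib


def module_origin(name):
    """Return the version and file path of the module Python actually imports.

    Hypothetical helper for diagnosis; returns None if the module is missing.
    """
    try:
        mod = importlib.import_module(name)
    except ImportError:
        return None
    return {
        "version": getattr(mod, "__version__", "unknown"),
        "file": getattr(mod, "__file__", "unknown"),
    }


if __name__ == "__main__":
    # Compare the reported path with the environment you expect to be active.
    print(module_origin("scipy"))
```

If the printed path is outside the active environment, the version installed there is the one that needs changing.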
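Based only on the versions reported in this thread (1.5.4 and 1.7.1 reported working, 1.8.0 failing), a rough guard could look like the sketch below. The 1.8 cutoff is an assumption drawn from those reports, it applies to the dataprep 0.4.2 / dask 2021.12.0 combination listed above, and scipy_version_ok is a hypothetical name:

```python
import re


def scipy_version_ok(version):
    """Return True if `version` is below 1.8, the cutoff assumed from this thread."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison handles both the major and minor component.
    return (major, minor) < (1, 8)


if __name__ == "__main__":
    for candidate in ("1.5.4", "1.7.1", "1.8.0"):
        print(candidate, scipy_version_ok(candidate))
```

In practice the check would be fed scipy.__version__ before calling plot or create_report.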
@mingjun1120 thanks for the update! I'll add the version in the config file.
@jinglinpeng kind reminder. I had the same problem and fixed it by downgrading scipy.
This worked for me, thank you.
From: Jinglin Peng
Sent: Saturday, February 26, 2022 6:13:47 AM
To: sfu-db/dataprep
Cc: Mina Marmpena; Mention
Subject: Re: [sfu-db/dataprep] Error concerning scipy.stats.stats when creating a report (Issue #840)
Hi @mina-marmpena, could you try to downgrade the scipy version to 1.5.4?
@eloukas @mina-marmpena thanks for the information! I'll add the version in the config file and it should be good in the next release (March 28th).