ydataai / ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
https://docs.profiling.ydata.ai
MIT License

Minimal config seems to be "persistent" across session #598

Closed diraol closed 3 years ago

diraol commented 4 years ago

It seems that once you use the "minimal" config, you can't go back to the "full" config during that runtime session. The "minimal" settings appear to be persisted somewhere and/or are not fully overwritten by the default config.

To Reproduce

  1. Load the data
  2. Run the "not minimal" profile report once.
  3. Run the "minimal" profile_report.
  4. Run the "not minimal" profile report again.

Data:

https://data.nasa.gov/api/views/gh4g-9sfh/rows.csv?accessType=DOWNLOAD

Code:

import pandas as pd
import pandas_profiling
from pandas_profiling.utils.cache import cache_file

file_name = cache_file(
    "meteorites.csv",
    "https://data.nasa.gov/api/views/gh4g-9sfh/rows.csv?accessType=DOWNLOAD",
)
df = pd.read_csv(file_name)
df['year'] = pd.to_datetime(df['year'], errors='coerce')
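df.profile_report(progress_bar=False)  # 1st run: full report
df.profile_report(progress_bar=False, minimal=True)  # 2nd run: minimal report
df.profile_report(progress_bar=False)  # 3rd run: should be full again, but the minimal settings persist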


Version information:


``` absl-py==0.10.0 alabaster==0.7.12 albumentations==0.1.12 altair==4.1.0 argon2-cffi==20.1.0 asgiref==3.2.10 astor==0.8.1 astropy==4.0.1.post1 astunparse==1.6.3 async-generator==1.10 atari-py==0.2.6 atomicwrites==1.4.0 attrs==20.2.0 audioread==2.1.8 autograd==1.3 Babel==2.8.0 backcall==0.2.0 beautifulsoup4==4.6.3 bleach==3.2.1 blis==0.4.1 bokeh==2.1.1 Bottleneck==1.3.2 branca==0.4.1 bs4==0.0.1 CacheControl==0.12.6 cachetools==4.1.1 catalogue==1.0.0 certifi==2020.6.20 cffi==1.14.3 chainer==7.4.0 chardet==3.0.4 click==7.1.2 cloudpickle==1.3.0 cmake==3.12.0 cmdstanpy==0.9.5 colorlover==0.3.0 community==1.0.0b1 confuse==1.3.0 contextlib2==0.5.5 convertdate==2.2.2 coverage==3.7.1 coveralls==0.5 crcmod==1.7 cufflinks==0.17.3 cvxopt==1.2.5 cvxpy==1.0.31 cycler==0.10.0 cymem==2.0.3 Cython==0.29.21 daft==0.0.4 dask==2.12.0 dataclasses==0.7 datascience==0.10.6 debugpy==1.0.0rc2 decorator==4.4.2 defusedxml==0.6.0 descartes==1.1.0 dill==0.3.2 distributed==1.25.3 Django==3.1.1 dlib==19.18.0 dm-tree==0.1.5 docopt==0.6.2 docutils==0.16 dopamine-rl==1.0.5 earthengine-api==0.1.236 easydict==1.9 ecos==2.0.7.post1 editdistance==0.5.3 en-core-web-sm==2.2.5 entrypoints==0.3 ephem==3.7.7.1 et-xmlfile==1.0.1 fa2==0.3.5 fancyimpute==0.4.3 fastai==1.0.61 fastdtw==0.3.4 fastprogress==1.0.0 fastrlock==0.5 fbprophet==0.7.1 feather-format==0.4.1 filelock==3.0.12 firebase-admin==4.4.0 fix-yahoo-finance==0.0.22 Flask==1.1.2 folium==0.8.3 future==0.16.0 gast==0.3.3 GDAL==2.2.2 gdown==3.6.4 gensim==3.6.0 geographiclib==1.50 geopy==1.17.0 gin-config==0.3.0 glob2==0.7 google==2.0.3 google-api-core==1.16.0 google-api-python-client==1.7.12 google-auth==1.17.2 google-auth-httplib2==0.0.4 google-auth-oauthlib==0.4.1 google-cloud-bigquery==1.21.0 google-cloud-core==1.0.3 google-cloud-datastore==1.8.0 google-cloud-firestore==1.7.0 google-cloud-language==1.2.0 google-cloud-storage==1.18.1 google-cloud-translate==1.5.0 google-colab==1.0.0 google-pasta==0.2.0 google-resumable-media==0.4.1 
googleapis-common-protos==1.52.0 googledrivedownloader==0.4 graphviz==0.10.1 grpcio==1.32.0 gspread==3.0.1 gspread-dataframe==3.0.8 gym==0.17.2 h5py==2.10.0 HeapDict==1.0.1 holidays==0.10.3 holoviews==1.13.4 html5lib==1.0.1 htmlmin==0.1.12 httpimport==0.5.18 httplib2==0.17.4 httplib2shim==0.0.3 humanize==0.5.1 hyperopt==0.1.2 ideep4py==2.0.0.post3 idna==2.10 image==1.5.32 ImageHash==4.1.0 imageio==2.4.1 imagesize==1.2.0 imbalanced-learn==0.4.3 imblearn==0.0 imgaug==0.2.9 importlib-metadata==2.0.0 imutils==0.5.3 inflect==2.1.0 iniconfig==1.0.1 intel-openmp==2020.0.133 intervaltree==2.1.0 ipykernel==4.10.1 ipython==5.5.0 ipython-genutils==0.2.0 ipython-sql==0.3.9 ipywidgets==7.5.1 itsdangerous==1.1.0 jax==0.2.0 jaxlib==0.1.55 jdcal==1.4.1 jedi==0.17.2 jieba==0.42.1 Jinja2==2.11.2 joblib==0.16.0 jpeg4py==0.1.4 jsonschema==2.6.0 jupyter==1.0.0 jupyter-client==6.1.7 jupyter-console==5.2.0 jupyter-core==4.6.3 jupyterlab-pygments==0.1.2 kaggle==1.5.8 kapre==0.1.3.1 Keras==2.4.3 Keras-Preprocessing==1.1.2 keras-vis==0.4.1 kiwisolver==1.2.0 knnimpute==0.1.0 korean-lunar-calendar==0.2.1 librosa==0.6.3 lightgbm==2.2.3 llvmlite==0.31.0 lmdb==0.99 lucid==0.3.8 LunarCalendar==0.0.9 lxml==4.2.6 Markdown==3.2.2 MarkupSafe==1.1.1 matplotlib==3.2.2 matplotlib-venn==0.11.5 missingno==0.4.2 mistune==0.8.4 mizani==0.6.0 mkl==2019.0 mlxtend==0.14.0 more-itertools==8.5.0 moviepy==0.2.3.5 mpmath==1.1.0 msgpack==1.0.0 multiprocess==0.70.10 multitasking==0.0.9 murmurhash==1.0.2 music21==5.5.0 natsort==5.5.0 nbclient==0.5.0 nbconvert==5.6.1 nbformat==5.0.7 nest-asyncio==1.4.1 networkx==2.5 nibabel==3.0.2 nltk==3.2.5 notebook==5.3.1 np-utils==0.5.12.1 numba==0.48.0 numexpr==2.7.1 numpy==1.18.5 nvidia-ml-py3==7.352.0 oauth2client==4.1.3 oauthlib==3.1.0 okgrade==0.4.3 opencv-contrib-python==4.1.2.30 opencv-python==4.1.2.30 openpyxl==2.5.9 opt-einsum==3.3.0 osqp==0.6.1 packaging==20.4 palettable==3.3.0 pandas==1.1.2 pandas-datareader==0.9.0 pandas-gbq==0.13.2 pandas-profiling==2.9.0 
pandocfilters==1.4.2 panel==0.9.7 param==1.9.3 parso==0.7.1 pathlib==1.0.1 patsy==0.5.1 pexpect==4.8.0 phik==0.10.0 pickleshare==0.7.5 Pillow==7.0.0 pip-tools==4.5.1 plac==1.1.3 plotly==4.4.1 plotnine==0.6.0 pluggy==0.7.1 portpicker==1.3.1 prefetch-generator==1.0.1 preshed==3.0.2 prettytable==0.7.2 progressbar2==3.38.0 prometheus-client==0.8.0 promise==2.3 prompt-toolkit==1.0.18 protobuf==3.12.4 psutil==5.4.8 psycopg2==2.7.6.1 ptyprocess==0.6.0 py==1.9.0 pyarrow==0.14.1 pyasn1==0.4.8 pyasn1-modules==0.2.8 pycocotools==2.0.2 pycparser==2.20 pyct==0.4.8 pydata-google-auth==1.1.0 pydot==1.3.0 pydot-ng==2.0.0 pydotplus==2.0.2 PyDrive==1.3.1 pyemd==0.5.1 pyglet==1.5.0 Pygments==2.6.1 pygobject==3.26.1 pymc3==3.7 PyMeeus==0.3.7 pymongo==3.11.0 pymystem3==0.2.0 PyOpenGL==3.1.5 pyparsing==2.4.7 pyrsistent==0.17.3 pysndfile==1.3.8 PySocks==1.7.1 pystan==2.19.1.1 pytest==3.6.4 python-apt==1.6.5+ubuntu0.3 python-chess==0.23.11 python-dateutil==2.8.1 python-louvain==0.14 python-slugify==4.0.1 python-utils==2.4.0 pytz==2018.9 pyviz-comms==0.7.6 PyWavelets==1.1.1 PyYAML==3.13 pyzmq==19.0.2 qtconsole==4.7.7 QtPy==1.9.0 regex==2019.12.20 requests==2.23.0 requests-oauthlib==1.3.0 resampy==0.2.2 retrying==1.3.3 rpy2==3.2.7 rsa==4.6 scikit-image==0.16.2 scikit-learn==0.22.2.post1 scipy==1.4.1 screen-resolution-extra==0.0.0 scs==2.1.2 seaborn==0.11.0 Send2Trash==1.5.0 setuptools-git==1.2 Shapely==1.7.1 simplegeneric==0.8.1 six==1.15.0 sklearn==0.0 sklearn-pandas==1.8.0 slugify==0.0.1 smart-open==2.2.0 snowballstemmer==2.0.0 sortedcontainers==2.2.2 spacy==2.2.4 Sphinx==1.8.5 sphinxcontrib-serializinghtml==1.1.4 sphinxcontrib-websupport==1.2.4 SQLAlchemy==1.3.19 sqlparse==0.3.1 srsly==1.0.2 statsmodels==0.10.2 sympy==1.1.1 tables==3.4.4 tabulate==0.8.7 tangled-up-in-unicode==0.0.6 tblib==1.7.0 tensorboard==2.3.0 tensorboard-plugin-wit==1.7.0 tensorboardcolab==0.0.22 tensorflow==2.3.0 tensorflow-addons==0.8.3 tensorflow-datasets==2.1.0 tensorflow-estimator==2.3.0 
tensorflow-gcs-config==2.3.0 tensorflow-hub==0.9.0 tensorflow-metadata==0.24.0 tensorflow-privacy==0.2.2 tensorflow-probability==0.11.0 termcolor==1.1.0 terminado==0.9.1 testpath==0.4.4 text-unidecode==1.3 textblob==0.15.3 textgenrnn==1.4.1 Theano==1.0.5 thinc==7.4.0 tifffile==2020.9.3 toml==0.10.1 toolz==0.11.1 torch==1.6.0+cu101 torchsummary==1.5.1 torchtext==0.3.1 torchvision==0.7.0+cu101 tornado==5.1.1 tqdm==4.50.2 traitlets==4.3.3 tweepy==3.6.0 typeguard==2.7.1 typing-extensions==3.7.4.3 tzlocal==1.5.1 umap-learn==0.4.6 uritemplate==3.0.1 urllib3==1.24.3 vega-datasets==0.8.0 visions==0.5.0 wasabi==0.8.0 wcwidth==0.2.5 webencodings==0.5.1 Werkzeug==1.0.1 widgetsnbextension==3.5.1 wordcloud==1.5.0 wrapt==1.12.1 xarray==0.15.1 xgboost==0.90 xkit==0.0.0 xlrd==1.1.0 xlwt==1.3.0 yellowbrick==0.9.1 zict==2.0.0 zipp==3.2.0 ```

JiByungKyu commented 4 years ago

I have the same issue. Calling ProfileReport.clear_config() clears the runtime configuration, and after that it works well.
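
For anyone trying to picture why this might happen, here is a toy model of the behavior described in this thread. It is not pandas-profiling's actual code: the `_DEFAULTS` and `_MINIMAL` dictionaries, the setting names, and both functions are hypothetical stand-ins. The sketch only assumes that the settings live in one module-level object shared by every report in the session, which would explain both the leak and why a reset such as `ProfileReport.clear_config()` helps.

```python
import copy

# Hypothetical settings; the real library has many more options.
_DEFAULTS = {"correlations": True, "interactions": True, "minimal": False}
_MINIMAL = {"correlations": False, "interactions": False, "minimal": True}

# Module-level state shared by every report created in this session.
_config = copy.deepcopy(_DEFAULTS)


def profile_report(minimal=False):
    """Return the settings a report would be built with.

    Buggy on purpose: minimal=True overwrites the shared config,
    but minimal=False never restores the defaults.
    """
    if minimal:
        _config.update(_MINIMAL)
    return dict(_config)


def clear_config():
    """Reset the shared config to the defaults (stand-in for the workaround)."""
    _config.clear()
    _config.update(copy.deepcopy(_DEFAULTS))


first = profile_report()               # full settings
second = profile_report(minimal=True)  # minimal settings
third = profile_report()               # still minimal: the state leaked
clear_config()
fourth = profile_report()              # full settings again
```

In this toy version the third call returns the minimal settings even though `minimal=True` was not passed, which matches the reproduction steps above, and the call after `clear_config()` matches the first one again.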