First, thanks for saving so many hours of boilerplate EDA with this project. I appreciate it!
I ran into an issue with the missing data matrix rendering missing bits as 'blurred' bars that run into neighboring columns and rows:
For comparison, I made the same matrix plot directly with missingno in a Jupyter notebook:
The plots above were made with toy 10x10 data (code below). With larger datasets (50k x 30 in my case), Firefox renders the matrix differently from Chrome/Edge (all bad, just differently bad), so I'm wondering whether it's an encoding issue. The problem is consistent across to_widget, to_notebook_iframe, and to_file("report.html") output.
"""
Test for issue XXX:
https://github.com/pandas-profiling/pandas-profiling/issues/XXX
"""
import pandas as pd
import numpy as np
import pandas_profiling
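# 10x10 frame of 1.0 values with integer column labels 0..9
df = pd.DataFrame({i: [1.0] * 10 for i in range(10)})
df.loc[:, 5] = np.nan    # column 5 entirely missing
df.loc[3:5, 7] = np.nan  # rows 3 to 5 of column 7 (label slice, inclusive)
df.loc[1, 1] = np.nan    # single missing cell
# Pass None for correlations, interactions and duplicates; write the HTML report
pandas_profiling.ProfileReport(df, correlations=None, interactions=None, duplicates=None).to_file("bad_render.html")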
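As a sanity check on what a correct matrix plot should show for this toy frame, here is a minimal sketch using only pandas and NumPy. The names `mask`, `missing_per_column`, and `missing_cells` are illustrative and not part of pandas-profiling or missingno. It lists exactly which cells were set to NaN, so any rendered bar outside those cells is a rendering artifact rather than missing data.

```python
import numpy as np
import pandas as pd

# Same toy frame as in the reproduction script.
df = pd.DataFrame({i: [1.0] * 10 for i in range(10)})
df.loc[:, 5] = np.nan    # whole column 5
df.loc[3:5, 7] = np.nan  # rows 3, 4, 5 of column 7 (label slices are inclusive)
df.loc[1, 1] = np.nan    # single cell

# Boolean mask of missing cells: this is the ground truth the plot should match.
mask = df.isna()

# Number of missing values per column (only columns 1, 5 and 7 are non-zero).
missing_per_column = mask.sum()

# (row, column) positions of every missing cell; positions equal labels here
# because both the index and the columns are 0..9 in order.
missing_cells = [(int(r), int(c)) for r, c in zip(*np.where(mask.to_numpy()))]

print(missing_per_column[missing_per_column > 0].to_dict())
print(missing_cells)
```

Only column 5 (all rows), rows 3 to 5 of column 7, and cell (1, 1) should appear as gaps; everything else should render as solid.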
Version information:
Version information is essential in reproducing and resolving bugs. Please report:
Python version: 3.8.2
Environment: Same results in both Jupyter Notebook and JupyterLab, running on an Ubuntu 18.04.5 LTS virtual machine via Windows Subsystem for Linux on Windows 10.
Have you seen this before? Here is the output HTML report.
Full package list (conda environment, pip packages at the end):
``` dependencies: - _libgcc_mutex=0.1=conda_forge - _openmp_mutex=4.5=0_gnu - abseil-cpp=20200225.1=he1b5a44_2 - aiofiles=0.5.0=py_0 - aiohttp=3.6.2=py38h516909a_0 - altair=4.0.1=py_0 - appdirs=1.4.3=py_1 - arrow=0.15.6=py38h32f6830_1 - arrow-cpp=0.16.0=py38hd8d096e_1 - async-timeout=3.0.1=py_1000 - attrs=19.3.0=py_0 - aws-sdk-cpp=1.7.164=h1f8afcc_0 - backcall=0.1.0=py_0 - beautifulsoup4=4.9.1=py38h32f6830_0 - binaryornot=0.4.4=py_1 - black=19.10b0=py38_0 - bleach=3.1.3=pyh8c360ce_0 - bokeh=2.0.1=py38h32f6830_0 - boost-cpp=1.72.0=h8e57a91_0 - brotli=1.0.7=he1b5a44_1001 - brotlipy=0.7.0=py38h1e0a361_1000 - bzip2=1.0.8=h516909a_2 - c-ares=1.15.0=h516909a_1001 - ca-certificates=2020.6.20=hecda079_0 - certifi=2020.6.20=py38h32f6830_0 - cffi=1.14.0=py38hd463f26_0 - cftime=1.1.1.2=py38h8790de6_0 - chardet=3.0.4=py38h32f6830_1006 - click=7.1.1=pyh8c360ce_0 - cloudpickle=1.4.1=py_0 - colorama=0.4.3=py_0 - confuse=1.3.0=pyh9f0ad1d_0 - cookiecutter=1.7.2=pyh9f0ad1d_0 - cryptography=2.8=py38h766eaa4_2 - curl=7.68.0=hf8cf82a_0 - cycler=0.10.0=py_2 - cytoolz=0.10.1=py38h516909a_0 - dask=2.15.0=py_0 - dask-core=2.15.0=py_0 - dask-labextension=2.0.2=py_0 - dbus=1.13.6=he372182_0 - decorator=4.4.2=py_0 - defusedxml=0.6.0=py_0 - distributed=2.15.2=py38h32f6830_0 - entrypoints=0.3=py38h32f6830_1001 - expat=2.2.9=he1b5a44_2 - flake8=3.7.9=py38h32f6830_1 - fontconfig=2.13.1=h86ecdb6_1001 - freetype=2.10.1=he06d7ca_0 - fsspec=0.7.3=py_0 - gettext=0.19.8.1=hc5be6a0_1002 - gflags=2.2.2=he1b5a44_1002 - glib=2.58.3=py38h73cb85d_1003 - glog=0.4.0=he1b5a44_1 - grpc-cpp=1.27.3=h7397029_1 - gst-plugins-base=1.14.5=h0935bb2_2 - gstreamer=1.14.5=h36ae1b5_2 - h11=0.9.0=py_0 - h2=3.2.0=py38h32f6830_1 - hdf4=4.2.13=hf30be14_1003 - hdf5=1.10.5=nompi_h3c11f04_1104 - heapdict=1.0.1=py_0 - hpack=3.0.0=py_0 - hstspreload=2020.5.13=py_0 - htmlmin=0.1.12=py_1 - httpcore=0.10.2=py_0 - httpx=0.14.2=py_0 - hyperframe=5.2.0=py_0 - icu=64.2=he1b5a44_1 - idna=2.9=py_1 - imagehash=4.1.0=pyh9f0ad1d_0 - 
importlib-metadata=1.5.0=py38h32f6830_1 - importlib_metadata=1.5.0=1 - ipykernel=5.1.4=py38h5ca1d4c_0 - ipython=7.13.0=py38h23f93f0_1 - ipython_genutils=0.2.0=py_1 - ipywidgets=7.5.1=py_0 - jedi=0.16.0=py38h32f6830_1 - jinja2=2.11.1=py_0 - jinja2-time=0.2.0=py_2 - joblib=0.14.1=py_0 - jpeg=9c=h14c3975_1001 - json5=0.9.0=py_0 - jsonschema=3.2.0=py38h32f6830_1 - jupyter-server-proxy=1.5.0=py_0 - jupyter_client=6.1.0=py_0 - jupyter_contrib_core=0.3.3=py_2 - jupyter_contrib_nbextensions=0.5.1=py38_0 - jupyter_core=4.6.3=py38h32f6830_1 - jupyter_highlight_selected_word=0.2.0=py38_1000 - jupyter_latex_envs=1.4.6=py38_1000 - jupyter_nbextensions_configurator=0.4.1=py38_0 - jupyterlab=2.0.1=py_0 - jupyterlab_server=1.0.7=py_0 - kiwisolver=1.1.0=py38hbf85e49_1 - krb5=1.16.4=h2fd8d38_0 - ld_impl_linux-64=2.34=h53a641e_0 - libblas=3.8.0=14_openblas - libcblas=3.8.0=14_openblas - libclang=9.0.1=default_hde54327_0 - libcurl=7.68.0=hda55be3_0 - libedit=3.1.20170329=hf8c457e_1001 - libevent=2.1.10=h72c5cf5_0 - libffi=3.2.1=he1b5a44_1007 - libgcc-ng=9.2.0=h24d8f2e_2 - libgfortran-ng=7.3.0=hdf63c60_5 - libgomp=9.2.0=h24d8f2e_2 - libiconv=1.15=h516909a_1006 - liblapack=3.8.0=14_openblas - libllvm8=8.0.1=hc9558a2_0 - libllvm9=9.0.1=hc9558a2_0 - libnetcdf=4.7.4=nompi_h9f9fd6a_101 - libopenblas=0.3.7=h5ec1e0e_6 - libpng=1.6.37=hed695b0_1 - libprotobuf=3.11.4=h8b12597_0 - libsodium=1.0.17=h516909a_0 - libssh2=1.8.2=h22169c7_2 - libstdcxx-ng=9.2.0=hdf63c60_2 - libtiff=4.1.0=hc7e4089_6 - libuuid=2.32.1=h14c3975_1000 - libuv=1.34.0=h516909a_0 - libwebp-base=1.1.0=h516909a_3 - libxcb=1.13=h14c3975_1002 - libxkbcommon=0.10.0=he1b5a44_0 - libxml2=2.9.10=hee79883_0 - libxslt=1.1.33=h31b3aaa_0 - line_profiler=3.0.2=py38hc9558a2_0 - llvmlite=0.31.0=py38h4f45e52_1 - locket=0.2.0=py_2 - lxml=4.5.0=py38hbb43d70_1 - lz4-c=1.8.3=he1b5a44_1001 - markupsafe=1.1.1=py38h1e0a361_1 - matplotlib=3.2.1=0 - matplotlib-base=3.2.1=py38h2af1d28_0 - mccabe=0.6.1=py_1 - memory_profiler=0.57.0=py_0 - 
missingno=0.4.2=py_1 - mistune=0.8.4=py38h516909a_1000 - more-itertools=8.2.0=py_0 - msgpack-python=1.0.0=py38hbf85e49_1 - multidict=4.7.5=py38h1e0a361_1 - mypy=0.770=py_0 - mypy_extensions=0.4.3=py38h32f6830_1 - nbconvert=5.6.1=py38_0 - nbformat=5.0.4=py_0 - ncurses=6.1=hf484d3e_1002 - netcdf4=1.5.3=nompi_py38heb6102f_103 - networkx=2.5=py_0 - nodejs=13.10.1=hf5d1a2b_0 - notebook=6.0.3=py38_0 - nspr=4.25=he1b5a44_0 - nss=3.47=he751ad9_0 - numba=0.48.0=py38hb3f55d8_0 - numpy=1.18.1=py38h95a1406_0 - olefile=0.46=py_0 - openssl=1.1.1h=h516909a_0 - packaging=20.1=py_0 - pandas=1.0.3=py38hcb8c335_0 - pandas-profiling=2.9.0=pyh9f0ad1d_0 - pandoc=2.9.2=0 - pandocfilters=1.4.2=py_1 - parquet-cpp=1.5.1=2 - parso=0.6.2=py_0 - partd=1.1.0=py_0 - pathspec=0.7.0=py_0 - patsy=0.5.1=py_0 - pcre=8.44=he1b5a44_0 - pexpect=4.8.0=py38h32f6830_1 - phik=0.10.0=py_0 - pickleshare=0.7.5=py38h32f6830_1001 - pillow=7.1.2=py38h9776b28_0 - pip=20.0.2=py_2 - pluggy=0.13.1=py38_0 - poyo=0.5.0=py_0 - prometheus_client=0.7.1=py_0 - prompt-toolkit=3.0.4=py_0 - psutil=5.7.0=py38h1e0a361_1 - pthread-stubs=0.4=h14c3975_1001 - ptyprocess=0.6.0=py_1001 - py=1.8.1=py_0 - pyarrow=0.16.0=py38hd02d5f2_2 - pycodestyle=2.5.0=py_0 - pycparser=2.20=py_0 - pyflakes=2.1.1=py_0 - pygments=2.6.1=py_0 - pyopenssl=19.1.0=py_1 - pyparsing=2.4.6=py_0 - pyqt=5.12.3=py38hcca6a23_1 - pyrsistent=0.15.7=py38h1e0a361_1 - pysocks=1.7.1=py38h32f6830_1 - pytest=5.4.1=py38h32f6830_0 - python=3.8.2=h9d8adfe_4_cpython - python-dateutil=2.8.1=py_0 - python-dotenv=0.13.0=pyh9f0ad1d_0 - python-slugify=4.0.0=pyh9f0ad1d_1 - python_abi=3.8=1_cp38 - pytz=2019.3=py_0 - pywavelets=1.1.1=py38hab2c0dc_2 - pyyaml=5.3.1=py38h1e0a361_0 - pyzmq=19.0.0=py38ha71036d_1 - qt=5.12.5=hd8c4c69_1 - re2=2020.03.03=he1b5a44_0 - readline=8.0=hf8c457e_0 - regex=2020.2.20=py38h1e0a361_1 - requests=2.23.0=pyh8c360ce_2 - rfc3986=1.3.2=py_0 - rope=0.16.0=py_0 - scikit-learn=0.22.2.post1=py38hcdab131_0 - scipy=1.4.1=py38h18bccfc_2 - seaborn=0.11.0=0 - 
seaborn-base=0.11.0=py_0 - send2trash=1.5.0=py_0 - setuptools=46.0.0=py38h32f6830_2 - shellingham=1.3.2=py_0 - simpervisor=0.3=py_1 - six=1.14.0=py_1 - snappy=1.1.8=he1b5a44_1 - sniffio=1.1.0=py38h32f6830_2 - sortedcontainers=2.1.0=py_0 - soupsieve=2.0.1=py38h32f6830_0 - sqlite=3.30.1=hcee41ef_0 - statsmodels=0.12.0=py38h1e0a361_0 - tangled-up-in-unicode=0.0.6=pyh9f0ad1d_0 - tblib=1.6.0=py_0 - terminado=0.8.3=py38h32f6830_1 - testpath=0.4.4=py_0 - text-unidecode=1.3=py_0 - thrift-cpp=0.13.0=h62aa4f2_2 - tk=8.6.10=hed695b0_0 - toml=0.10.0=py_0 - toolz=0.10.0=py_0 - tornado=6.0.4=py38h1e0a361_1 - tqdm=4.48.2=pyh9f0ad1d_0 - traitlets=4.3.3=py38h32f6830_1 - typed-ast=1.4.1=py38h516909a_0 - typer=0.3.1=py_0 - typing_extensions=3.7.4.1=py38h32f6830_1 - unidecode=1.1.1=py_0 - urllib3=1.25.7=py38h32f6830_1 - visions=0.5.0=pyh9f0ad1d_0 - wcwidth=0.1.8=py_0 - webencodings=0.5.1=py_1 - wheel=0.34.2=py_1 - whichcraft=0.6.1=py_0 - widgetsnbextension=3.5.1=py38_0 - xarray=0.15.1=py_0 - xlrd=1.2.0=py_0 - xorg-libxau=1.0.9=h14c3975_0 - xorg-libxdmcp=1.1.3=h516909a_0 - xz=5.2.4=h516909a_1002 - yaml=0.2.2=h516909a_1 - yarl=1.3.0=py38h516909a_1000 - zeromq=4.3.2=he1b5a44_2 - zict=2.0.0=py_0 - zipp=3.1.0=py_0 - zlib=1.2.11=h516909a_1006 - zstd=1.4.4=h3b9ef0a_2 - pip: - aiometer==0.2.1 - anyio==1.3.0 - async-generator==1.10 - pyqt5-sip==4.19.18 - pyqtwebengine==5.12.1 ```