ydataai / ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
https://docs.profiling.ydata.ai
MIT License

Issue with Interactions Map Display #1640

Open cfedarkopoc opened 1 month ago

cfedarkopoc commented 1 month ago

Current Behaviour

Instead of hexagons of varying shades, the interactions map shows only points of a single shade, and for some variable pairs the points disappear entirely. (Screenshot: EDA Issue)

Expected Behaviour

The map should let me see how the interactions between variables vary, shown by the shade and presence of the hexagons. (Screenshot: EDA Correct)

Data Description

My data is a survey assessing perceptions of crime. (Screenshot: Data)

Code that reproduces the bug

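# Notebook cell: install the package before importing it
!pip install ydata_profiling

import pandas as pd
from ydata_profiling import ProfileReport

# df is the survey DataFrame; the loading step was not included in the report
profile = ProfileReport(df, title="Perceptions of Crime Survey")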

# Save the report to .html
profile.to_file("POCSurvey_YesPV.html")

pandas-profiling version

v2.1.4

Dependencies

N/A

OS

Windows 10

cfedarkopoc commented 3 weeks ago

This warning appeared in my output; I hadn't noticed it before:

/usr/local/lib/python3.10/dist-packages/ydata_profiling/model/correlations.py:66: UserWarning: There was an attempt to calculate the auto correlation, but this failed. To hide this warning, disable the calculation (using df.profile_report(correlations={"auto": {"calculate": False}}) If this is problematic for your use case, please report this as an issue: https://github.com/ydataai/ydata-profiling/issues (include the error message: 'Function <code object pandas_auto_compute at 0x7a2b874c5420, file "/usr/local/lib/python3.10/dist-packages/ydata_profiling/model/pandas/correlations_pandas.py", line 167>') warnings.warn(

fabclmnt commented 2 weeks ago

Hi @cfedarkopoc,

Thank you for your report. Regarding the correlation warning, I advise you to open a separate issue, as we will need more details to understand it. In general terms, if some characteristic of your data causes an error in one of the metrics used to compute the auto correlation, you will get that warning. More on correlations in our docs: https://docs.profiling.ydata.ai/latest/advanced_settings/available_settings/#correlations
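In the meantime, the warning itself names a way to skip the failing calculation. A minimal sketch of that setting, passed directly to ProfileReport; the function name is hypothetical, and the report title is copied from the original post:

```python
# Override taken from the warning text: disable only the "auto" correlation.
CORRELATION_OVERRIDES = {"auto": {"calculate": False}}


def build_report_without_auto_correlation(df, title="Perceptions of Crime Survey"):
    """Hypothetical helper: build a report that skips the auto correlation."""
    # Imported lazily so the override dict can be inspected without the library.
    from ydata_profiling import ProfileReport

    return ProfileReport(df, title=title, correlations=CORRELATION_OVERRIDES)
```

This only hides the symptom; the underlying cause in the data would still need the separate issue mentioned above.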

Regarding the interactions plot, that is expected behavior: the number of rows in your dataset is smaller than the value of the config.plot.scatter_threshold property, so a scatter plot is drawn instead of a hexbin plot. You can set a different value by building a config file in YAML, as in the following example: https://github.com/ydataai/ydata-profiling/blob/develop/src/ydata_profiling/config_minimal.yaml
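As an alternative to a YAML file, the threshold can probably also be overridden in code. The sketch below rests on several assumptions: that the plot settings accept a nested keyword argument in the same way the correlations settings do, that the default threshold is 1000, and that the comparison is strictly "more rows than the threshold". The helper names are hypothetical.

```python
# Assumed default of config.plot.scatter_threshold; verify against your installed version.
ASSUMED_DEFAULT_SCATTER_THRESHOLD = 1000


def expected_interaction_plot(n_rows, scatter_threshold=ASSUMED_DEFAULT_SCATTER_THRESHOLD):
    """Mirror the rule described above: hexbin only when rows exceed the threshold."""
    return "hexbin" if n_rows > scatter_threshold else "scatter"


def build_report(df, scatter_threshold):
    """Hypothetical helper: build the report with a custom scatter threshold."""
    # Imported lazily so the rule above can be checked without the library installed.
    from ydata_profiling import ProfileReport

    return ProfileReport(
        df,
        title="Perceptions of Crime Survey",
        plot={"scatter_threshold": scatter_threshold},
    )
```

For a survey with a few hundred rows, something like `build_report(df, scatter_threshold=100)` should switch the interactions section to hexbin plots, although with so few points the hexagons may show little variation in shade.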