ydataai / ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
https://docs.profiling.ydata.ai
MIT License
12.55k stars 1.69k forks

Feature Request: Use minimal=True to drop "has constant value" columns automatically and then continue with explorative=True #1052

Open i621148 opened 2 years ago

i621148 commented 2 years ago

Missing functionality

I am really enjoying the new switch minimal=True. Example: profile = ProfileReport(df, title="Pandas Profiling Report", minimal=True). This lets me profile a huge data set that I don't have enough memory to scrutinize in full.

Using profile.to_widgets(), I can see all the columns flagged "has constant value".

I then manually remove each of these columns using df.drop('xxx', axis=1, inplace=True).

Then I can run the more powerful version of the command: profile = ProfileReport(df, title="Pandas Profiling Report", explorative=True)

I think it would be a good feature to add a switch such as explorative_minimal that did this automatically. Mostly I am really interested in creating the dendrogram.

Proposed feature

profile = ProfileReport(df, title="Pandas Profiling Report", explorative=explorative_minimal)

Run profile = ProfileReport(df, title="Pandas Profiling Report", minimal=True)

Save all columns flagged "has constant value" to a dataset: constant_value

Drop all columns in constant_value from the dataframe

profile = ProfileReport(df, title="Pandas Profiling Report", explorative=True)
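Until something like this exists, the middle steps can be approximated with pandas alone, without reading the alerts out of the report. This is only a sketch: drop_constant_columns is a hypothetical helper, not part of the library, and it assumes that "has constant value" means a column with exactly one distinct non-null value, which may not match the library's alert definition exactly.

```python
import pandas as pd


def drop_constant_columns(df: pd.DataFrame) -> tuple[pd.DataFrame, list]:
    """Return a copy of df without constant columns, plus the dropped names.

    Hypothetical helper: "constant" is assumed to mean exactly one distinct
    non-null value in the column.
    """
    # Count distinct non-null values per column.
    distinct_counts = df.nunique(dropna=True)
    # Collect the names of columns that hold a single distinct value.
    constant_value = distinct_counts[distinct_counts == 1].index.tolist()
    # drop() returns a new DataFrame; the original df is left untouched.
    return df.drop(columns=constant_value), constant_value


# Small made-up example frame, for illustration only.
df = pd.DataFrame({"a": [1, 1, 1], "b": [1, 2, 3], "c": ["x", "x", "x"]})
reduced, constant_value = drop_constant_columns(df)
print(constant_value)
print(list(reduced.columns))
```

The reduced frame could then be passed to ProfileReport(reduced, title="Pandas Profiling Report", explorative=True) as in the last step.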

Alternatives considered

I copy and paste the results from profile.to_widgets() into a Notepad file, then use find/replace scripting to create a "has constant value" dataframe.

Additional context

No response

fabclmnt commented 2 years ago

Thank you for your feature request. We have put it under consideration.

@i621148 please consider updating your request title to a more detailed description.