mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

Extend EDA to include bivariate analysis #176

Closed shahules786 closed 3 years ago

shahules786 commented 4 years ago

The mljar EDA currently contains only a basic analysis of the distribution of each feature. The idea is to extend it to support bivariate analysis. For example, EDA.extend_eda(df, target) would analyze how each variable in the data frame df varies with the target variable.
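To make the proposal concrete, here is a minimal sketch of what a bivariate pass could compute, using plain pandas. The function name `bivariate_summary`, its return format, and the choice of statistic per type pair are assumptions for illustration only; this is not the mljar-supervised API and not necessarily what the PR implements.

```python
import pandas as pd


def bivariate_summary(df: pd.DataFrame, target: str) -> dict:
    """Hypothetical helper: relate every feature in df to the target column."""
    summary = {}
    target_is_numeric = pd.api.types.is_numeric_dtype(df[target])
    for col in df.columns:
        if col == target:
            continue
        feature_is_numeric = pd.api.types.is_numeric_dtype(df[col])
        if feature_is_numeric and target_is_numeric:
            # Numeric feature vs numeric target: Pearson correlation.
            value = float(df[col].corr(df[target]))
            summary[col] = {"kind": "correlation", "value": value}
        elif target_is_numeric:
            # Categorical feature vs numeric target: mean target per category.
            value = df.groupby(col)[target].mean().to_dict()
            summary[col] = {"kind": "target_mean_by_category", "value": value}
        elif feature_is_numeric:
            # Numeric feature vs categorical target: mean feature per class.
            value = df.groupby(target)[col].mean().to_dict()
            summary[col] = {"kind": "feature_mean_by_class", "value": value}
        else:
            # Categorical vs categorical: contingency table of counts.
            summary[col] = {"kind": "crosstab", "value": pd.crosstab(df[col], df[target])}
    return summary


# Toy data, made up for the example.
df = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "city": ["a", "b", "a", "b"],
    "income": [10.0, 20.0, 30.0, 40.0],
})
result = bivariate_summary(df, "income")
```

In this toy frame, `result["age"]` holds a correlation and `result["city"]` holds the mean income per city. A real implementation would probably render plots rather than return numbers, but the dispatch on feature and target type would look similar.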

shahules786 commented 4 years ago

Hey @pplonski, I have opened a PR: https://github.com/mljar/mljar-supervised/pull/178

shahules786 commented 4 years ago

@pplonski I have made the requested changes. As for the number of features in the heatmap (14): I tried a few arbitrary values and also considered aesthetics, since too many features in a (10, 10) figure will not look good. What do you think?
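For reference, one way such a cap could work is sketched below. The thread does not say how the 14 features are picked, so ranking numeric features by absolute correlation with the target is an assumption, as are the function name `heatmap_correlations` and the synthetic data. The sketch also assumes a numeric target.

```python
import numpy as np
import pandas as pd

MAX_HEATMAP_FEATURES = 14  # the cap discussed in this thread


def heatmap_correlations(df, target, max_features=MAX_HEATMAP_FEATURES):
    """Hypothetical helper: correlation matrix limited to the top features."""
    numeric = df.select_dtypes(include="number")
    # Assumed selection rule: strongest absolute correlation with the target.
    ranking = (
        numeric.drop(columns=[target])
        .corrwith(numeric[target])
        .abs()
        .sort_values(ascending=False)
    )
    selected = list(ranking.index[:max_features])
    # Keep the target in the matrix so its row and column appear in the plot.
    return numeric[selected + [target]].corr()


# Synthetic data: 20 random features, target driven mostly by f0.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(200, 20)), columns=[f"f{i}" for i in range(20)])
data["target"] = 2 * data["f0"] + rng.normal(size=200)

corr = heatmap_correlations(data, "target")
```

The resulting matrix has 15 rows and columns (14 features plus the target) and could be passed to a plotting call such as `seaborn.heatmap` on a (10, 10) figure. Making `max_features` a parameter would let users override the default when they use a larger figure.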