Open nitieaj opened 6 years ago
I like the idea, but I would suggest thinking about how this could be used as a real tool for linear regression or any other model. How would you take the output of this module and use it? Any thoughts?
I think your idea is useful and handy since, as you mentioned, all columns of the dataset are summarized in groups with feature labels and corresponding plots. I would just extend it so that the summary function is an argument rather than a fixed default, so that users can choose which one to apply.
The output of this module gives you a visual summary next to the associated verbose summary.
Before you run a regression, you usually want to visualize some statistics, such as the distribution of the predictors (exposures), the range of values, etc. I find it is sometimes easier to have the visual summary together with the verbose summary/structure of the data.
Really like your idea. For a large data frame, some summary plots will be necessary. It can also be useful for data preprocessing before fitting any model, such as exploring the data distribution and roughly identifying outliers or bad data.
The purpose of this module would be to summarize all columns according to a specified function of interest with respect to a class label.
The function returns the output of the specified function (e.g. fivenum, summary) for all columns, grouped by the status label, and adds corresponding plots for the columns (e.g. a bar plot).
It should help in viewing the summarized results and the plots side by side.
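To make the grouping step concrete, here is a minimal sketch of the idea in Python, not the module itself. The helper name `summarize_by_label`, the toy rows, and the column names are all hypothetical; `fivenum` is a standard-library reimplementation of Tukey's five-number summary, written under the assumption that it should follow the same hinge rule as R's `fivenum`. Plotting is left out.

```python
import math
from collections import defaultdict


def fivenum(values):
    """Tukey's five-number summary: min, lower hinge, median, upper hinge, max."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("fivenum needs at least one value")
    n4 = math.floor((n + 3) / 2) / 2
    # 1-based positions; a position ending in .5 averages its two neighbours.
    positions = [1, n4, (n + 1) / 2, n + 1 - n4, n]
    return [0.5 * (x[math.floor(p) - 1] + x[math.ceil(p) - 1]) for p in positions]


def summarize_by_label(rows, label, func=fivenum):
    """Apply func to every non-label column, separately for each label value.

    Hypothetical helper: rows is a list of dicts, label is the class column.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for column, value in row.items():
            if column != label:
                groups[row[label]][column].append(value)
    return {
        group: {column: func(vals) for column, vals in columns.items()}
        for group, columns in groups.items()
    }


# Made-up toy data; "status" plays the role of the class label.
rows = [
    {"status": "case", "age": 61, "bmi": 29.5},
    {"status": "case", "age": 54, "bmi": 31.0},
    {"status": "case", "age": 70, "bmi": 27.2},
    {"status": "control", "age": 45, "bmi": 24.1},
    {"status": "control", "age": 52, "bmi": 22.8},
]

summary = summarize_by_label(rows, "status")
for group, columns in summary.items():
    for column, stats in columns.items():
        print(group, column, stats)
```

Passing a different callable, for example `summarize_by_label(rows, "status", func=max)`, swaps the summary without touching the grouping, which is the "users can choose" behaviour suggested earlier in the thread.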