PythonPredictions / cobra

A Python package to build predictive linear and logistic regression models focused on performance and interpretation
https://pythonpredictions.github.io/cobra.io
MIT License
30 stars, 6 forks

Add standard deviation to PIG plots in regression case #182

Open joostneuj opened 1 year ago

joostneuj commented 1 year ago


Task: Adding standard deviation in each bin to PIG plots.

Task Description

With minor modifications to the code, it would be possible to include standard deviations in PIG plots. In the regression case, it can be useful to visually inspect the variance within each bin. I would suggest not plotting it by default, but making it available through a dedicated argument for users who want it. The output could then look like this:

(image: std_dev_example)

In the function compute_pig_table, the aggregation should be extended to also calculate the standard deviation, as follows:

    res = (
        basetable.groupby(predictor_column_name)
        .agg(
            avg_target=(target_column_name, "mean"),
            pop_size=(target_column_name, "size"),
            std_dev_target=(target_column_name, "std"),
        )
        .reset_index()
        .rename(columns={predictor_column_name: "label"})
    )
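To make the effect of the extended aggregation concrete, here is a minimal, self-contained sketch on toy data. The helper name compute_pig_table_with_std, the column names and the values are all made up for illustration; this is not cobra's actual compute_pig_table, only the aggregation proposed above wrapped in a function.

```python
import pandas as pd


def compute_pig_table_with_std(basetable, predictor_column_name, target_column_name):
    """Hypothetical helper (not cobra's API): per-bin mean, size and std of the target."""
    return (
        basetable.groupby(predictor_column_name)
        .agg(
            avg_target=(target_column_name, "mean"),
            pop_size=(target_column_name, "size"),
            # pandas "std" is the sample standard deviation (ddof=1)
            std_dev_target=(target_column_name, "std"),
        )
        .reset_index()
        .rename(columns={predictor_column_name: "label"})
    )


# Toy basetable: one binned predictor and a continuous target (invented values)
toy = pd.DataFrame(
    {
        "age_bin": ["18-30", "18-30", "30-50", "30-50", "30-50", "50+"],
        "target": [10.0, 14.0, 20.0, 22.0, 24.0, 30.0],
    }
)

pig = compute_pig_table_with_std(toy, "age_bin", "target")
print(pig)
```

One thing to keep in mind: because pandas computes the sample standard deviation, a bin containing a single observation (the "50+" bin here) gets NaN for std_dev_target, so the plotting code would need to decide how to handle that.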

And in the function plot_incidence, you can use ax.errorbar instead of ax.plot, with yerr set to half the previously calculated standard deviation, so that the full error bar spans one standard deviation.
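The plotting side could look roughly like the sketch below. It is not cobra's plot_incidence: the function name plot_avg_target, the show_std argument and the toy table are assumptions made for illustration, and replacing NaN standard deviations with 0 is just one possible choice.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd


def plot_avg_target(pig_table, show_std=False):
    """Hypothetical sketch: average target per bin, with optional std error bars."""
    fig, ax = plt.subplots()
    x = list(range(len(pig_table)))
    y = pig_table["avg_target"].to_numpy()
    if show_std:
        # Half the std on each side, so the whole bar spans one std.
        # Assumption: bins without a defined std (NaN) get no error bar.
        yerr = (pig_table["std_dev_target"].fillna(0) / 2).to_numpy()
        ax.errorbar(x, y, yerr=yerr, marker="o", capsize=4)
    else:
        ax.plot(x, y, marker="o")
    ax.set_xticks(x)
    ax.set_xticklabels(pig_table["label"])
    ax.set_ylabel("average target")
    return fig, ax


# Toy PIG table with invented values
pig_table = pd.DataFrame(
    {
        "label": ["18-30", "30-50", "50+"],
        "avg_target": [12.0, 22.0, 30.0],
        "pop_size": [2, 3, 1],
        "std_dev_target": [2.83, 2.0, float("nan")],
    }
)

fig_std, ax_std = plot_avg_target(pig_table, show_std=True)
fig_plain, ax_plain = plot_avg_target(pig_table)  # default: no error bars
```

Keeping show_std off by default matches the suggestion above that the standard deviation should only be drawn when the user asks for it.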