Background context
We added a way to calculate the statistical errors of the cumulative effect and cumulative gain curves, following the example presented in Causal Inference for the Brave and True.
Description of the changes proposed in the pull request
We add a new type called `error_fn`, intended as a general class of statistical error functions, and implement one function of this kind: `linear_standard_error`. We use it to build a curve function, `cumulative_statistical_error_curve`, analogous to `cumulative_gain_curve` and `cumulative_effect_curve`, that calculates the error (given by the `error_fn`) between a treatment and an outcome over incrementally larger slices of an ordered dataframe. Finally, we modify the `effect_curves` function to accept an optional parameter for calculating the error of the cumulative gain curve and the cumulative effect curve. These error columns are intended to be used to generate confidence intervals for these curves.
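To make the idea concrete for reviewers, here is a minimal sketch of the technique, not the code in this pull request. It assumes that the "linear" error is the usual standard error of the slope in a simple linear regression of the outcome on the treatment. The names `ErrorFn`, `linear_standard_error_sketch` and `cumulative_error_curve_sketch`, the `min_rows` and `steps` parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Callable

import numpy as np
import pandas as pd

# Hypothetical alias: any function mapping (df, treatment, outcome) to an error.
ErrorFn = Callable[[pd.DataFrame, str, str], float]


def linear_standard_error_sketch(df: pd.DataFrame, treatment: str, outcome: str) -> float:
    """Standard error of the slope of a simple linear regression of outcome on treatment."""
    t = df[treatment].to_numpy(dtype=float)
    y = df[outcome].to_numpy(dtype=float)
    n = len(t)
    t_centered = t - t.mean()
    sxx = np.sum(t_centered ** 2)
    slope = np.sum(t_centered * (y - y.mean())) / sxx
    intercept = y.mean() - slope * t.mean()
    residuals = y - (intercept + slope * t)
    # Residual variance with n - 2 degrees of freedom, divided by the treatment spread.
    return float(np.sqrt(np.sum(residuals ** 2) / (n - 2) / sxx))


def cumulative_error_curve_sketch(
    df: pd.DataFrame,
    treatment: str,
    outcome: str,
    prediction: str,
    min_rows: int = 30,
    steps: int = 100,
    error_fn: ErrorFn = linear_standard_error_sketch,
) -> np.ndarray:
    """Apply error_fn to growing top-k slices of df ordered by prediction (descending)."""
    size = df.shape[0]
    ordered = df.sort_values(prediction, ascending=False).reset_index(drop=True)
    step = max(1, size // steps)  # guard against a zero step on small dataframes
    n_rows = list(range(min_rows, size, step)) + [size]
    return np.array([error_fn(ordered.head(rows), treatment, outcome) for rows in n_rows])


# Toy usage on synthetic data (assumed, not taken from the pull request).
rng = np.random.default_rng(0)
toy = pd.DataFrame({"t": rng.normal(size=200), "score": rng.uniform(size=200)})
toy["y"] = 2.0 * toy["t"] + rng.normal(size=200)
errors = cumulative_error_curve_sketch(toy, "t", "y", "score", min_rows=30, steps=10)
```

Under a normal approximation, a confidence interval at each point of a curve could then be formed as the curve value plus or minus 1.96 times the corresponding error. Passing the error function as a parameter is what would let other kinds of relationships plug in later without changing the curve logic.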
Where should the reviewer start?
We suggest starting with the `causal/validation/curves.py` file, then checking the `causal/statistical_errors.py` file.
Remaining problems or questions
We only wrote a function for linear relationships, but we believe the approach is general enough to be extended to other kinds of relationships.
Status
READY
Todo list