nubank / fklearn

fklearn: Functional Machine Learning
Apache License 2.0

Add ascending parameter causal validation #220

Open MarianaBlaz opened 1 year ago

MarianaBlaz commented 1 year ago

Instructions

Status

READY

Todo list

Background context

In the curves file of the causal validation module, it would be useful to add an ascending parameter to the cumulative effect and cumulative gain curves.

Currently, predictions are always sorted in descending order:

ordered_df = df.sort_values(prediction, ascending=False).reset_index(drop=True)

If we add an ascending: bool = False argument to cumulative_effect_curve, cumulative_gain_curve, relative_cumulative_gain_curve, and effect_curves, a user could choose whether these effects are computed in ascending or descending order of the prediction column.
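To make the proposal concrete, here is a minimal sketch of a cumulative effect curve that accepts the flag. It is not fklearn's implementation: the names cumulative_effect_curve_sketch and linear_effect_sketch are hypothetical, and the effect is assumed to be the slope of a simple linear regression of the outcome on the treatment.

```python
import numpy as np
import pandas as pd


def linear_effect_sketch(df, treatment, outcome):
    # Hypothetical helper: OLS slope of outcome on treatment, cov(T, Y) / var(T).
    t = df[treatment].to_numpy(dtype=float)
    y = df[outcome].to_numpy(dtype=float)
    t_centered = t - t.mean()
    return float(np.sum(t_centered * (y - y.mean())) / np.sum(t_centered ** 2))


def cumulative_effect_curve_sketch(df, treatment, outcome, prediction,
                                   min_rows=30, steps=100, ascending=False):
    # The only change relative to a descending-only version is that the
    # sort direction is taken from the `ascending` argument.
    size = df.shape[0]
    ordered_df = df.sort_values(prediction, ascending=ascending).reset_index(drop=True)
    step = max(1, size // steps)  # guard against a zero step on small frames
    n_rows = list(range(min_rows, size, step)) + [size]
    return np.array([linear_effect_sketch(ordered_df.head(rows), treatment, outcome)
                     for rows in n_rows])


# Toy data (assumed): rows with score 1.0 have a true effect of 3,
# rows with score 0.0 have a true effect of 1.
df = pd.DataFrame({"t": [0, 1] * 50, "score": [1.0] * 50 + [0.0] * 50})
df["y"] = np.where(df["score"] == 1.0, 3.0, 1.0) * df["t"]

desc = cumulative_effect_curve_sketch(df, "t", "y", "score", min_rows=50, steps=2)
asc = cumulative_effect_curve_sketch(df, "t", "y", "score", min_rows=50, steps=2,
                                     ascending=True)
```

With the default ascending=False the curve starts from the highest-scored rows; with ascending=True it starts from the lowest-scored rows. Both curves end at the same full-sample effect, since the last point always uses every row.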

Description of the changes proposed in the pull request

A model's prediction is not necessarily positively related to the effect being computed. An option to reverse the ordering allows effects and gains to be computed adequately when predictions and outcomes are negatively related.
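The following toy example illustrates the case described above, under stated assumptions: the score is constructed so that a higher value means a smaller treatment effect, and the gain is taken to be the cumulative effect scaled by the fraction of rows used. The helpers effect_slope and cumulative_gain_sketch are hypothetical and are not part of fklearn.

```python
import numpy as np
import pandas as pd


def effect_slope(df, treatment, outcome):
    # Hypothetical helper: OLS slope of outcome on treatment.
    t = df[treatment].to_numpy(dtype=float)
    y = df[outcome].to_numpy(dtype=float)
    tc = t - t.mean()
    return float(np.sum(tc * (y - y.mean())) / np.sum(tc ** 2))


def cumulative_gain_sketch(df, treatment, outcome, prediction,
                           n_points=4, ascending=False):
    # Cumulative effect on the top-n rows, scaled by the share of rows used.
    ordered = df.sort_values(prediction, ascending=ascending).reset_index(drop=True)
    size = len(ordered)
    cuts = [size * k // n_points for k in range(1, n_points + 1)]
    return np.array([effect_slope(ordered.head(n), treatment, outcome) * n / size
                     for n in cuts])


# Toy data (assumed): the true effect of t on y is 5 - score,
# so the score is negatively related to the effect.
df = pd.DataFrame({"t": [0, 1] * 40, "score": np.repeat([1.0, 2.0, 3.0, 4.0], 20)})
df["y"] = (5.0 - df["score"]) * df["t"]

gain_desc = cumulative_gain_sketch(df, "t", "y", "score")
gain_asc = cumulative_gain_sketch(df, "t", "y", "score", ascending=True)

# Reference line for a random ordering: overall effect times share of rows.
random_line = effect_slope(df, "t", "y") * np.arange(1, 5) / 4
```

In this setup the descending gain curve sits on or below the random line, while the ascending curve sits on or above it. This is the situation in which a user would want to pass ascending=True.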

The changes are applied to curves.py and auc.py in the causal-effect module.

Where should the reviewer start?

Start with causal-effect/curves, since it defines the functions from which all ordering behavior is propagated.