Open steprandelli opened 3 years ago
Same question. I also don't understand how the perfect uplift is computed in perfect_uplift_curve; it isn't described anywhere.
@steprandelli @Irek21 Thanks for your question!
Recall that in the classical uplift problem we are dealing with two binary vectors: target, the value of the target variable, and treatment, the value of the influence (communication in marketing, treatment in medicine, etc.).
Thus, there are only 4 distinct classes (pairs) that we need to order correctly: (1, 1), (0, 0), (0, 1), (1, 0).
To understand what an ideal curve should look like, you need to work out in what order these 4 classes should be arranged. Reordering observations within a single class does not change the value of the curve.
Let's define the ideal curve as the curve with the maximum area under it. The question then becomes how to rank the 4 classes so that the area under the curve is maximal.
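The ranking question can be explored by brute force, since there are only 24 orderings of 4 classes. The following is a minimal sketch, not the library's implementation. It assumes the pairs are written as (target, treatment), uses a simplified curve (treated response rate minus control response rate, times the number of observations seen so far) evaluated only at class boundaries, and uses made-up class counts. The names curve_area and counts are hypothetical.

```python
from itertools import permutations

# The four classes, assumed here to be (target, treatment) pairs.
CLASSES = [(1, 1), (0, 0), (0, 1), (1, 0)]


def curve_area(order, counts):
    """Area under a simplified uplift curve for one ranking of the classes.

    order  -- the classes in ranking order, best first
    counts -- dict mapping each class to its number of observations
    """
    n_trmnt = n_ctrl = y_trmnt = y_ctrl = 0
    xs, ys = [0.0], [0.0]
    for target, treatment in order:
        n = counts[(target, treatment)]
        if treatment == 1:
            n_trmnt += n
            y_trmnt += n * target
        else:
            n_ctrl += n
            y_ctrl += n * target
        # Response rates; taken as 0 while a group is still empty (assumption).
        rate_trmnt = y_trmnt / n_trmnt if n_trmnt else 0.0
        rate_ctrl = y_ctrl / n_ctrl if n_ctrl else 0.0
        xs.append(float(n_trmnt + n_ctrl))
        ys.append((rate_trmnt - rate_ctrl) * (n_trmnt + n_ctrl))
    # Trapezoidal rule over the class-boundary points.
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
        for i in range(len(xs) - 1)
    )


# Made-up class sizes, for illustration only.
counts = {(1, 1): 30, (0, 0): 30, (0, 1): 20, (1, 0): 20}

# Try every ranking of the four classes and keep the one with the largest area.
best = max(permutations(CLASSES), key=lambda order: curve_area(order, counts))
print(best, curve_area(best, counts))
```

Changing the values in counts shows how the best ordering can depend on the relative sizes of the classes.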
In the code, you can find an implementation of how these classes should be sorted. I hope someday we will add a section about metrics, in which there will be material about ideal curves.
If you write up the proof of the correct ordering of these classes in more detail, we will be happy to add it to the user guide.
Many thanks to @kirrlix1994 for consultations on the metrics issues.
💡 Feature request
Hi! Perfect uplift is required to compute both the perfect uplift curve and the perfect Qini curve. Why do the two use different formulas to generate the perfect uplift? Would it make sense to unify them?
perfect uplift curve
perfect qini curve
perfect_uplift = y_true * treatment - y_true * (1 - treatment)
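To see what the formula on the last line does, it can be evaluated on one observation from each of the four classes. This is a small sketch with made-up arrays; only the perfect_uplift expression itself comes from the issue text.

```python
import numpy as np

# One made-up observation per (y_true, treatment) class:
# (1, 1), (0, 1), (0, 0), (1, 0)
y_true = np.array([1, 0, 0, 1])
treatment = np.array([1, 1, 0, 0])

# Formula quoted in the issue.
perfect_uplift = y_true * treatment - y_true * (1 - treatment)

print(perfect_uplift.tolist())  # [1, 0, 0, -1]
```

So this score ranks treated responders first and control responders last, while both non-responder classes tie at 0.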