how to evaluate the model

wayfair / pylift

Uplift modeling package.

http://pylift.readthedocs.io

BSD 2-Clause "Simplified" License


how to evaluate the model #33

Open lprzqxsnoopy opened 4 years ago

lprzqxsnoopy commented 4 years ago

Hello, I have two questions and hope you can help. 1) Once we have an uplift model and every customer has a score, how do we compare it with a response model? 2) AUC is a common evaluation metric: when AUC reaches 0.8, we generally consider the model to work well. Is there a comparable Qini value above which an uplift model can be considered good?

rsyi commented 4 years ago

You can use the UpliftEval class to evaluate the cumulative gains curve for any ranking. In this case, simply use your response model to predict on a test set, and instantiate UpliftEval with those predictions.
There are three qini values that we use, depending on the normalization. They are outlined here: https://pylift.readthedocs.io/en/latest/evaluation.html There are no set rules for which qini values correspond to good models, as this depends heavily on the composition of your dataset. If there are lots of sleeping dogs, for example, you can get a much higher qini score. In the normalized cases (q1 and q2), this corresponds to a value closer to 1. You should instead aim for a cumulative gains curve that is sufficiently far from the random targeting line.
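To make the comparison concrete, here is a minimal NumPy-only sketch of scoring two rankings by their cumulative gains curve. It does not call pylift: the gain definition, the helper names `cumulative_gain` and `area_above_random`, and the simulated data are all assumptions made for illustration, and the "response score" is a stand-in that only captures baseline propensity.

```python
import numpy as np

def cumulative_gain(treatment, outcome, score, n_bins=10):
    """Return (fractions, gains) when targeting the highest-scored customers first.

    Assumed definition: gain(phi) = (treated rate - control rate in the
    top phi fraction) * phi. Other normalizations exist.
    """
    order = np.argsort(-score)              # highest score first
    t, y = treatment[order], outcome[order]
    n = len(score)
    fractions = np.linspace(0, 1, n_bins + 1)
    gains = [0.0]
    for phi in fractions[1:]:
        k = int(round(phi * n))
        rate_t = y[:k][t[:k] == 1].mean()   # response rate among treated
        rate_c = y[:k][t[:k] == 0].mean()   # response rate among control
        gains.append((rate_t - rate_c) * phi)
    return fractions, np.array(gains)

def area_above_random(fractions, gains):
    """Trapezoidal area between the curve and the random-targeting line."""
    diff = gains - gains[-1] * fractions
    return float(np.sum((diff[1:] + diff[:-1]) / 2 * np.diff(fractions)))

# Simulated test set (hypothetical): x1 drives baseline response, x2 drives uplift.
rng = np.random.default_rng(0)
n = 20_000
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)
treatment = rng.integers(0, 2, size=n)
true_uplift = 0.3 * x2 - 0.1                # negative values act like sleeping dogs
p_outcome = 0.1 + 0.3 * x1 + treatment * true_uplift
outcome = (rng.uniform(size=n) < p_outcome).astype(int)

response_score = 0.1 + 0.3 * x1             # stand-in for a response model
uplift_score = true_uplift                  # stand-in for an ideal uplift model

area_response = area_above_random(*cumulative_gain(treatment, outcome, response_score))
area_uplift = area_above_random(*cumulative_gain(treatment, outcome, uplift_score))
print(f"response ranking: {area_response:.4f}, uplift ranking: {area_uplift:.4f}")
```

The ranking whose curve sits further above the random-targeting line is the better one for uplift purposes, which is the same comparison the reply suggests doing with UpliftEval.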

lprzqxsnoopy commented 4 years ago

Thanks for answering. I have another question: if the qini on the training set is very close to the qini on the test set, does that indicate the model is stable?

lprzqxsnoopy commented 4 years ago

Another question: can I replace the model with LGBM?

shaddyab commented 4 years ago

Another question: can I replace the model with LGBM?

I think you should be able to, as long as it has an API similar to sklearn's. You can specify your model using the 'sklearn_model' argument. See https://github.com/wayfair/pylift/blob/master/examples/simulated_data/sample.ipynb
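As a rough illustration of why an sklearn-style API is what matters, the sketch below fits a transformed-outcome regression with a pluggable regressor. It does not use pylift or LightGBM: `LeastSquaresRegressor` and `fit_uplift` are hypothetical names, and the tiny least-squares class only stands in for any object exposing `fit` and `predict` (which is the interface `lightgbm.LGBMRegressor` also provides).

```python
import numpy as np

class LeastSquaresRegressor:
    """Tiny stand-in exposing the sklearn-style fit/predict interface."""

    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), X])   # add intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        A = np.column_stack([np.ones(len(X)), X])
        return A @ self.coef_

def fit_uplift(model, X, treatment, outcome, p_treat):
    """Fit any fit/predict regressor on the transformed outcome.

    y* = y * (w - p) / (p * (1 - p)), whose conditional mean equals the
    uplift when treatment is assigned at random with probability p.
    """
    y_star = outcome * (treatment - p_treat) / (p_treat * (1 - p_treat))
    return model.fit(X, y_star)

# Simulated data (hypothetical): uplift depends only on the second feature.
rng = np.random.default_rng(1)
n = 50_000
X = rng.uniform(size=(n, 2))
treatment = rng.integers(0, 2, size=n)
true_uplift = 0.3 * X[:, 1] - 0.1
p_outcome = 0.1 + 0.3 * X[:, 0] + treatment * true_uplift
outcome = (rng.uniform(size=n) < p_outcome).astype(int)

# Swapping in another regressor only changes this one argument.
model = fit_uplift(LeastSquaresRegressor(), X, treatment, outcome, p_treat=0.5)
predicted_uplift = model.predict(X)
corr = np.corrcoef(predicted_uplift, true_uplift)[0, 1]
print(f"correlation with true uplift: {corr:.3f}")
```

Because the fitting code touches only `fit` and `predict`, the regressor can be exchanged without changing anything else in the pipeline.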

lprzqxsnoopy commented 4 years ago

I have an uplift model with two versions of the results. Model A: the train aqini is 0.4415 and the test aqini is 0.1675. Model B: the train aqini is 0.2003 and the test aqini is 0.2468. Can I regard model A as overfitting?
Is model B better than model A?

rsyi commented 4 years ago

Can I regard the model A as over-fitting?

Yes.
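For reference, the gap behind that answer can be worked out directly from the numbers quoted in the question. The helper name `generalization_gap` is made up for this sketch, and no threshold for "too large a gap" is implied.

```python
# aqini values as reported in the question above
results = {
    "A": {"train": 0.4415, "test": 0.1675},
    "B": {"train": 0.2003, "test": 0.2468},
}

def generalization_gap(train, test):
    """Return (absolute drop from train to test, test/train ratio)."""
    return train - test, test / train

gaps = {name: generalization_gap(**r) for name, r in results.items()}
for name, (gap, ratio) in gaps.items():
    print(f"Model {name}: train - test = {gap:+.4f}, test/train = {ratio:.2f}")
```

Model A keeps only about 38% of its training aqini on the test set (a drop of 0.2740), while model B scores slightly higher on test than on train (a difference of 0.0465 in the other direction).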

Is the model B better than model A?

Yes, probably, but you should check whether the cumulative gains curve is higher for model B at the proportion of people you want to target.
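A small sketch of that check: read both curves at the targeting proportion you care about rather than comparing a single summary score. The curve values below are invented for illustration and are not the poster's results; `gain_at` and `better_model_at` are hypothetical helper names.

```python
import numpy as np

# Hypothetical test-set cumulative gains curves, sampled at the same fractions.
fractions = np.linspace(0, 1, 6)            # 0.0, 0.2, ..., 1.0
gains_a = np.array([0.0, 0.030, 0.045, 0.050, 0.052, 0.050])
gains_b = np.array([0.0, 0.022, 0.048, 0.060, 0.058, 0.050])

def gain_at(fractions, gains, target):
    """Linearly interpolate a gains curve at the target fraction."""
    return float(np.interp(target, fractions, gains))

def better_model_at(target):
    """Name the model with the higher gain at the target fraction."""
    a = gain_at(fractions, gains_a, target)
    b = gain_at(fractions, gains_b, target)
    return "A" if a > b else "B"

for target in (0.1, 0.5):
    print(f"target top {target:.0%}: model {better_model_at(target)} is higher")
```

With these made-up curves, model A is higher when targeting the top 10% and model B is higher when targeting the top 50%, which is why the proportion you plan to target matters.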