Classifier: output proba instead of category

For the case of a single category (defaulted or not, bought or not), yes!

For the case of multiple categories (shopper type A, B, C, or D), you should be able to, but it will require a little bit of work.

To do this, go to pySetup/makePredictions.py and look for where we actually get the predictions (where we invoke classifier.predict_proba: currently lines 106 - 116). What you want to do is get the predicted probabilities during the validationRound only. If you get them during the non-validation round, it will break ensembler. Even limited to the validationRound, this hack will break ensembler's final assembly step, but it will still train up a bunch of ensembled models, whose results you can then read through.

So in the else case (problemType != 'category'), simply change classifier.predict to classifier.predict_proba.
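To make the change concrete, here is a minimal sketch of the idea rather than the actual contents of makePredictions.py. The helper name `multi_category_predictions` and the `StubClassifier` class are made up for illustration; the stub just stands in for any classifier with scikit-learn style `predict` and `predict_proba` methods.

```python
class StubClassifier:
    """Hypothetical stand-in for a trained scikit-learn style classifier."""
    classes_ = ['A', 'B', 'C', 'D']

    def predict_proba(self, X):
        # One row per sample, one probability per class (each row sums to 1).
        return [[0.1, 0.2, 0.6, 0.1] for _ in X]

    def predict(self, X):
        # The label with the highest probability in each row.
        return [self.classes_[row.index(max(row))]
                for row in self.predict_proba(X)]


def multi_category_predictions(classifier, X, validation_round):
    """Illustrative helper for the else case (problemType != 'category')."""
    if validation_round:
        # The hack: this call used to be classifier.predict(X).
        return classifier.predict_proba(X)
    # Leave the non-validation round alone so it still returns labels.
    return classifier.predict(X)


clf = StubClassifier()
rows = [[0], [1]]
print(multi_category_predictions(clf, rows, validation_round=True))
print(multi_category_predictions(clf, rows, validation_round=False))
```

With a real classifier, the columns of the `predict_proba` output follow the order of `classifier.classes_`, so that attribute is what maps each column back to a shopper type.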

Now our ensemble algorithms will make probability predictions.

Again, this means that when ensembler goes to assemble the predictions from all our ensemble algorithms, it will fail. But that's ok. Simply find which of those algorithms was most accurate, and use the results found in that algorithm's file.
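One rough way to pick the most accurate algorithm is to score each one's predicted probabilities against the known answers. The sketch below uses multiclass log loss as the score, which is my assumption and not something ensembler does for you. The algorithm names and numbers are made up, and it assumes you have already loaded each algorithm's results into memory (the file format is not covered here).

```python
import math

CLASSES = ['A', 'B', 'C', 'D']  # column order of the probability rows


def log_loss(y_true, proba, classes=CLASSES):
    """Mean negative log of the probability given to the true class."""
    total = 0.0
    for label, row in zip(y_true, proba):
        p = row[classes.index(label)]
        total -= math.log(max(p, 1e-15))  # clip to avoid log(0)
    return total / len(y_true)


# Hypothetical validation labels and per-algorithm probability predictions.
y_true = ['A', 'C', 'D']
predictions = {
    'algorithm_1': [[0.7, 0.1, 0.1, 0.1],
                    [0.2, 0.2, 0.5, 0.1],
                    [0.1, 0.1, 0.2, 0.6]],
    'algorithm_2': [[0.4, 0.3, 0.2, 0.1],
                    [0.3, 0.3, 0.3, 0.1],
                    [0.25, 0.25, 0.25, 0.25]],
}

losses = {name: log_loss(y_true, proba) for name, proba in predictions.items()}
best = min(losses, key=losses.get)  # lower log loss is better
print(best, losses)
```

Log loss rewards confident correct probabilities and punishes confident wrong ones, which is usually what you care about once you are asking for probabilities rather than labels. Plain accuracy on the argmax label would work too if that is the number you trust more.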

Again, this whole process is a bit of a hack for now. At some point I'll try to build this in as standard functionality in ensembler, but it will increase complexity substantially, and there are easier wins to focus on right now. If you're interested in taking a crack at building it in, I'd love that PR!

Thanks for using this, and filing issues. Let me know if you have any other questions!

ClimbsRocks / machineJS