We were getting strange results for pctCorrectOfPredictPair with logistic regression, and I finally traced it to a conceptual confusion. Given how we use the output of predictPair, it should output either 0 or 1, NOT, as the documentation states, the probability that row1 > row2. At the very least I should update the documentation, but the issue raises deeper questions. We may sometimes want to see the actual estimated probability; should we create another function to access that? (And can we even create an implementation for heuristics like TTB?)
Another way to explain this: if you believe the probability of a coin coming up heads is 60%, then you maximize your accuracy by ALWAYS guessing heads. But when pctCorrectOfPredictPair was using the output of logRegModel, it was doing the equivalent of guessing heads 60% of the time and tails 40% of the time ("probability matching"). I need to make the difference between those two kinds of output clearer in the code.
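The gap between the two strategies is easy to demonstrate with a quick simulation. This is a standalone Python sketch of the coin analogy, not code from the package; the variable names and the 60% figure are just illustrative:

```python
import random

random.seed(0)
P_HEADS = 0.6      # believed probability of heads (outcome 1)
N = 100_000        # number of simulated flips

# Simulate outcomes of a biased coin.
outcomes = [1 if random.random() < P_HEADS else 0 for _ in range(N)]

# Strategy 1 (maximizing): always guess the more likely outcome, heads.
# Accuracy converges to P_HEADS = 0.60.
maximize_acc = sum(o == 1 for o in outcomes) / N

# Strategy 2 (probability matching): guess heads 60% of the time.
# Accuracy converges to 0.6*0.6 + 0.4*0.4 = 0.52.
guesses = [1 if random.random() < P_HEADS else 0 for _ in range(N)]
matching_acc = sum(g == o for g, o in zip(guesses, outcomes)) / N

print(f"maximizing accuracy: {maximize_acc:.3f}")
print(f"matching accuracy:   {matching_acc:.3f}")
```

So feeding the raw probability into a function that expects a hard 0/1 prediction silently turns the maximizing strategy into the strictly worse matching one, which is exactly the anomaly we saw.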