Closed tmbrye closed 2 years ago
@tmbrye The results are indeed correct. If you open the attached model, you can see the definitions of the output fields:
<Output>
<OutputField dataType="string" feature="decision" name="prediction" optype="categorical" value="">
<Apply function="if">
<Apply function="greaterThan">
<FieldRef field="prediction_1"/>
<Constant dataType="double">0.35</Constant>
</Apply>
<Constant dataType="string">1</Constant>
<Constant dataType="string">0</Constant>
</Apply>
</OutputField>
<OutputField dataType="double" feature="probability" name="proba_0" optype="continuous" value="0"/>
<OutputField dataType="double" feature="probability" name="proba_1" optype="continuous" value="1"/>
</Output>
The prediction threshold is 0.35, meaning that if the probability of class 1 is greater than 0.35, the final prediction is 1. So the predictions for records 0, 3, and 8 are in fact correct.
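In other words, the final label comes from comparing proba_1 against the 0.35 threshold, not from taking the argmax of the two probabilities. A minimal Python sketch of the decision rule encoded by the `if`/`greaterThan` OutputField above (function name and example values are illustrative, not taken from the model):

```python
def predict_label(proba_1, threshold=0.35):
    """Mirror the PMML OutputField: "1" if proba_1 > threshold, else "0"."""
    return "1" if proba_1 > threshold else "0"

# With proba_1 = 0.40, proba_0 = 0.60 is the larger probability,
# yet the label is still "1" because 0.40 > 0.35.
print(predict_label(0.40))  # 1
print(predict_label(0.30))  # 0
```

This is why a record can be labeled 1 even when proba_0 is higher: the threshold rule only looks at proba_1.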
Thanks so much for your quick response! This model was passed along to me and I definitely didn't dive in as far as I should have to notice that. Appreciate your time.
When scoring a model, I am seeing an inconsistency in the predicted results. When running the score, the prediction column states it is predicting a result of 1, yet the probability of 0 is higher than the probability of 1 in several cases in the data. The Python code I used to test was as follows:
The model is rather large. Here is the output of the code above:
Note that records 0, 3, and 8 are incorrect.
random_forest.pmml.zip