Closed camerondavison closed 8 years ago
Hmm, according to http://dmg.org/pmml/v4-2-1/TreeModel.html I thought that the PMML document needed to have a target field in it, but it looks like jpmml-evaluator works without one. I found another bug in my code that accounted for the exception above.
According to the PMML specification, "The definition of target fields is not required since they do not have an impact on scoring results. For supervised models, however, the definition of target fields is often useful for documentation purposes".
Currently, in segmentation models (e.g. Bagging, ExtraTrees, RandomForest), the top-level MiningModel element defines a target field, whereas the member TreeModel elements don't. One way to look at it is that the scores of member TreeModel elements exist only in "local scope", so there's no point in naming them.
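To illustrate the "local scope" idea, here is a minimal Python sketch (not JPMML code). Members return bare, unnamed scores, and only the top-level ensemble attaches a target field name to the aggregated result. The names `evaluate_ensemble` and `majority_vote` are hypothetical, and the choice to return a null result as soon as any member returns null is an assumption of this sketch, not a statement about how jpmml-evaluator behaves.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Row = Dict[str, float]
# A member model returns a bare score; it has no target field name of its own.
Member = Callable[[Row], Optional[str]]


def majority_vote(scores: List[str]) -> str:
    """Return the most frequent score among the member scores."""
    return Counter(scores).most_common(1)[0][0]


def evaluate_ensemble(members: List[Member], row: Row, target_name: str) -> Dict[str, Optional[str]]:
    """Aggregate anonymous member scores and name the result at the top level only."""
    scores = [member(row) for member in members]
    # Assumption of this sketch: a single null member score makes the whole result null.
    if any(score is None for score in scores):
        return {target_name: None}
    return {target_name: majority_vote(scores)}


members: List[Member] = [
    lambda row: "a" if row["x"] <= 2.5 else "b",
    lambda row: "a" if row["x"] <= 3.5 else "b",
    lambda row: "b",
]

print(evaluate_ensemble(members, {"x": 1.0}, "y"))
print(evaluate_ensemble(members + [lambda row: None], {"x": 1.0}, "y"))
```

The target name "y" appears only in the dictionary built by the ensemble; the member functions never see it.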
I'm working on extending the JPMML-SkLearn library so that it would be possible to assign names to all target fields, regardless of their position in the hierarchy. For example, this is needed for building ensembles of ensemble models (e.g. the VotingClassifier model type).
Also, let me guess: you got a null result because you were accidentally feeding a null active field value to the evaluator? The default missingValueStrategy of the TreeModel element is none, which means that comparisons against a missing value evaluate to false. When no child node matches, the default noTrueChildStrategy of returnNullPrediction applies.
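The mechanism can be sketched in a few lines of Python. This is a simplified stand-in, not the jpmml-evaluator implementation: the `Node` class and the helper names are hypothetical. A comparison against a missing value is treated as false (the none strategy), so no child matches and the traversal returns a null prediction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]


@dataclass
class Node:
    predicate: Callable[[Row], bool]
    score: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def less_or_equal(name: str, threshold: float) -> Callable[[Row], bool]:
    """Build a predicate that is false when the field value is missing."""
    def test(row: Row) -> bool:
        value = row.get(name)
        return value is not None and value <= threshold
    return test


def greater_than(name: str, threshold: float) -> Callable[[Row], bool]:
    """Build a predicate that is false when the field value is missing."""
    def test(row: Row) -> bool:
        value = row.get(name)
        return value is not None and value > threshold
    return test


def evaluate(node: Node, row: Row) -> Optional[str]:
    """Descend the tree; return None as soon as no child predicate is true."""
    while node.children:
        match = next((child for child in node.children if child.predicate(row)), None)
        if match is None:
            return None  # mirrors returnNullPrediction
        node = match
    return node.score


root = Node(lambda row: True, children=[
    Node(less_or_equal("x", 2.5), score="a"),
    Node(greater_than("x", 2.5), score="b"),
])

print(evaluate(root, {"x": 1.0}))
print(evaluate(root, {"x": None}))
```

With a non-missing `x` exactly one of the two children matches; with a missing `x` neither does, which is how a null input turns into a null prediction.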
I will have to check what Scikit-Learn's policy is here. Probably, the evaluation should fail with an exception instead.
Thanks for the descriptive response. Yes, I was passing a null value for a LABEL field, which triggered the noTrueChildStrategy and then the null prediction.
getting
Since aggregateValues is getting back a null result from
which to me seems to imply that the trees are not returning target values.
It looks like some of the recent refactoring in https://github.com/jpmml/jpmml-sklearn/commit/27858a1e9794c8bbc976047749dc85281057c112#diff-d4ca34d7102c57121516753b9faf5e41 may have something to do with it. There, the standalone variable is used to set the target field only when it is true, whereas https://github.com/jpmml/jpmml-sklearn/commit/27858a1e9794c8bbc976047749dc85281057c112#diff-b6e00c7675e0a9b5c3c0432ddf12c47eL126 was always setting the target field no matter what the standalone variable said. This is just from glancing through the code.