Closed by vruusmann 6 years ago
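Two possible solutions:
1. Relax the sanity check in JPMML-Evaluator, so that 0.999999999999999 is considered "close enough" to 1.0.
2. Correct the PMML document, so that the probabilities sum exactly to 1.0. This could be easily implemented using the Visitor API from the JPMML-Model library.
Of course, the real source of the problem is bad PMML producer software.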
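The "close enough" idea could be sketched roughly as follows. This is only an illustration: the class UlpTolerance, the method isCloseToOne and the maxUlps threshold are hypothetical names and are not part of JPMML-Evaluator. The only library call it relies on is Math.ulp from the JDK.

```java
// Hypothetical helper for illustration; not part of JPMML-Evaluator.
final class UlpTolerance {

    // Returns true if the sum differs from 1.0 by at most maxUlps,
    // where one ULP is measured at 1.0 (Math.ulp(1d)).
    static boolean isCloseToOne(double sum, int maxUlps) {
        double deltaInUlps = Math.abs(1d - sum) / Math.ulp(1d);
        // A NaN sum yields a NaN delta, which fails this comparison
        return deltaInUlps <= maxUlps;
    }

    public static void main(String[] args) {
        // Probability values taken from the failing Node element
        double sum = 0.254716981132075 + 0.745283018867924;
        System.out.println("tolerance of 4 ULPs: " + isCloseToOne(sum, 4));
        System.out.println("tolerance of 0 ULPs: " + isCloseToOne(sum, 0));
    }
}
```

A threshold of 0 corresponds to the strict check; how many ULPs should count as "close enough" is a judgment call that this sketch does not settle.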
Here's a small command-line application to compute the size of "delta" in terms of ULPs:
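public class Main {

    public static void main(String[] args){
        // The two probability values are passed as command-line arguments
        double left = Double.parseDouble(args[0]);
        double right = Double.parseDouble(args[1]);
        double sum = left + right;
        System.out.println("sum: " + sum);
        // Distance of the sum from the expected value 1.0
        double delta = 1d - sum;
        System.out.println("delta: " + delta);
        // The same distance, expressed in units of Math.ulp(1d)
        System.out.println("delta in ULPs: " + (delta / Math.ulp(1d)));
    }
}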
The first application run shows that there's currently a delta of 4 ULPs:
$ java Main 0.254716981132075 0.745283018867924
sum: 0.9999999999999991
delta: 8.881784197001252E-16
delta in ULPs: 4.0
The second application run shows that if these probability values were represented with an extra decimal place (e.g. by appending 5 to both number literals), then the delta would be 0 ULP (and there would be no scoring problem):
$ java Main 0.2547169811320755 0.7452830188679245
sum: 1.0
delta: 0.0
delta in ULPs: 0.0
In conclusion, the PMML producer software has been emitting "imprecise" probability values.
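On the producer side, one way to avoid this kind of precision loss might look like the sketch below. It assumes that the producer holds probabilities as Java doubles and cuts them to 15 decimal places when writing; that is a guess based on the values above, not something confirmed here. The class ProbabilityFormat and its methods are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper for illustration; not taken from any PMML producer.
final class ProbabilityFormat {

    // Lossy: keeps only 15 decimal places (assumed producer behaviour)
    static String truncated(double value) {
        return new BigDecimal(value).setScale(15, RoundingMode.DOWN).toPlainString();
    }

    // Lossless: Double.toString emits enough digits to identify the double uniquely
    static String roundTrip(double value) {
        return Double.toString(value);
    }

    public static void main(String[] args) {
        double p = 1d / 3d;
        // Parsing the truncated literal yields a different double than p
        System.out.println(Double.parseDouble(truncated(p)) == p);
        // Parsing the round-trip literal yields exactly p
        System.out.println(Double.parseDouble(roundTrip(p)) == p);
    }
}
```

Writing round-trip-safe literals preserves whatever the producer computed. It does not by itself guarantee that the computed probabilities sum to exactly 1.0.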
The JPMML-Evaluator library keeps introducing more sanity checks. As a result, newer versions of JPMML-Evaluator may refuse to score PMML documents (by throwing an InvalidFeatureException) that were gladly accepted/tolerated by older versions.
For example, JPMML-Evaluator version 1.3.7 (and newer) require that for classification-type tree models, the values of the ScoreDistribution@probability attribute must sum exactly to 1.0 for each Node element.
The following Node element is considered to be invalid, because the sum of probabilities is 0.999999999999999, not 1.0:
The "offending" sanity check: https://github.com/jpmml/jpmml-evaluator/blob/master/pmml-evaluator/src/main/java/org/jpmml/evaluator/tree/TreeModelEvaluator.java#L361-L364
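For readers who do not want to follow the link, a strict check of this kind might look roughly like the sketch below. It is not the actual JPMML-Evaluator code: the class ExactSumCheck and its methods are hypothetical, and IllegalArgumentException stands in for InvalidFeatureException so that the example needs only the JDK.

```java
// Hypothetical stand-in for illustration; not the code behind the link above.
final class ExactSumCheck {

    // Adds up the probabilities in declaration order
    static double sum(double[] probabilities) {
        double sum = 0d;
        for (double probability : probabilities) {
            sum += probability;
        }
        return sum;
    }

    // Strict comparison: any rounding error, however small, fails the check
    static boolean sumsToOne(double[] probabilities) {
        return sum(probabilities) == 1d;
    }

    // Rejects the input the way a strict evaluator would reject a Node element
    static void check(double[] probabilities) {
        if (!sumsToOne(probabilities)) {
            throw new IllegalArgumentException(
                "Probabilities sum to " + sum(probabilities) + ", not 1.0");
        }
    }

    public static void main(String[] args) {
        check(new double[] {0.5, 0.5}); // passes silently
        check(new double[] {0.254716981132075, 0.745283018867924}); // expected to throw
    }
}
```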