jpmml / jpmml-transpiler

Java Transpiler (Translator + Compiler) API for PMML
GNU Affero General Public License v3.0

Invalid regression normalization method #14

Closed by vruusmann 2 years ago

vruusmann commented 3 years ago

Extracted from https://github.com/jpmml/jpmml-transpiler/issues/13#issuecomment-856737434

Quote: Thanks for your help! I managed to transpile an XGBoost model by following your instructions.

But I am getting this error during evaluation:

Caused by: java.lang.IllegalArgumentException
    at org.jpmml.evaluator.regression.RegressionModelUtil.normalizeBinaryLogisticClassificationResult(RegressionModelUtil.java:198)
    at org.jpmml.evaluator.regression.RegressionModelUtil.computeBinomialProbabilities(RegressionModelUtil.java:46)
    at MyCompany.XgbModel$JavaModel$108982313.evaluateRegressionTableList$1250816994(XgbModel.java:12752)
    at MyCompany.XgbModel$JavaModel$108982313.evaluateClassification(XgbModel.java:12770)
    at org.jpmml.evaluator.java.JavaModelEvaluator.evaluateClassification(JavaModelEvaluator.java:59)
    at org.jpmml.evaluator.ModelEvaluator.evaluateInternal(ModelEvaluator.java:449)
    at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateSegmentation(MiningModelEvaluator.java:539)
    at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateClassification(MiningModelEvaluator.java:304)
    at org.jpmml.evaluator.ModelEvaluator.evaluateInternal(ModelEvaluator.java:449)
    at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateInternal(MiningModelEvaluator.java:237)
    at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:302)

The exception is raised by this line of code: Map<FieldName, ?> results = evaluator.evaluate(arguments);

The same evaluation code works fine with the untranspiled .pmml model. Could you please help?

thu-zxs commented 3 years ago

Quote:

In brief, you are using a binary classification model, which probably specifies RegressionModel@normalizationMethod="softmax"?

Exactly, I am using softmax:

<Segment id="2">
    <True/>
    <RegressionModel functionName="classification" normalizationMethod="softmax">
        <MiningSchema>
            <MiningField name="_target" usageType="target"/>
            <MiningField name="xgbValue"/>
        </MiningSchema>
        <Output>
            <OutputField dataType="double" feature="probability" name="probability_0" optype="continuous" value="0"/>
            <OutputField dataType="double" feature="probability" name="probability_1" optype="continuous" value="1"/>
        </Output>
        <RegressionTable intercept="0.0" targetCategory="0">
            <NumericPredictor coefficient="-1.0" name="xgbValue"/>
        </RegressionTable>
        <RegressionTable intercept="0.0" targetCategory="1"/>
    </RegressionModel>
</Segment>

thu-zxs commented 3 years ago

Thank you for pointing out the source. I modified the code in src/main/java/org/jpmml/translator/regression/RegressionModelTranslator.java, lines 194-196, to:

if(regressionTables.size() == 2 && !normalizationMethod.equals(RegressionModel.NormalizationMethod.SOFTMAX)){
    valueMapBuilder.staticUpdate(RegressionModelUtil.class, "computeBinomialProbabilities", normalizationMethod);
}

And transpiled it again. It works!
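
The effect of the guard can be illustrated with a toy model. This is a minimal sketch, not JPMML code: the class NormalizationDispatchSketch, the enum Method, and the methods binomial, softmax and evaluate are all hypothetical names introduced here. It assumes that the binary shortcut only accepts link-style methods such as logit and rejects softmax (as the IllegalArgumentException in the stack trace suggests), and that skipping the shortcut lets a more general softmax path handle the two-table case.

```java
public class NormalizationDispatchSketch {

    // Hypothetical stand-in for the two normalization methods discussed in this thread.
    public enum Method { LOGIT, SOFTMAX }

    // Hypothetical binary shortcut: normalizes the first score and derives the
    // second probability as its complement. Rejects SOFTMAX (assumption).
    public static double[] binomial(Method method, double firstScore) {
        if (method != Method.LOGIT) {
            throw new IllegalArgumentException("Unsupported binary normalization: " + method);
        }
        double p0 = 1d / (1d + Math.exp(-firstScore));
        return new double[] {p0, 1d - p0};
    }

    // General softmax over any number of scores.
    public static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double score : scores) {
            max = Math.max(max, score);
        }
        double[] result = new double[scores.length];
        double sum = 0d;
        for (int i = 0; i < scores.length; i++) {
            result[i] = Math.exp(scores[i] - max); // subtract max for numerical stability
            sum += result[i];
        }
        for (int i = 0; i < result.length; i++) {
            result[i] /= sum;
        }
        return result;
    }

    // Dispatch with the guard from the workaround: the binary shortcut is taken
    // only when there are two scores AND the method is not SOFTMAX.
    public static double[] evaluate(Method method, double[] scores) {
        if (scores.length == 2 && method != Method.SOFTMAX) {
            return binomial(method, scores[0]);
        }
        if (method == Method.SOFTMAX) {
            return softmax(scores);
        }
        throw new IllegalArgumentException("Unsupported combination in this sketch");
    }

    public static void main(String[] args) {
        // Scores as in the Segment above for xgbValue = 0.7: category 0 gets -0.7, category 1 gets 0.0.
        double[] p = evaluate(Method.SOFTMAX, new double[] {-0.7, 0.0});
        System.out.println("probability_0=" + p[0] + ", probability_1=" + p[1]);
    }
}
```

Without the second condition in evaluate, the two-score softmax case would reach binomial and fail, which is the behaviour the stack trace shows.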

vruusmann commented 2 years ago

The PMML document in question appears to be problematic. Specifically, the regression normalization method there should be logit (not softmax). Also, since this is an XGBoost model, all numeric data types should be float (not double).

Which software was used to produce it? It does not appear to be the JPMML-XGBoost library, which would generate the following PMML markup instead:

<Segment id="2">
    <True/>
    <RegressionModel functionName="regression" normalizationMethod="logit" x-mathContext="float">
        <MiningSchema>
            <MiningField name="_target" usageType="target"/>
            <MiningField name="xgbValue"/>
        </MiningSchema>
        <RegressionTable intercept="0.0">
            <NumericPredictor name="xgbValue" coefficient="1.0"/>
        </RegressionTable>
    </RegressionModel>
</Segment>
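
For reference, the two markups in this thread should yield the same probability for category 1: a softmax over the scores (-xgbValue, 0) reduces to 1 / (1 + exp(-xgbValue)), which is what the logit normalization computes from a single table with coefficient 1.0. The following is a minimal numeric check of that identity. The class and method names are hypothetical, and double is used for brevity even though the comment above recommends float for XGBoost models.

```java
public class SoftmaxVsLogit {

    // Softmax over two scores, returned as {p0, p1}.
    public static double[] softmax(double score0, double score1) {
        double max = Math.max(score0, score1); // subtract max for numerical stability
        double e0 = Math.exp(score0 - max);
        double e1 = Math.exp(score1 - max);
        double sum = e0 + e1;
        return new double[] {e0 / sum, e1 / sum};
    }

    // Inverse logit (sigmoid), i.e. the transformation that the PMML
    // "logit" normalization method applies to the regression result.
    public static double inverseLogit(double value) {
        return 1d / (1d + Math.exp(-value));
    }

    public static void main(String[] args) {
        double[] margins = {-2.5, 0.0, 0.7, 3.0};
        for (double xgbValue : margins) {
            // First markup: category 0 scores -1.0 * xgbValue, category 1 scores 0.0.
            double[] p = softmax(-1.0 * xgbValue, 0.0);
            // Second markup: a single table with coefficient 1.0 and logit normalization.
            double p1 = inverseLogit(1.0 * xgbValue);
            System.out.println("xgbValue=" + xgbValue + " softmax p1=" + p[1] + " logit p1=" + p1);
        }
    }
}
```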

vruusmann commented 2 years ago

Closing as "won't fix" - the JPMML-Transpiler library is not in the business of analyzing and correcting invalid PMML documents.

Please use proper PMML producer software.