jpmml / jpmml-evaluator

Java Evaluator API for PMML
GNU Affero General Public License v3.0

User defined Java function for binomial sampling #221

Closed FrieseWoudloper closed 3 years ago

FrieseWoudloper commented 3 years ago

We have a binary classification model that predicts high or low risk, and an intervention that is applied only to high-risk cases. To collect data for retraining the model, we would also like to apply the intervention to a random sample of the low-risk cases. The Output section of our PMML should contain a probability, a classification (low/high risk) and an action (yes/no). When classification = high risk, action is always yes. When classification = low risk, we want to 'flip a coin' with a given probability of success (a single draw from a binomial distribution) to decide whether action = yes. Could we create a user-defined function in Java and call it from the PMML? Would that work, or is there a better way?

FrieseWoudloper commented 3 years ago

Example of Output section to clarify my question:

<Output>
    <OutputField name="raw_result" optype="continuous" dataType="double" feature="probability" value="High"/>
    <OutputField name="classification" optype="categorical" dataType="string" feature="decision">
        <Decisions businessProblem="Risk profile">
            <Decision value="High" description="High risk"/>
            <Decision value="Low" description="Low risk"/>
        </Decisions>
        <Apply function="if">
            <Apply function="greaterOrEqual">
                <FieldRef field="raw_result"/>
                <Constant>0.70</Constant>
            </Apply>
            <!--THEN-->
            <Constant>High</Constant>
            <!--ELSE-->
            <Constant>Low</Constant>
        </Apply>
    </OutputField>
    <OutputField name="action" optype="categorical" dataType="string" feature="decision">
        <Decisions businessProblem="Should we apply the intervention?">
            <Decision value="Yes" description="Apply intervention"/>
            <Decision value="No" description="Don't apply intervention"/>
        </Decisions>
        <Apply function="if">
            <Apply function="equal">
                <FieldRef field="classification"/>
                <Constant>High</Constant>
            </Apply>
            <!--THEN-->
            <Constant>Yes</Constant>
            <!--ELSE-->
            <!--Flip a coin to determine whether or not to apply the intervention -->
            <!--Binomial distribution with a given probability of success -->
            <!--Not yet implemented-->
        </Apply>
    </OutputField>
</Output>
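
To make the intended behaviour of the "classification" and "action" fields concrete, here is a minimal plain-Java sketch of the same logic, independent of any PMML engine. The class and method names (`InterventionPolicy`, `classify`, `decideAction`) are made up for illustration; the 0.70 threshold comes from the Output section above, and the 0.45 sampling rate in `main` is only an example value.

```java
import java.util.Random;

/** Hypothetical helper mirroring the "classification" and "action" output fields. */
final class InterventionPolicy {

    private InterventionPolicy() {
    }

    /** Returns "High" when the probability of the High class reaches the threshold, else "Low". */
    static String classify(double probabilityHigh, double threshold) {
        return (probabilityHigh >= threshold) ? "High" : "Low";
    }

    /**
     * High-risk cases always get the intervention; low-risk cases get it
     * with probability sampleRate (a single Bernoulli trial).
     */
    static String decideAction(String classification, double sampleRate, Random random) {
        // Written this way so that NaN is rejected as well
        if (!(sampleRate >= 0d && sampleRate <= 1d)) {
            throw new IllegalArgumentException("sampleRate must be in [0, 1]: " + sampleRate);
        }
        if ("High".equals(classification)) {
            return "Yes";
        }
        // nextDouble() is uniform on [0, 1), so this is true with probability sampleRate
        return (random.nextDouble() < sampleRate) ? "Yes" : "No";
    }

    public static void main(String[] args) {
        Random random = new Random();
        String classification = classify(0.35, 0.70);
        System.out.println(classification + " -> " + decideAction(classification, 0.45, random));
    }
}
```

Passing the `Random` in as a parameter keeps the coin flip reproducible in tests (use a fixed seed) while production code can use an unseeded instance.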

vruusmann commented 3 years ago

Could we create a user defined function in Java and call that function in the PMML?

Java-backed functions are a vendor extension, so this part of your PMML documents will not be portable across PMML engines.

With the JPMML-Evaluator library, you'd need to do the following:

First, create a Java class that implements the org.jpmml.evaluator.Function interface: https://github.com/jpmml/jpmml-evaluator/blob/1.5.15/pmml-evaluator/src/main/java/org/jpmml/evaluator/Function.java

In most cases you'd be subclassing the o.j.e.functions.AbstractFunction abstract base class: https://github.com/jpmml/jpmml-evaluator/tree/1.5.15/pmml-evaluator/src/main/java/org/jpmml/evaluator/functions

Give your custom function class a unique and meaningful class name. For example, com.mycompany.pmml.CoinFlipFunction.
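
As a rough, library-free illustration of the shape such a class might take: `SketchFunction` below is a hypothetical stand-in for the library's Function interface, so that the example compiles on its own. The real method signatures and argument types should be taken from the linked source, not from this sketch. The package declaration (`com.mycompany.pmml`) is omitted for the same reason.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for the library's Function interface; NOT the real API. */
interface SketchFunction {

    String getName();

    Object evaluate(List<Object> arguments);
}

/** Returns "Yes" with the probability given as the single argument, else "No". */
class CoinFlipFunction implements SketchFunction {

    private final Random random;

    // Public no-argument constructor, so that the class can be instantiated by name
    public CoinFlipFunction() {
        this(new Random());
    }

    // A seeded Random makes the behaviour reproducible in tests
    CoinFlipFunction(Random random) {
        this.random = random;
    }

    @Override
    public String getName() {
        return CoinFlipFunction.class.getName();
    }

    @Override
    public Object evaluate(List<Object> arguments) {
        if (arguments.size() != 1) {
            throw new IllegalArgumentException("Expected 1 argument, got " + arguments.size());
        }
        double probability = ((Number) arguments.get(0)).doubleValue();
        // Written this way so that NaN is rejected as well
        if (!(probability >= 0d && probability <= 1d)) {
            throw new IllegalArgumentException("Probability must be in [0, 1]: " + probability);
        }
        // nextDouble() is uniform on [0, 1), so this is a single Bernoulli trial
        return (random.nextDouble() < probability) ? "Yes" : "No";
    }
}
```

Note that a function like this is deliberately non-deterministic: evaluating the same input record twice can give different "action" values, which matters if predictions are ever replayed or audited.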

Then, compile this class and add it to your Java application's classpath. When the JPMML-Evaluator library encounters a non-standard Apply@function attribute value, it tries to look up a Java class with this name; if such a class is found and it implements the o.j.e.Function interface, it will be invoked: https://github.com/jpmml/jpmml-evaluator/blob/1.5.15/pmml-evaluator/src/main/java/org/jpmml/evaluator/FunctionRegistry.java#L49-L94
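
The general idea of a name-based lookup can be sketched in plain Java as follows. This is not the library's actual code (see the linked FunctionRegistry source for that); `SketchFunction`, `FunctionLoader` and `AlwaysYesFunction` are hypothetical names used only to keep the example self-contained.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for the library's Function interface; NOT the real API. */
interface SketchFunction {

    String getName();

    Object evaluate(List<Object> arguments);
}

/** Trivial implementation, used only to demonstrate the lookup. */
class AlwaysYesFunction implements SketchFunction {

    @Override
    public String getName() {
        return AlwaysYesFunction.class.getName();
    }

    @Override
    public Object evaluate(List<Object> arguments) {
        return "Yes";
    }
}

final class FunctionLoader {

    private FunctionLoader() {
    }

    /** Treats the function name as a class name and tries to instantiate it. */
    static Optional<SketchFunction> load(String name) {
        Class<?> clazz;
        try {
            clazz = Class.forName(name);
        } catch (ClassNotFoundException e) {
            // The name does not refer to a class on the classpath
            return Optional.empty();
        }
        if (!SketchFunction.class.isAssignableFrom(clazz)) {
            // A class was found, but it is not a function implementation
            return Optional.empty();
        }
        try {
            // Requires an accessible no-argument constructor
            return Optional.of((SketchFunction) clazz.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + name, e);
        }
    }
}
```

One practical consequence of this style of lookup is that the function class needs a no-argument constructor, and that a typo in the class name only shows up at evaluation time.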

If your Java-backed function has "singleton" characteristics, you can create the function instance explicitly (e.g. during the initialization of your Java application) and register it with the JPMML-Evaluator library using the o.j.e.FunctionRegistry#putFunction(Function) method.
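
A coin-flip function is a plausible candidate for this route, because a single instance can hold one shared (and, if desired, seeded) random number generator. The sketch below shows the registration idea with a hypothetical `SketchRegistry` standing in for the library's FunctionRegistry; apart from the `putFunction` name taken from the paragraph above, every name and signature here is an assumption made for illustration, including the symbolic name "coinFlip".

```java
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the library's Function interface; NOT the real API. */
interface SketchFunction {

    String getName();

    Object evaluate(List<Object> arguments);
}

/** Hypothetical stand-in for a function registry keyed by function name. */
final class SketchRegistry {

    private static final Map<String, SketchFunction> FUNCTIONS = new ConcurrentHashMap<>();

    private SketchRegistry() {
    }

    static void putFunction(SketchFunction function) {
        FUNCTIONS.put(function.getName(), function);
    }

    /** Returns the registered function, or null if the name is unknown. */
    static SketchFunction getFunction(String name) {
        return FUNCTIONS.get(name);
    }
}

/** Coin flip with a symbolic name and a single shared, seeded generator. */
final class SeededCoinFlip implements SketchFunction {

    private final String name;

    // java.util.Random is thread-safe, so one instance can be shared
    private final Random random;

    SeededCoinFlip(String name, long seed) {
        this.name = name;
        this.random = new Random(seed);
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public Object evaluate(List<Object> arguments) {
        double probability = ((Number) arguments.get(0)).doubleValue();
        return (random.nextDouble() < probability) ? "Yes" : "No";
    }
}
```

During application start-up this would be used along the lines of `SketchRegistry.putFunction(new SeededCoinFlip("coinFlip", 42L));`, after which the function is found under its symbolic name instead of a class name.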

Finally, reference the Java class by its full name and/or symbolic name (depending on the discovery/registration path taken above):

<Apply function="com.mycompany.pmml.CoinFlipFunction">
    <!-- Further arguments to the custom function -->
    <Constant>0.45</Constant>
</Apply>