jpmml / jpmml-evaluator

Java Evaluator API for PMML
GNU Affero General Public License v3.0

Model verification not enforcing field validity criteria? #223

Closed ZahidFKhan closed 3 years ago

ZahidFKhan commented 3 years ago

I have the below field in my data dictionary

<DataField dataType="double" name="sepal_length" optype="continuous">
    <Interval closure="closedClosed" leftMargin="4.3" rightMargin="7.9"/>
</DataField>

So the value of sepal_length should be between 4.3 and 7.9 (inclusive), but in ModelVerification, if I assign a value below 4.3 or above 7.9, it is accepted, which to the best of my knowledge should not happen.
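
To make the expected rule concrete, here is a minimal sketch of how a PMML Interval with a closure attribute bounds a continuous field. The class `IntervalCheck` and its method are hypothetical helpers written for illustration only; they are not part of the JPMML-Evaluator API.

```java
// Hypothetical helper for illustration; not part of JPMML-Evaluator.
final class IntervalCheck {

    // The four closure values that the PMML Interval element defines:
    // openOpen, openClosed, closedOpen, closedClosed.
    enum Closure { OPEN_OPEN, OPEN_CLOSED, CLOSED_OPEN, CLOSED_CLOSED }

    // Returns true when the value lies inside the interval; a margin is
    // included only when its side of the closure is "closed".
    static boolean contains(Closure closure, double left, double right, double value) {
        boolean leftClosed = (closure == Closure.CLOSED_OPEN || closure == Closure.CLOSED_CLOSED);
        boolean rightClosed = (closure == Closure.OPEN_CLOSED || closure == Closure.CLOSED_CLOSED);
        boolean leftOk = leftClosed ? (value >= left) : (value > left);
        boolean rightOk = rightClosed ? (value <= right) : (value < right);
        return leftOk && rightOk;
    }

    public static void main(String[] args) {
        // Margins taken from the sepal_length DataField above.
        System.out.println(contains(Closure.CLOSED_CLOSED, 4.3, 7.9, 1.0)); // false
        System.out.println(contains(Closure.CLOSED_CLOSED, 4.3, 7.9, 4.3)); // true
        System.out.println(contains(Closure.OPEN_OPEN, 4.3, 7.9, 4.3));     // false
    }
}
```

Under this reading, the value 1 used in the verification row falls outside the declared interval and counts as an invalid value.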

<ModelVerification>
    <VerificationFields>
        <VerificationField field="sepal_length"/>
        .
        .
        .
        .
        <VerificationField field="class"/>
    </VerificationFields>
    <InlineTable>
        <row>
            <sepal_length>1</sepal_length>
            <class>iris</class>
        </row>
    </InlineTable>
</ModelVerification>

vruusmann commented 3 years ago

but in ModelVerification, if I assign a value below 4.3 or above 7.9, it is accepted, which to the best of my knowledge should not happen.

It should not happen, and it does not happen.

Here's a proof:

  1. Using the JPMML-SkLearn command-line application to generate a PMML document with model verification element:

    $ java -jar target/jpmml-sklearn-executable-1.6-SNAPSHOT.jar --pkl-input src/test/resources/pkl/DecisionTreeIris.pkl --pmml-output DecisionTreeIris.pmml
  2. Changing the value of the first Sepal.Length field to 0.0 (which is invalid according to the MiningSchema element):

    <row>
    <data:Sepal.Length>0.0</data:Sepal.Length>
    <data:Sepal.Width>2.8</data:Sepal.Width>
    <data:Petal.Length>4.5</data:Petal.Length>
    <data:Petal.Width>1.3</data:Petal.Width>
    <data:probability_setosa>0.0</data:probability_setosa>
    <data:probability_versicolor>1.0</data:probability_versicolor>
    <data:probability_virginica>0.0</data:probability_virginica>
    </row>
  3. Running the modified example using the JPMML-Evaluator command-line application:

    $ java -jar ../jpmml-evaluator/pmml-evaluator-example/target/pmml-evaluator-example-executable-1.6-SNAPSHOT.jar --model DecisionTreeIris.pmml --input src/test/resources/csv/Iris.csv --output DecisionTreeIris.csv
  4. This raises an InvalidResultException, just as promised/expected:

    Exception in thread "main" org.jpmml.evaluator.InvalidResultException (at or around line 30 of the PMML document): Field "Sepal.Length" cannot accept user input value 0.0
        at org.jpmml.evaluator.InputFieldUtil.performInvalidValueTreatment(InputFieldUtil.java:235)
        at org.jpmml.evaluator.InputFieldUtil.prepareScalarInputValue(InputFieldUtil.java:151)
        at org.jpmml.evaluator.InputFieldUtil.prepareInputValue(InputFieldUtil.java:111)
        at org.jpmml.evaluator.InputField.prepare(InputField.java:73)
        at org.jpmml.evaluator.ModelEvaluator.verify(ModelEvaluator.java:197)
        at org.jpmml.evaluator.ModelEvaluator.verify(ModelEvaluator.java:59)
        at org.jpmml.evaluator.example.EvaluationExample.execute(EvaluationExample.java:353)
        at org.jpmml.evaluator.example.Example.execute(Example.java:96)
        at org.jpmml.evaluator.example.EvaluationExample.main(EvaluationExample.java:262)

QED.
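
The stack trace shows the failure coming from the input preparation step (performInvalidValueTreatment), before any predictions are compared. As a rough sketch of that logic, assuming a simple closed range and the PMML invalidValueTreatment options returnInvalid (the PMML default), asIs and asMissing: the class below is hypothetical, and it throws a plain IllegalArgumentException as a stand-in for the library's InvalidResultException.

```java
import java.util.OptionalDouble;

// Hypothetical sketch for illustration; not the JPMML-Evaluator implementation.
final class InvalidValueTreatmentSketch {

    // Mirrors three of the PMML invalidValueTreatment options.
    enum Treatment { RETURN_INVALID, AS_IS, AS_MISSING }

    // Prepares one input value against a closed [min, max] range.
    // An empty result stands for "treated as missing".
    static OptionalDouble prepare(String field, double value, double min, double max, Treatment treatment) {
        boolean valid = (value >= min) && (value <= max);
        if (valid) {
            return OptionalDouble.of(value);
        }
        switch (treatment) {
            case AS_IS:
                // Pass the invalid value through unchanged.
                return OptionalDouble.of(value);
            case AS_MISSING:
                // Replace the invalid value with a missing value.
                return OptionalDouble.empty();
            case RETURN_INVALID:
            default:
                // Stand-in for the library's InvalidResultException.
                throw new IllegalArgumentException(
                    "Field \"" + field + "\" cannot accept user input value " + value);
        }
    }

    public static void main(String[] args) {
        try {
            prepare("Sepal.Length", 0.0, 4.3, 7.9, Treatment.RETURN_INVALID);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With returnInvalid in effect, an out-of-range verification value stops the run at this step, which matches the behaviour reported in step 4.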

So, whatever the problem, it must be in your own code.

vruusmann commented 3 years ago

@Zahidkhan-xee This is another incorrect issue report from you. Please be careful with your work, you're literally wasting my time.