jpmml / jpmml-evaluator

Java Evaluator API for PMML
GNU Affero General Public License v3.0

How to work with an association rules model (`AssociationModel` element)? #250

Open AayushSameerShah opened 1 year ago

AayushSameerShah commented 1 year ago

I am trying to run the Association model in my workflow referring to this example: Example Link

I am using jpmml-evaluator version 1.6.3, and it seems that the EvaluatorUtil class no longer has a .prepare method as shown in the example: Example link 54th line

My code goes like:

// Just getting the PMML object
String path = "ShoppingAssocRules.xml";

File initialFile = new File(path);
InputStream targetStream = new FileInputStream(initialFile);
PMML pmml = PMMLUtil.unmarshal(targetStream);
System.out.println("[PMML LOADED]");

AssociationModelEvaluator associationModelEvaluator = new AssociationModelEvaluator(pmml);
List<InputField> activeFields = associationModelEvaluator.getActiveFields();

// Make sure that all user supplied item values conform to the data schema
FieldValue activeValue = EvaluatorUtil.prepare(associationModelEvaluator, activeField, "Bakery, Milk");

...

My IDE shows an error under EvaluatorUtil.prepare(...).


I would really appreciate your help Villu sir. Thank you.

vruusmann commented 1 year ago

I am trying to run the Association model in my workflow referring to this example

The JPMML-Example project was last updated in October 2013, which is more than nine years ago. You can't realistically expect this code to work without changes.

This project has been marked as "(public) Archived".

It seems like the EvaluatorUtil class no longer has a .prepare method.

The EvaluatorUtil#prepare(...) static utility method has been refactored into org.jpmml.evaluator.InputField#prepare(Object) instance method.

The idea behind this change is to keep input field validation "local and well encapsulated".

You can take an InputField object, and send it to other places of your Java application to perform standalone data validation. For example, if you have the "iris" dataset loaded into memory, then you can take the InputField object corresponding to the "Sepal.Length" input field, and perform data pre-validation (without doing any data record evaluation at all).

// Make sure that all user supplied item values conform to the data schema
FieldValue activeValue = EvaluatorUtil.prepare(associationModelEvaluator, activeField, "Bakery, Milk");

This should become:

FieldValue preparedValue = activeField.prepare(Arrays.asList("Bakery", "Milk"));

Please note that if the input field expects a collection-type value, then it should be passed a java.util.Collection object. A comma-delimited String is just a "workaround" to make the data input convenient for command-line applications - the JCommander library would parse this comma-delimited String into a java.util.List automatically.
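
As a rough, stdlib-only sketch of that conversion: the helper below splits a comma-delimited String into a java.util.List of trimmed items. The class name ItemListParser and the method parseItems are hypothetical and are not part of JPMML-Evaluator or JCommander; JCommander's actual splitting rules may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class ItemListParser {

    // Hypothetical helper: splits on commas, trims whitespace, skips empty tokens
    static List<String> parseItems(String commaDelimited) {
        List<String> items = new ArrayList<>();
        for (String token : commaDelimited.split(",")) {
            String item = token.trim();
            if (!item.isEmpty()) {
                items.add(item);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // Yields a two-element list instead of the single String "Bakery, Milk"
        List<String> items = parseItems("Bakery, Milk");
        System.out.println(items);
    }
}
```

The resulting list is the kind of java.util.Collection value that can then be handed to InputField#prepare(Object), in place of the original String.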

vruusmann commented 1 year ago

Anyway, keeping this issue open as a reminder to write a small blog post about working with association models.

Association models are different from ordinary supervised learning models (eg. classification, regression), as well as from ordinary unsupervised learning models (eg. clustering). They match pretty nicely with JPMML-Evaluator API conventions, but there are some gotchas.

AayushSameerShah commented 1 year ago

Thanks for the direction. I have made the changes accordingly but still I am having some "object type" related problems. Apologies for asking too much.

FieldValue activeValue = activeField.prepare(Arrays.asList("Bakery", "Milk")); // as per suggested changes

// Making arguments
Map<InputField, FieldValue> arguments = Collections.singletonMap(activeField, activeValue);        

// From here, I can't proceed forward
Map<FieldName, ?> result = associationModelEvaluator.evaluate(arguments);

The very last line errors: "Required Map<String, ?> received Map<InputField, FieldValue>".

I had to declare the arguments object as Map<InputField, FieldValue>, otherwise the IDE would not accept the code. I understand that the example code is old and I can't expect it to work, but I still need something to refer to, right?

Will that be possible to have some more code so that I can make it clear how the predictions (or recommendations) are made? How the AssociationModelEvaluator will let me have the confidence for a given new pair of values (bakery and milk)?

In your own time, Thanks & appreciate your help.

vruusmann commented 1 year ago

The very last line errors: "Required Map<String, ?> received Map<InputField, FieldValue>".

If you get stuck with compiler errors when dealing with high-level APIs such as Evaluator#evaluate(Map), then you will probably find a correct code example in the project's README file.

The main API change between JPMML-Evaluator 1.5.X and 1.6.X is that now all Map keys are field names as java.lang.String objects.

Therefore, the correct code snippet for you would be:

Map<String, ?> arguments = Collections.singletonMap(activeField.getName(), activeValue);

vruusmann commented 1 year ago

Will that be possible to have some more code so that I can make it clear how the predictions (or recommendations) are made?

That will be explained in the upcoming blog post.

Unfortunately, the timeline for it is unclear (could be days, could be weeks or even months away).

How the AssociationModelEvaluator will let me have the confidence for a given new pair of values (bakery and milk)?

The target value is model type-specific.

For the association rules model type (the AssociationModel element), the target value will be a subclass of org.jpmml.evaluator.association.Association class: https://github.com/jpmml/jpmml-evaluator/blob/1.6.3/pmml-evaluator/src/main/java/org/jpmml/evaluator/association/Association.java

As you can see, the "public API surface" of this class is defined by org.jpmml.evaluator.HasEntityRegistry and org.jpmml.evaluator.HasRuleValues marker interfaces.

Cast your target value into those interface types, and collect whatever prediction details you need:

Map<String, ?> results = associationModelEvaluator.evaluate(arguments);

// Association rules model is "unsupervised" in a sense that it doesn't provide a target field definition.
// The target value is mapped to the `null` key in such a case
Object targetValue = results.get(null);

org.jpmml.evaluator.HasRuleValues association = (org.jpmml.evaluator.HasRuleValues)targetValue;

List<AssociationRule> recommendations = association.getRuleValues(OutputField.Algorithm.RECOMMENDATION);
System.out.println(recommendations);

will let me have the confidence for a given new pair of values

Available as org.dmg.pmml.association.AssociationRule#getConfidence()

AayushSameerShah commented 1 year ago

Hello VR, I have tried this, but I am still getting an error when running the Map<String, ?> results = associationModelEvaluator.evaluate(arguments); line of code.

To give you the context, here's the code:

// The path to pmml (this is the example file from official site given as link below this snippet)
String path = "AssociationModel.pmml";

File initialFile = new File(path);      
InputStream targetStream = new FileInputStream(initialFile);
PMML pmml = PMMLUtil.unmarshal(targetStream);

AssociationModelEvaluator associationModelEvaluator = new AssociationModelEvaluator(pmml);
List<InputField> activeFields = associationModelEvaluator.getActiveFields();

// The Association rules model must contain exactly one MiningField whose type is "active"
if(activeFields.size() != 1){
    System.out.println("More mining fields found: " + activeFields.size());
    throw new IllegalArgumentException();
}

InputField activeField = activeFields.get(0);

// Make sure that all user supplied item values conform to the data schema
FieldValue activeValue = activeField.prepare(Arrays.asList("Cracker", "Coke"));

Map<String, ?> arguments = Collections.singletonMap(activeField.getName(), activeValue);
Map<String, ?> result = associationModelEvaluator.evaluate(arguments);

The last line of code throws:

Exception in thread "main" java.lang.IllegalStateException
    at org.jpmml.evaluator.ModelEvaluator.ensureConfiguration(ModelEvaluator.java:564)
    at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:272)
    at pmml_checks.PmmlImport_OtherTypes.main(PmmlAssociationTrials.java:98)

I am using jpmml-evaluator version 1.6.3. PMML file: AssociationModel.txt

Now obviously I am getting something wrong and apologies to ask it, because as you've said about writing a small blog on it, I am not sure how to proceed with Associations without a blog's guidance and example.

Since Association models are very different from the other models, I am even unsure whether TransformerBuilder will be helpful or not here.

As you mentioned in issue #9, there are three things we can "expect" from an association model, as opposed to generic supervised models that make predictions. What I want is to get recommendations based on a given item.

I know, I am more or less asking to write a thorough example, but if possible can you provide some snippet so that I can proceed ahead?

Thank you.

vruusmann commented 1 year ago

The last line of code throws:

Exception in thread "main" java.lang.IllegalStateException
  at org.jpmml.evaluator.ModelEvaluator.ensureConfiguration(ModelEvaluator.java:564)

Pay attention to the type of the exception being thrown - it is a plain java.lang.IllegalStateException, not some fancy org.jpmml.evaluator.EvaluationException subclass.

This exception signals that the model evaluator object is currently in improper state (there's some configuration information missing), and therefore cannot be used for prediction.

The root cause is that you have constructed the model evaluator object by invoking the AssociationModelEvaluator(PMML) constructor directly. Instead, you should obtain it the standard way, using the new ModelEvaluatorBuilder(pmml).build() pattern.

List<InputField> activeFields = associationModelEvaluator.getActiveFields();

Your PMML model has two input fields - one so-called group field ("transaction") and one active field ("item").

You should pay attention to both of them.

Therefore, replace Evaluator#getActiveFields() with Evaluator#getInputFields(), and assert that you have two input field declarations there.

vruusmann commented 1 year ago

if possible can you provide some snippet so that I can proceed ahead?

Handling group fields:

https://github.com/jpmml/jpmml-evaluator/blob/1.6.4/pmml-evaluator-example/src/main/java/org/jpmml/evaluator/example/EvaluationExample.java#L353-L354

https://github.com/jpmml/jpmml-evaluator/blob/1.6.4/pmml-evaluator-example/src/main/java/org/jpmml/evaluator/example/EvaluationExample.java#L356-L360

https://github.com/jpmml/jpmml-evaluator/blob/1.6.4/pmml-evaluator-example/src/main/java/org/jpmml/evaluator/example/EvaluationExample.java#L376-L380

The org.jpmml.evaluator.example.EvaluationExample command-line application should be able to score your PMML document with a CSV input file.

This CSV file should have two data columns - one for transaction identifiers (labelled "transaction") and another one for grocery items (labelled "item").
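
To make the two-column layout concrete, here is a minimal stdlib-only sketch that groups such rows by transaction identifier, so that each transaction ends up with its own list of items. It is not the code from EvaluationExample; the class name TransactionGrouper, the method groupItems, and the sample rows are made up for illustration, and the parsing assumes plain comma-separated cells without quoting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TransactionGrouper {

    // Hypothetical helper: groups "transaction,item" rows by transaction id,
    // preserving first-seen order. The first line is assumed to be the header row.
    static Map<String, List<String>> groupItems(List<String> csvLines) {
        Map<String, List<String>> itemsByTransaction = new LinkedHashMap<>();
        if (csvLines.isEmpty()) {
            return itemsByTransaction;
        }
        for (String line : csvLines.subList(1, csvLines.size())) {
            String[] cells = line.split(",", -1);
            if (cells.length != 2) {
                throw new IllegalArgumentException("Expected two columns: " + line);
            }
            itemsByTransaction
                .computeIfAbsent(cells[0].trim(), key -> new ArrayList<>())
                .add(cells[1].trim());
        }
        return itemsByTransaction;
    }

    public static void main(String[] args) {
        // Made-up sample data using the "transaction" and "item" column labels
        List<String> csvLines = Arrays.asList(
            "transaction,item",
            "1,Cracker",
            "1,Coke",
            "2,Water");
        System.out.println(groupItems(csvLines));
    }
}
```

Each map entry then corresponds to one basket: the key plays the role of the group field value, and the list holds the active field values for that group.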