XScarlett closed this issue 4 years ago
Right now you're invoking PMMLBuilder#buildFile(...), which saves the PMML class model object into a file in the local filesystem. If you invoke PMMLBuilder#build(), then you'll obtain a live org.dmg.pmml.PMML object instance that you can modify as you see fit. I'd recommend using the Visitor API of the JPMML-Model library for implementing all the necessary transformations and rearrangements.
For example, changing the TreeModel@missingValueStrategy attribute value:
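    // Package names below assume a JPMML-Model 1.4+ layout; adjust to your library version.
    import org.dmg.pmml.PMML;
    import org.dmg.pmml.Visitor;
    import org.dmg.pmml.VisitorAction;
    import org.dmg.pmml.tree.TreeModel;
    import org.jpmml.model.visitors.AbstractVisitor;
    import org.jpmml.sparkml.PMMLBuilder;

    PMMLBuilder pmmlBuilder = ...; // constructed as in your existing code
    PMML pmml = pmmlBuilder.build();
    // Visits every TreeModel element in the PMML document and overrides its missing value strategy.
    Visitor mvsCustomizer = new AbstractVisitor(){
        @Override
        public VisitorAction visit(TreeModel treeModel){
            treeModel.setMissingValueStrategy(TreeModel.MissingValueStrategy.LAST_PREDICTION);
            return super.visit(treeModel);
        }
    };
    mvsCustomizer.applyTo(pmml);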
It's possible to compute record counts for "parent" tree levels by summing the record counts of their "child" tree levels.
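The bottom-up summation can be sketched in plain Java as follows. This is only an illustration of the idea, not JPMML-Model code: SketchNode and RecordCountSketch are hypothetical stand-ins invented for this example, and a real implementation would read and write the record counts and ScoreDistribution elements of the library's own tree node class from inside a Visitor. The sketch also assumes that a parent's score is simply the category with the largest summed record count.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a decision tree node; this is NOT the JPMML-Model Node class.
class SketchNode {
    final List<SketchNode> children = new ArrayList<>();
    // Record count per target category, e.g. {"yes": 30.0, "no": 10.0}.
    final Map<String, Double> distribution = new LinkedHashMap<>();
    double recordCount;
    String score;

    // Creates a leaf node with a known score.
    static SketchNode leaf(String score) {
        SketchNode node = new SketchNode();
        node.score = score;
        return node;
    }

    // Registers n training records of the given category on this node.
    SketchNode count(String category, double n) {
        distribution.merge(category, n, Double::sum);
        recordCount += n;
        return this;
    }

    SketchNode add(SketchNode child) {
        children.add(child);
        return this;
    }
}

class RecordCountSketch {
    // Post-order traversal: children are completed first, then summed into the parent.
    static void aggregate(SketchNode node) {
        if (node.children.isEmpty()) {
            return; // Leaf values are taken as given.
        }
        node.distribution.clear();
        for (SketchNode child : node.children) {
            aggregate(child);
            for (Map.Entry<String, Double> entry : child.distribution.entrySet()) {
                node.distribution.merge(entry.getKey(), entry.getValue(), Double::sum);
            }
        }
        // Derive the parent's total record count and its majority-category score.
        node.recordCount = 0d;
        node.score = null;
        double bestCount = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : node.distribution.entrySet()) {
            node.recordCount += entry.getValue();
            if (entry.getValue() > bestCount) {
                bestCount = entry.getValue();
                node.score = entry.getKey();
            }
        }
    }
}
```

Calling RecordCountSketch.aggregate(root) on a tree whose leaves already carry per-category counts fills in the count, distribution and score of every non-leaf node in a single pass.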
There's a Visitor API example available in another demo project: https://github.com/vruusmann/rf_feature_impact/blob/master/src/main/java/feature_impact/visitors/ScoreDistributionGenerator.java
Leaving this issue open-ish - a reminder that perhaps there's a way to generalize and implement all this functionality in the form of JPMML-SparkML conversion options.
Thank you so much!!!
When converting a Spark ML PipelineModel to PMML, I want to set missingValueStrategy to lastPrediction, and to set ScoreDistribution and score on every node (not just the leaf nodes). How can I do this in Java?
The following picture shows my code and part of the resulting pmml.xml: