Closed denmase closed 1 year ago
java.lang.IllegalArgumentException: org.dmg.pmml.ComplexArray$SetValue
The transpiler has found a non-PMML object inside a PMML class model object, and currently does not know how to create Java initializer code for it.
This exception is about the contents of "set-type" array elements (implementing "value is contained in set" or "value is not contained in set"-type business logic). The default representation of sets is the org.dmg.pmml.Array class. However, for performance reasons, it is customarily pre-parsed into an org.dmg.pmml.ComplexArray subclass.
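To illustrate why pre-parsing pays off, here is a minimal sketch in plain Java. It is not JPMML code: the class SetArraySketch and its methods are hypothetical names, and the parsing is simplified (quoted values containing spaces are not handled). The idea is only that the whitespace-separated text of an Array element is split once into a hash-based set, so that every later "is in set" check is a single lookup instead of a re-scan of the string.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of JPMML.
final class SetArraySketch {

    // Splits the whitespace-separated text of a PMML Array element into a set.
    // Simplification: quoted values that contain spaces are not supported.
    static Set<String> parse(String arrayText) {
        String trimmed = arrayText.trim();
        if (trimmed.isEmpty()) {
            return new LinkedHashSet<>();
        }
        return new LinkedHashSet<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // "Value is contained in set" check against the pre-parsed set.
    static boolean isIn(Set<String> values, String value) {
        return values.contains(value);
    }

    public static void main(String[] args) {
        Set<String> occupations = parse("Clerical Executive Home Military");
        System.out.println(isIn(occupations, "Home"));    // true
        System.out.println(isIn(occupations, "Student")); // false
    }
}
```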
Some of the generated PMML files were transpiled successfully, but others failed.
Looks like "standalone models" are transpiled successfully, while "model plus set expression" combinations (where the model uses an external DerivedField element for encoding a categorical value into a numeric value) fail.
The JPMML-Transpiler testing suite (inside /pmml-transpiler/src/test/resources) does not seem to cover the latter scenario.
.. however it was PMML 4.3, while the rest were 4.4
The evaluation and transpilation of PMML documents is schema version agnostic.
What matters is whether the PMML document contains a DerivedField element that uses either of the isIn or isNotIn built-in functions:
https://dmg.org/pmml/v4-4-1/BuiltinFunctions.html#boolean5
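A quick way to check a PMML file for this pattern is sketched below, using only the JDK's XML and XPath APIs. The class SetExpressionCheck is a hypothetical name, and the sample document in main is made up for illustration (its child elements are an assumption, not taken from the failing files). The XPath matches on local names, so the PMML schema version namespace does not matter.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of JPMML.
final class SetExpressionCheck {

    // Counts Apply elements with function isIn/isNotIn nested in a DerivedField.
    private static final String EXPRESSION =
        "count(//*[local-name()='DerivedField']"
        + "//*[local-name()='Apply'][@function='isIn' or @function='isNotIn'])";

    static int countSetExpressions(String pmml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(pmml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Double count = (Double) xpath.evaluate(EXPRESSION, document, XPathConstants.NUMBER);
            return count.intValue();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not scan PMML document", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document; field names are placeholders.
        String pmml =
            "<PMML xmlns=\"http://www.dmg.org/PMML-4_4\" version=\"4.4\">"
            + "<TransformationDictionary>"
            + "<DerivedField name=\"isOffice\" optype=\"categorical\" dataType=\"boolean\">"
            + "<Apply function=\"isIn\">"
            + "<FieldRef field=\"Occupation\"/>"
            + "<Constant dataType=\"string\">Clerical</Constant>"
            + "<Constant dataType=\"string\">Executive</Constant>"
            + "</Apply>"
            + "</DerivedField>"
            + "</TransformationDictionary>"
            + "</PMML>";
        System.out.println(countSetExpressions(pmml)); // 1
    }
}
```

A result of zero means the document has no isIn/isNotIn expressions inside derived fields, so any set-related failure would have to come from somewhere else.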
Hi Villu,
Thanks for the swift reply and for looking into it. I can confirm your explanation: when I compared successful and failed transpilations, the difference was indeed the presence of SimpleSetPredicate, which uses booleanOperator. Something like:
<SimpleSetPredicate field="Occupation" booleanOperator="isIn">
  <Array type="string">Clerical Executive Home Military Professional Protective Sales Support</Array>
</SimpleSetPredicate>
the difference was indeed the presence of SimpleSetPredicate, which uses booleanOperator.
The SimpleSetPredicate@booleanOperator="isIn" construct ("predicate") should be fine, because it is covered by tests:
https://github.com/jpmml/jpmml-transpiler/blob/1.3.0/pmml-transpiler/src/test/resources/pmml/LightGBMAuditNA.pmml#L222-L224
The problem is expected to happen with Apply@function="isIn" and Apply@function="isNotIn" constructs ("expressions").
Hi Villu,
I don't see any occurrence of Apply in the failed PMML. You are right though, the PMML file you mentioned is indeed transpiled successfully, although it has SimpleSetPredicate@booleanOperator="isIn" in it.
I don't see any occurrence of Apply in the failed PMML.
That is strange.
Anyway, the culprit is the Array element, which is typically found inside Apply and SimpleSetPredicate wrapper elements. Since the latter is covered by integration tests, I was assuming it must be the former that is the "marker" to look for in PMML documents.
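For anyone who wants to triangulate this on their own files, the sketch below tallies which wrapper element each Array element sits in. It uses only the JDK's DOM parser; the class ArrayParentTally is a hypothetical name and the sample document in main is a made-up fragment built around the predicate quoted earlier in this thread.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of JPMML.
final class ArrayParentTally {

    // Maps the local name of each Array element's parent to its number of occurrences.
    static Map<String, Integer> tally(String pmml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(pmml)));
            // "*" matches Array elements in any PMML schema version namespace.
            NodeList arrays = document.getElementsByTagNameNS("*", "Array");
            Map<String, Integer> counts = new TreeMap<>();
            for (int i = 0; i < arrays.getLength(); i++) {
                Node parent = arrays.item(i).getParentNode();
                String name = parent.getLocalName();
                if (name == null) {
                    name = parent.getNodeName(); // e.g. "#document" for a root Array
                }
                counts.merge(name, 1, Integer::sum);
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not scan PMML document", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample fragment for illustration.
        String pmml =
            "<PMML xmlns=\"http://www.dmg.org/PMML-4_4\" version=\"4.4\">"
            + "<SimpleSetPredicate field=\"Occupation\" booleanOperator=\"isIn\">"
            + "<Array type=\"string\">Clerical Executive Home</Array>"
            + "</SimpleSetPredicate>"
            + "</PMML>";
        System.out.println(tally(pmml)); // {SimpleSetPredicate=1}
    }
}
```

Comparing the tally of a file that transpiles with one that fails should show quickly whether the wrapper element type differs between them.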
The issue is probably easier to fix in JPMML-Transpiler code than to keep triangulating. The triangulation is currently only needed for building a relevant test case, in order to prevent this issue from recurring.
This issue was about the SimpleSetPredicate element after all.
All elements that were inside transpileable tree models were handled successfully. However, elements inside un-transpileable tree models, or elements outside of tree models, were failing.
Thank you for the fix, will try it after this.
EDIT: I compiled the git version and updated my test application to use the snapshot, but now all of those PMML files throw this exception:
java.lang.RuntimeException: Uncompilable source code - Erroneous tree type: org.jpmml.transpiler.TranspilerUtil
I'll wait until you make a release. I'm probably doing it wrong, pardon my lack of Java skills. I hope you keep supporting ID10T (like me). :grinning:
Hello @vruusmann,
I was experimenting with the optimal way to implement models (whether using PMML or transpiled PMML) when I ran into an error while transpiling a few of the PMML files generated from the sample script on your blog ("Converting Scikit-Learn based LightGBM pipelines to PMML documents"). The error was:
java.lang.IllegalArgumentException: org.dmg.pmml.ComplexArray$SetValue
Some of the generated PMML files were transpiled successfully, but others failed.
I took LightGBMAuditNA.pmml from your test data, and it's ok, however it was PMML 4.3, while the rest were 4.4.
Any hint for this error?
Thanks in advance.
Regards, Agung