Closed rjphofmann closed 3 years ago
The original question was asked with Apache Spark ML in mind, but the same functionality would come in handy across all JPMML-family conversion libraries (R, Scikit-Learn, etc.).
At minimum, the JPMML-Converter library could provide reusable Visitor classes for transforming PMML attributes between java.lang.Number types (e.g. from java.lang.Double to java.lang.Long or java.math.BigDecimal).
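As a rough illustration of the value-level conversions such a Visitor would apply, here is a minimal plain-Java sketch. It does not use any JPMML API; the class name NumberTypeConverter and its method names are hypothetical, and wiring the logic into an actual Visitor is left out. It assumes that a whole-number double within the exactly representable range (up to 2^53 in magnitude) should become a java.lang.Long, and that java.math.BigDecimal is used to obtain a text form without an exponent.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of JPMML-Converter.
final class NumberTypeConverter {

    // Largest magnitude at which every whole-number double is exactly representable (2^53).
    private static final double MAX_EXACT_INTEGER = 9007199254740992.0;

    // Returns a Long when the double holds a whole number in the exact range,
    // otherwise returns the original Double unchanged.
    static Number toLongIfIntegral(Double value) {
        double d = value.doubleValue();
        boolean finite = !Double.isNaN(d) && !Double.isInfinite(d);
        if (finite && d == Math.rint(d) && Math.abs(d) <= MAX_EXACT_INTEGER) {
            return Long.valueOf((long) d);
        }
        return value;
    }

    // Converts a finite double to a BigDecimal via its canonical string form.
    // Throws NumberFormatException for NaN and infinite values.
    static BigDecimal toBigDecimal(double value) {
        return BigDecimal.valueOf(value);
    }

    // Formats a finite double without scientific notation.
    static String toPlainString(double value) {
        return toBigDecimal(value).stripTrailingZeros().toPlainString();
    }
}
```

For a value such as 1.0E-5, toPlainString returns a plain decimal string with no exponent marker, which is the form a parser that rejects scientific notation would need.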
Related discussion in the JPMML mailing list: https://groups.google.com/forum/#!topic/jpmml/-YKzSnWkN78
Alternative view: this Visitor class could perform an "optimize the type of java.lang.Number attribute values" transformation. For example, in order to save memory, small integer values could be transformed from java.lang.Integer (or java.lang.Long) to java.lang.Byte or java.lang.Short.
The PMML class model should be "indifferent" to such value type changes.
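The narrowing step described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class name IntegerNarrower and the method narrow are hypothetical, no JPMML API is used, and only java.lang.Integer and java.lang.Long inputs are narrowed while every other Number is returned unchanged.

```java
// Hypothetical helper, not part of JPMML-Converter.
final class IntegerNarrower {

    // Returns the smallest of Byte or Short that can hold the value,
    // or the original object if no narrowing applies.
    static Number narrow(Number value) {
        if (value instanceof Integer || value instanceof Long) {
            long l = value.longValue();
            if (l >= Byte.MIN_VALUE && l <= Byte.MAX_VALUE) {
                return Byte.valueOf((byte) l);
            }
            if (l >= Short.MIN_VALUE && l <= Short.MAX_VALUE) {
                return Short.valueOf((short) l);
            }
        }
        // Floating-point values, BigDecimal, and large integers are left as-is.
        return value;
    }
}
```

Because the numeric value is preserved and only the boxed type changes, code that reads the attribute through java.lang.Number accessors such as longValue() sees the same result before and after narrowing.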
Once the Visitor class is ready, it could be made default by inserting it into the org.jpmml.converter.visitors.PMMLCleanerBattery:
https://github.com/jpmml/jpmml-converter/blob/1.4.2/src/main/java/org/jpmml/converter/ModelEncoder.java#L96-L97
Hello,
I've been actively using the PySpark2PMML package to write RF Spark models into PMML documents, and I've noticed that I sometimes get scientific notation in the output:
Is there a way to control whether or not scientific notation is used in the output? I'd prefer that it isn't used, as my C++ parser isn't written to accept it. Thanks!
Patrick Hofmann