jpmml / r2pmml

R library for converting R models to PMML
GNU Affero General Public License v3.0

Error in conversion of 'IsolationForest::iForest' objects to PMML #10

Closed: jaroslav-kuchar closed this issue 7 years ago

jaroslav-kuchar commented 7 years ago

I was not able to convert an 'IsolationForest::iForest' object to PMML. I ran the following code:

library("IsolationForest")
library("r2pmml")
data("iris")
iForest <- IsolationForest::IsolationTrees(iris)
r2pmml(iForest, "iForest.pmml")

I got the following errors:

SEVERE: Failed to convert
java.lang.IllegalArgumentException
    at org.jpmml.rexp.IForestConverter.encodeFeatures(IForestConverter.java:71)
    at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:78)
    at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:74)
    at org.jpmml.rexp.Main.run(Main.java:149)
    at org.jpmml.rexp.Main.main(Main.java:97)

Exception in thread "main" java.lang.IllegalArgumentException
    at org.jpmml.rexp.IForestConverter.encodeFeatures(IForestConverter.java:71)
    at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:78)
    at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:74)
    at org.jpmml.rexp.Main.run(Main.java:149)
    at org.jpmml.rexp.Main.main(Main.java:97)
Error in .convert(tempfile, file, ...) : 1

Environment: Java 1.8, R 3.3.0

vruusmann commented 7 years ago

The IsolationForest algorithm doesn't support categorical variables (i.e. R's factor data type). The Iris dataset's last column, Species, is a factor, which is what triggers the exception. If you want to train an isolation forest model for the Iris dataset, use only the first four columns:

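data(iris)
# Drop the last column (Species, a factor); keep the four numeric columns
iris_vars = iris[, -ncol(iris)]

# Train the isolation forest on the numeric columns only
iForest = IsolationForest::IsolationTrees(iris_vars)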

Admittedly, the above IllegalArgumentException should provide an appropriate message.
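
For data frames where the factor column is not conveniently the last one, the same idea can be written more generally. The following is only a rough sketch in base R: `keep_numeric_columns` is a hypothetical helper name, not part of r2pmml or the IsolationForest package, and it assumes that dropping every non-numeric column is acceptable for your use case.

```r
# Hypothetical helper (not part of r2pmml or IsolationForest):
# keep only the numeric columns of a data frame and report what was dropped.
keep_numeric_columns <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  dropped <- names(df)[!is_num]
  if (length(dropped) > 0) {
    message("Dropping non-numeric columns: ", paste(dropped, collapse = ", "))
  }
  # drop = FALSE keeps the result a data frame even if one column remains
  df[, is_num, drop = FALSE]
}

data(iris)
iris_vars <- keep_numeric_columns(iris)
str(iris_vars)
```

The resulting `iris_vars` contains the same four columns as `iris[, -ncol(iris)]` and can be passed to the training call in the same way.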

vruusmann commented 7 years ago

Also, this is the R script that is used for generating JPMML-R integration tests: https://github.com/jpmml/jpmml-r/blob/master/src/test/R/IsolationForest.R

jaroslav-kuchar commented 7 years ago

Thanks for your prompt reply. I did not know about the issues with categorical variables. It works!