Closed Yao544303 closed 6 years ago
The size of fit.pmml is 1.04 GB. Is it too big?
The size of the PMML file is proportional to the size of the underlying R model object. You're working with a fairly big dataset, and you're ensembling 700 decision trees - it's no surprise that the size of the PMML file approaches 1 GB.
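To get a feel for that proportionality before exporting anything, one option is to count the nodes in the forest, since each tree node ends up as roughly one element in the PMML output. The following is only a minimal sketch on the iris data; the seed and the variable names (`rf_small`, `rf_large`, `nodes_small`, `nodes_large`) are illustrative assumptions, not part of the original model.

```r
library("randomForest")

set.seed(42)  # arbitrary seed, used only to make this sketch repeatable

# Two forests that differ only in the number of trees
rf_small <- randomForest(Species ~ ., data = iris, ntree = 7)
rf_large <- randomForest(Species ~ ., data = iris, ntree = 70)

# treesize(..., terminal = FALSE) returns the total node count of each tree
nodes_small <- sum(treesize(rf_small, terminal = FALSE))
nodes_large <- sum(treesize(rf_large, terminal = FALSE))

# Compare node counts and in-memory object sizes
cat("ntree = 7: ", nodes_small, "nodes,",
    format(object.size(rf_small), units = "auto"), "\n")
cat("ntree = 70:", nodes_large, "nodes,",
    format(object.size(rf_large), units = "auto"), "\n")
```

With more trees, more features, and more training rows, both numbers grow, and the PMML file can be expected to grow along with them.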
Anyway, I've just implemented compaction of randomForest objects, which can be activated by specifying the compact = TRUE argument:
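library("randomForest")
library("r2pmml")
# Train a small forest on the built-in iris dataset
iris.rf = randomForest(Species ~ ., data = iris, ntree = 7)
# Export the model as-is
r2pmml(iris.rf, "iris.pmml")
# Export the same model with compaction enabled
r2pmml(iris.rf, "iris-compact.pmml", compact = TRUE)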
On my computer, these two PMML files compare as follows:
iris.pmml - 430 text lines, 18063 bytes.
iris-compact.pmml - 280 text lines, 11396 bytes.
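A comparison like this can be reproduced with base R alone. The sketch below assumes the two files were already written to the working directory by the r2pmml calls above; `describe_file` is a hypothetical helper name introduced here for illustration. Because random forests are randomized, the exact line and byte counts will likely differ from run to run.

```r
# Hypothetical helper: report the line count and byte size of a text file
describe_file <- function(path) {
  c(lines = length(readLines(path)), bytes = file.size(path))
}

# Assumed file names, matching the r2pmml calls above
files <- c("iris.pmml", "iris-compact.pmml")

# Only inspect the files that actually exist in the working directory
stats <- sapply(files[file.exists(files)], describe_file)
print(stats)
```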
I trained a Random Forest in R with 288 features.
The size of fit.pmml is 1.04 GB. Is it too big?