jpmml / r2pmml

R library for converting R models to PMML
GNU Affero General Public License v3.0

Support for multinomial gbm models #5

Closed (jtenini closed this issue 8 years ago)

jtenini commented 8 years ago

I can't tell whether gbm models with a distribution type of "multinomial" are supported by this package. The package does create PMML, but I can't tell how the trees are mapped to their classes.

For example, when modeling a variable with five classes I get a collection of trees, with each tree modeling the probability of the output being one of the five classes.

    gbm.model.multi <- gbm(Y ~ .,
                           data = dfmulti,
                           distribution = "multinomial",
                           n.trees = 1000,
                           cv.folds = 0,
                           interaction.depth = 8,
                           n.minobsinnode = 1,
                           shrinkage = .01
                           )

This produces a model with 5000 trees (1000 iterations times 5 classes), which I can export to PMML, but I can't tell which trees are modeling which class. Should I just reduce the segment id modulo 5?
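To make the modulo idea concrete, here is a minimal sketch in base R. It assumes, without confirmation from the gbm source or the r2pmml output, that the trees are stored round-robin: classes 1 to 5 for the first boosting iteration, then classes 1 to 5 for the second, and so on. The helper names `tree_class` and `tree_iteration` are hypothetical and not part of gbm or r2pmml.

```r
# ASSUMPTION: trees are interleaved by class within each boosting iteration.
# Both helpers take a 1-based tree (segment) index and are hypothetical.
tree_class <- function(tree_index, num_classes) {
  ((tree_index - 1L) %% num_classes) + 1L
}

tree_iteration <- function(tree_index, num_classes) {
  ((tree_index - 1L) %/% num_classes) + 1L
}

idx <- 1:12
mapping <- data.frame(
  tree      = idx,
  class     = tree_class(idx, 5L),
  iteration = tree_iteration(idx, 5L)
)
print(mapping)
```

If the actual storage order turns out to be class-major instead (all trees for class 1 first, then class 2), the same two helpers would need `%/%` and `%%` swapped, with the number of iterations as the divisor.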

vruusmann commented 8 years ago

The r2pmml package only supports binary classification models, where the probability of the 1 class is obtained by applying the logistic transform to the "raw" value.
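As an illustration of the binary case only, the logistic transform looks like this in base R. The raw values are made up, and the sketch assumes the raw value is the log-odds of class 1.

```r
# Made-up "raw" ensemble outputs, assumed to be on the log-odds scale
raw <- c(-2, 0, 1.5)

# Logistic transform: probability of the 1 class
p1 <- 1 / (1 + exp(-raw))

# Probability of the 0 class is the complement
p0 <- 1 - p1

# Base R ships the same function as plogis()
stopifnot(isTRUE(all.equal(p1, plogis(raw))))
```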

It is nice to know that the package doesn't crash with multinomial classification models (although it probably should). Feel free to figure out how these 5000 trees should be partitioned, and what kind of transformation needs to be applied afterwards.

I believe that R implements multinomial GBM pretty much the same as Scikit-Learn, so technically it should be doable.
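A rough sketch of what that could look like, assuming the Scikit-Learn-style scheme: one tree per class per boosting iteration, per-class sums of the tree outputs, then a softmax over the class scores. The tree outputs below are made up, the interleaved tree order is an assumption, and any initial (baseline) score is ignored here.

```r
# Softmax over a vector of per-class scores
softmax <- function(scores) {
  shifted <- scores - max(scores)  # subtract the max for numerical stability
  exp(shifted) / sum(exp(shifted))
}

num_classes <- 5L

# Made-up outputs of 10 trees for ONE observation:
# 2 boosting iterations x 5 classes, ASSUMED to be interleaved by class
tree_outputs <- c(0.2, -0.1,  0.0, 0.4, -0.3,
                  0.1,  0.3, -0.2, 0.1,  0.0)

# Partition the trees by class, then sum the outputs within each class
class_of_tree <- ((seq_along(tree_outputs) - 1L) %% num_classes) + 1L
class_scores  <- as.numeric(tapply(tree_outputs, class_of_tree, sum))

# Turn the class scores into class probabilities
probs <- softmax(class_scores)
print(round(probs, 3))
```

Comparing such hand-computed probabilities against `predict(..., type = "response")` on a few rows would be one way to check whether the assumed partitioning is right.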

vruusmann commented 8 years ago

This feature has been implemented in https://github.com/jpmml/jpmml-r/commit/1caaa0741c4d9e17e1ebef08d05be9520fc7dc84