jpmml / r2pmml

R library for converting R models to PMML
GNU Affero General Public License v3.0

xgboost models to PMML for first N trees? #16

Closed: sjain777 closed this issue 7 years ago

sjain777 commented 7 years ago

Hello, is it possible to enhance r2pmml so that it can store only the first N trees of an xgboost model? This would be extremely useful when training a model using early_stopping, and then writing out the PMML for only the trees up to the best iteration. Thanks very much in advance!

vruusmann commented 7 years ago

That would be trivial to implement. However, I might want to take some time to figure out a "generic parametrization/interface", which would be applicable to other boosting-type ensemble models (such as gbm or lgbm.Booster) as well.

Currently, you could write a small Java command-line application (based on the JPMML-Model library) to do the job. Its business logic would be the following:

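// loadPMML and savePMML stand for your own unmarshalling/marshalling helpers
PMML pmml = loadPMML(...);
// Assumes the first top-level model is the boosted tree ensemble
MiningModel miningModel = (MiningModel)pmml.getModels().get(0);
Segmentation segmentation = miningModel.getSegmentation();
// getSegments() returns the live list, so edits to it change the model
List<Segment> segments = segmentation.getSegments();
// Keep the first n segments (trees) and drop everything after them
(segments.subList(n, segments.size())).clear();
savePMML(pmml, ...);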
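
The key step above is the `subList(n, size).clear()` idiom, which truncates a list in place through a view. Here is a minimal, standalone sketch of just that step, using plain strings as stand-ins for `Segment` objects so that it runs without the JPMML-Model library. The class name `SegmentTruncation` and the helper `keepFirst` are hypothetical names chosen for illustration; they are not part of any JPMML API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; not part of JPMML-Model.
class SegmentTruncation {

    // Keeps the first n elements of a mutable list and removes the rest in place.
    // subList returns a view backed by the original list, so clearing the view
    // removes the corresponding range from the original.
    public static <E> void keepFirst(List<E> elements, int n) {
        if (n < 0 || n > elements.size()) {
            throw new IllegalArgumentException("n must be between 0 and " + elements.size());
        }
        elements.subList(n, elements.size()).clear();
    }

    public static void main(String[] args) {
        // Strings stand in for the Segment objects of a boosted tree ensemble.
        List<String> trees = new ArrayList<>(Arrays.asList("tree-0", "tree-1", "tree-2", "tree-3"));
        keepFirst(trees, 2);
        System.out.println(trees); // prints [tree-0, tree-1]
    }
}
```

The list must be mutable for this to work; calling `keepFirst` on an immutable list would throw `UnsupportedOperationException`. With `n` set to the best iteration found by early stopping, only the trees up to that iteration would remain.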

sjain777 commented 7 years ago

Thanks very much for your positive response! Looking forward to the next update that includes this functionality.

vruusmann commented 7 years ago

The r2pmml::xgb.Booster function now has ntreelimit (the optimal number of trees - integer) and compact (apply tree compaction? - boolean) arguments.

sjain777 commented 7 years ago

Thanks a lot!!