jpmml / jpmml-sklearn

Java library and command-line application for converting Scikit-Learn pipelines to PMML
GNU Affero General Public License v3.0

How to get encoding feature from GBDT model #126

Closed wangjie5540 closed 4 years ago

wangjie5540 commented 4 years ago

Hi all, I have a GBDT model (as a PMML file). How do I transform the input features into encoded features using JPMML? With sklearn, I can easily get the "sparse features" by calling the "apply" function.

PS: I just want to use the sparse features for LR training and prediction.

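Hello everyone: I currently have a GBDT model in PMML format, but how do I use this PMML model to convert the raw input features into high-dimensional encoded features? For example, the apply function in Python's sklearn makes it easy to obtain the high-dimensional features. PS: My goal is to obtain the high-dimensional features and then use them for LR (logistic regression) training and prediction.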
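
For readers unfamiliar with the encoding the question refers to: each sample is passed through every tree of the ensemble, the leaf it lands in is recorded, and those leaf indices are one-hot encoded into a wide, mostly-zero row. The sketch below illustrates the idea in plain Python. The two trees, their thresholds, and the helper names are hypothetical and hand-written for illustration; they are not taken from any real model, and scikit-learn's own `apply` numbers leaves by tree node id rather than consecutively.

```python
# Toy illustration of GBDT leaf-index encoding (no scikit-learn required).
# Each "tree" is a hypothetical hand-written function that maps a sample
# to the index of the leaf it falls into.

def tree_0(x):
    # Splits on feature 0, then on feature 1; has 3 leaves (0, 1, 2).
    if x[0] < 5.0:
        return 0
    return 1 if x[1] < 2.0 else 2

def tree_1(x):
    # Splits on feature 1 only; has 2 leaves (0, 1).
    return 0 if x[1] < 3.0 else 1

TREES = [tree_0, tree_1]
LEAF_COUNTS = [3, 2]  # number of leaves in each tree

def apply(x):
    """Return the leaf index reached in every tree."""
    return [tree(x) for tree in TREES]

def one_hot_columns(leaves):
    """Return the column indices that are 1 in the one-hot encoded row."""
    columns, offset = [], 0
    for leaf, n_leaves in zip(leaves, LEAF_COUNTS):
        columns.append(offset + leaf)
        offset += n_leaves  # the next tree's block starts after this one
    return columns

sample = [6.0, 2.5]
leaves = apply(sample)            # one leaf index per tree
active = one_hot_columns(leaves)  # non-zero columns of the sparse row
print(leaves, active)
```

The resulting row has `sum(LEAF_COUNTS)` columns with exactly one non-zero entry per tree, which is why it is usually stored as a sparse matrix before being fed to a logistic regression.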

vruusmann commented 4 years ago

> but how do I transform the input features to encoding features using jpmml.

The JPMML-Evaluator library does all feature engineering automatically. You can pass a map of string values to the Evaluator#evaluate(Map) method, and they will be used as-is.

> I just want to use the Sparse Features in LR training and predicting.

In (J)PMML there is no concept of sparse or dense matrices. Everything is handled as scalar values.
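
One way to read this statement: a sparse one-hot row carries the same information as a small set of categorical scalar values, one per tree. The following plain-Python sketch converts the active columns of such a row into a map of string values. The function name and the field names (`tree_0_leaf` and so on) are hypothetical and chosen for illustration; they are not names produced or required by JPMML.

```python
def sparse_row_to_scalars(active_columns, leaf_counts):
    """Turn one-hot column indices into one scalar string value per tree.

    active_columns: indices of the non-zero entries in the one-hot row.
    leaf_counts:    number of leaves (one-hot block width) of each tree.
    """
    scalars = {}
    offset = 0
    for tree_index, n_leaves in enumerate(leaf_counts):
        # Exactly one active column must fall inside each tree's block.
        in_block = [c for c in active_columns if offset <= c < offset + n_leaves]
        if len(in_block) != 1:
            raise ValueError(f"expected one active column for tree {tree_index}")
        # Hypothetical field name; the value is the leaf index as a string.
        scalars[f"tree_{tree_index}_leaf"] = str(in_block[0] - offset)
        offset += n_leaves
    return scalars

# Two trees with 3 and 2 leaves; columns 2 and 3 are set in the sparse row.
row = sparse_row_to_scalars([2, 3], [3, 2])
print(row)
```

Seen this way, the choice between a sparse and a dense layout is a storage detail of the training library, while the per-record content is a plain key-to-value map.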

For starters, please see the "Usage" section of the README file: https://github.com/jpmml/jpmml-sklearn#usage