jpmml / jpmml-sklearn

Java library and command-line application for converting Scikit-Learn pipelines to PMML
GNU Affero General Public License v3.0

Dealing with the predicted result in the same pipeline? #150

Closed: HelloLadsAndGents closed this issue 3 years ago

HelloLadsAndGents commented 3 years ago

question:

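import pandas
from lightgbm import LGBMClassifier
from sklearn.preprocessing import OrdinalEncoder, StandardScaler
from sklearn_pandas import DataFrameMapper
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.decoration import CategoricalDomain
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml.preprocessing import ExpressionTransformer, ReplaceTransformer

df = pandas.read_csv("Iris2.csv")
cat_columns = ["Sepal.Length"]
label_column = "Species"
# Categories derived from the data, then overridden below with an explicit order.
categories = [list(df["Sepal.Length"].unique())]
# "60以上" means "60 and above".
categories = [["18-30", "31-40", "41-50", "51-60", "60以上"]]

mapper = DataFrameMapper([
    # Ordinal-encode the age band.
    (["Sepal.Length"], [CategoricalDomain(), OrdinalEncoder(categories=categories)]),
    # Strip "-" and rewrite "以上" to "0", then apply the expression and scale.
    (["Sepal.Length"], [ReplaceTransformer("-", ""), ReplaceTransformer("以上", "0"), ExpressionTransformer("X[0] + X[0]"), StandardScaler()])
])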
classifier = LGBMClassifier(n_estimators=5, learning_rate=0.1, num_leaves=10, max_depth=2, n_jobs=20)

pipeline = PMMLPipeline([
    ("mapper", mapper),
    ("classifier", classifier)
])
pipeline.fit(df[cat_columns], df[label_column])

sklearn2pmml(pipeline, "lgbm.pmml")

How can I add another step to process the result after ("classifier", classifier)? Is there something like:

pipeline = PMMLPipeline([
    ("mapper", mapper),
    ("classifier", classifier),
    ("reducer", reducer)
])

that I can use to post-process a predicted result such as 0.131415? I want to transform it into something like 100 * 0.131415 + 324.

If there is, how can I get the attribute name, and how do I use it?

Thanks.

vruusmann commented 3 years ago

The official Scikit-Learn pipeline API won't let you have any transformer steps after the final estimator step.

The sklearn2pmml.pipeline.PMMLPipeline class extends the base sklearn.pipeline.Pipeline class with predict_transformer, predict_proba_transformer and apply_transformer attributes: https://github.com/jpmml/sklearn2pmml/blob/0.61.0/sklearn2pmml/pipeline/__init__.py#L47-L51

See the JPMML-SkLearn integration tests for actual code examples: https://github.com/jpmml/jpmml-sklearn/blob/1.6.4/src/test/resources/main.py

For example: https://github.com/jpmml/jpmml-sklearn/blob/1.6.4/src/test/resources/main.py#L323

HelloLadsAndGents commented 3 years ago

Cool, thanks a lot