jpmml / jpmml-sklearn

Java library and command-line application for converting Scikit-Learn pipelines to PMML
GNU Affero General Public License v3.0

Failed to use numpy.log function doing math operation #151

Closed: HelloLadsAndGents closed this issue 3 years ago

HelloLadsAndGents commented 3 years ago
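from lightgbm import LGBMClassifier
from sklearn.preprocessing import OrdinalEncoder, StandardScaler
from sklearn_pandas import DataFrameMapper
from sklearn2pmml.decoration import CategoricalDomain
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml.preprocessing import ExpressionTransformer, ReplaceTransformer

mapper = DataFrameMapper([
    (["Sepal.Length"], [CategoricalDomain(), OrdinalEncoder(categories=categories)]),
    (["Sepal.Length"], [
        ReplaceTransformer("-", ""),
        ReplaceTransformer("以上", "0"),
        ExpressionTransformer("X[0] + X[0]"),
        ExpressionTransformer("numpy.log(X[0])"),  # this step raises the TypeError
        StandardScaler(),
    ]),
])

classifier = LGBMClassifier(n_estimators=5, learning_rate=0.1, num_leaves=10, max_depth=2, n_jobs=20)

pipeline = PMMLPipeline(
    [("mapper", mapper), ("classifier", classifier)],
    predict_proba_transformer=ExpressionTransformer("X[0]*200+102"),
)
pipeline.fit(df[cat_columns], df[label_column])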

Thanks for telling me about the predict_proba_transformer parameter.

However, I run into a problem when I use ExpressionTransformer("numpy.log(X[0])").

The log shows: TypeError: ['Sepal.Length']: ufunc 'log' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'' (see wrong_logs.txt)

But the documentation for sklearn2pmml.preprocessing.ExpressionTransformer says it supports: the ternary conditional expression (if/else) ... NumPy universal functions.

Can you tell me how to use it? Thanks.

vruusmann commented 3 years ago

The ReplaceTransformer transformation type returns a string value. It does not make sense to apply the logarithm function to a string.
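
For illustration, here is a minimal NumPy-only sketch of the point above. The sample values and the helper name log_of_strings are hypothetical and are not taken from the reporter's data. It shows that numpy.log rejects a string-typed array with a TypeError, and that casting to float first makes the call succeed. Inside the pipeline, the equivalent fix would presumably be a cast step placed between the ReplaceTransformer steps and the numpy.log expression. sklearn2pmml.preprocessing appears to offer a CastTransformer for that purpose, but please check this against your installed version.

```python
import numpy as np


def log_of_strings(values):
    """Cast string-typed values to float, then take the natural logarithm."""
    return np.log(np.asarray(values).astype(float))


# Hypothetical sample standing in for the string output of ReplaceTransformer.
raw = np.array(["5.1", "4.9", "4.7"])

# Applying the ufunc directly to a string array fails with a TypeError.
try:
    np.log(raw)
    failed = False
except TypeError:
    failed = True

# After an explicit cast to float, the same ufunc works.
logged = log_of_strings(raw)
```

The cast has to come before any arithmetic expression, because the logarithm is only defined for numeric input.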