jpmml / sklearn2pmml

Python library for converting Scikit-Learn pipelines to PMML
GNU Affero General Public License v3.0

Support for Python power operator `**` #390

Closed jsgarcesc closed 1 year ago

jsgarcesc commented 1 year ago

I am trying to add an exponential function 'X[0]**X[1]' with ExpressionTransformer, where X[0] and X[1] are columns calculated in previous steps. The transformer is part of a Python pipeline that is converted to PMML using sklearn2pmml, but I got the message:

Standard output is empty
Standard error:
Exception in thread "main" java.lang.IllegalArgumentException: Python expression 'X[0]*X[1]' is either invalid or not supported
    at org.jpmml.python.ExpressionTranslator.translateExpression(ExpressionTranslator.java:86)
    at org.jpmml.python.ExpressionTranslator.translateExpression(ExpressionTranslator.java:75)
    at sklearn2pmml.util.EvaluatableUtil.translateExpression(EvaluatableUtil.java:73)
    at sklearn2pmml.util.EvaluatableUtil.translateExpression(EvaluatableUtil.java:56)
    at sklearn2pmml.preprocessing.ExpressionTransformer.encodeFeatures(ExpressionTransformer.java:73)
    at sklearn.Transformer.encode(Transformer.java:72)
    at sklearn.compose.ColumnTransformer.encodeFeatures(ColumnTransformer.java:59)
    at sklearn.Transformer.encode(Transformer.java:72)
    at sklearn.Composite.encodeFeatures(Composite.java:119)
    at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:153)
    at com.sklearn2pmml.Main.run(Main.java:91)
    at com.sklearn2pmml.Main.main(Main.java:66)
Caused by: org.jpmml.python.ParseException: Encountered unexpected token: "" "*" at line 1, column 6. Was expecting one of: "(" "+" "-" "False" "None" "True" "\"\"\""

I also tried the "PowerFunctionTransformer", but it only accepts integers. I would appreciate any help finding a solution.

vruusmann commented 1 year ago

> I am trying to add an exponential function 'X[0]**X[1]' with ExpressionTransformer.

Looks like the Python power operator `**` is currently not supported.

The expression parser component should be updated.

> I appreciate if you can help me find any solution.

Use a named function instead:

  1. math.pow(X[0], X[1])
  2. numpy.power(X[0], X[1])
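
A minimal sketch of the workaround, for illustration only. It checks in plain Python that the `math.pow` spelling gives the same values as `**` on a few sample rows, then passes the same expression string to `ExpressionTransformer`. The sample rows and the `evaluate` helper are hypothetical, and the last step assumes sklearn2pmml is installed. Note that the two spellings can differ on edge cases: for a negative base with a fractional exponent, `math.pow` raises `ValueError` while `**` returns a complex number.

```python
import math

# Hypothetical sample rows: each tuple stands in for the columns X[0] and X[1].
rows = [(2.0, 3.0), (9.0, 0.5), (10.0, -1.0)]

# The spelling rejected by the expression parser, and the suggested replacement.
power_operator_expr = "X[0] ** X[1]"
named_function_expr = "math.pow(X[0], X[1])"


def evaluate(expr, X):
    # Illustration-only helper: evaluate an expression string against one row.
    return eval(expr, {"math": math}, {"X": X})


# Both spellings agree on these rows (positive bases only).
for X in rows:
    assert math.isclose(evaluate(power_operator_expr, X), evaluate(named_function_expr, X))

# Assumption: sklearn2pmml is installed. The transformer receives the
# named-function expression instead of the one using `**`.
try:
    from sklearn2pmml.preprocessing import ExpressionTransformer
    transformer = ExpressionTransformer(named_function_expr)
except ImportError:
    transformer = None
```

The `numpy.power(X[0], X[1])` variant is used the same way: only the expression string changes.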