jpmml / pyspark2pmml

Python library for converting Apache Spark ML pipelines to PMML
GNU Affero General Public License v3.0

Pyspark "Multinomial logistic regression" InterceptVector Problem #7

Closed sbourzai closed 7 years ago

sbourzai commented 7 years ago

Hi Sir, I am trying to generate a PMML file for a multinomial logistic regression model trained on the Iris data, but I think the function toPMMLBytes does not take the interceptVector into account. I think the problem is at this level: ----> 8 javaPipelineModel = pipelineModel._to_java()

I get this message:

Py4JJavaError: An error occurred while calling z:org.jpmml.sparkml.ConverterUtil.toPMMLByteArray.
: org.apache.spark.SparkException: Multinomial models contain a vector of intercepts, use interceptVector instead.
	at org.apache.spark.ml.classification.LogisticRegressionModel.intercept(LogisticRegression.scala:731)

NB: I was able to generate the PMML model with the Titanic data (binary logistic regression).

Do you have any idea?

Thanks, Kind Regards

vruusmann commented 7 years ago

Multinomial logistic regression was introduced in Apache Spark 2.1.X.

You must use a JPMML-SparkML version that is compatible with your Apache Spark version; see the version compatibility matrix in the README file: https://github.com/jpmml/jpmml-sparkml#library
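For context on why the exception mentions interceptVector: a multinomial model keeps one intercept per class, so there is no single scalar for the intercept accessor to return, and a converter written before multinomial support existed will hit that accessor. The following is a minimal plain-Python sketch of the idea, not Spark or JPMML code. The function names and all numbers are made up for illustration; only the names coefficient_matrix and intercept_vector are chosen to mirror Spark's coefficientMatrix and interceptVector attributes.

```python
import math


def softmax(scores):
    """Turn per-class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(coefficient_matrix, intercept_vector, features):
    """Multinomial logistic regression: one coefficient row and one intercept per class."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(coefficient_matrix, intercept_vector)
    ]
    return softmax(scores)


# Hypothetical parameters for 3 classes and 4 features (not fitted to any real data).
coefficient_matrix = [
    [0.5, 1.2, -2.0, -1.0],
    [0.1, -0.4, 0.3, -0.2],
    [-0.6, -0.8, 1.7, 1.2],
]
intercept_vector = [1.0, 0.5, -1.5]  # one intercept per class, hence a vector

probs = predict_proba(coefficient_matrix, intercept_vector, [5.1, 3.5, 1.4, 0.2])
predicted_class = probs.index(max(probs))
```

A binary model needs only one coefficient row and one intercept, which is presumably why the Titanic pipeline converted without hitting this exception.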

sbourzai commented 7 years ago

Thanks @vruusmann. I'm using PySpark, and the build doesn't produce the egg file!