jpmml / pyspark2pmml

Python library for converting Apache Spark ML pipelines to PMML
GNU Affero General Public License v3.0

Exception while running pyspark2pmml #22

Closed avisheknag17 closed 5 years ago

avisheknag17 commented 5 years ago

Hi,

My Spark version is 2.2.1. I am running pyspark2pmml with the following command:

./spark-submit --jars /opt/spark-2.2.1-master/bin/jpmml-sparkml-executable-1.3.14.jar enclosures_pyspark_model.py

My code looks like this:

from pyspark2pmml import PMMLBuilder

pmmlBuilder = PMMLBuilder(sc, train_df, pl) \
    .putOption(classifier, "compact", True)

pmmlBuilder.buildFile("enclosures_pyspark_model.pmml")

When I run it, it throws the following exception:

Traceback (most recent call last):
  File "/opt/spark-2.2.1-master/bin/enclosures_pyspark_model.py", line 125, in <module>
    build_and_train_model()
  File "/opt/spark-2.2.1-master/bin/enclosures_pyspark_model.py", line 118, in build_and_train_model
    pmmlBuilder = PMMLBuilder(sc, train_df, pl) \
  File "/Users/avnag/Library/Python/2.7/lib/python/site-packages/pyspark2pmml/__init__.py", line 15, in __init__
    javaPmmlBuilder = javaPmmlBuilderClass(javaSchema, javaPipelineModel)
  File "/opt/spark-2.2.1-master/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1401, in __call__
  File "/opt/spark-2.2.1-master/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/opt/spark-2.2.1-master/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 323, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling None.org.jpmml.sparkml.PMMLBuilder. Trace:
py4j.Py4JException: Constructor org.jpmml.sparkml.PMMLBuilder([class org.apache.spark.sql.types.StructType, class org.apache.spark.ml.Pipeline]) does not exist
    at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
    at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
    at py4j.Gateway.invoke(Gateway.java:235)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)

=============================================

Is there anything else I need to do? Why can't it find the constructor? I believe I am using the right version.

avisheknag17 commented 5 years ago

Sorry, I solved it.

I was passing the Pipeline instance instead of PipelineModel.
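
For anyone hitting the same error, here is a minimal sketch of that kind of change. It is not the exact code from my script. The helper names as_pipeline_model and export_pmml are made up for illustration, and the sketch assumes sc, train_df and pl are the SparkContext, training DataFrame and Pipeline from the snippet above. The idea is to call fit() on the Pipeline first and hand the resulting PipelineModel to PMMLBuilder.

```python
def as_pipeline_model(pipeline_or_model, train_df):
    """Hypothetical helper: return a fitted model for PMMLBuilder.

    Estimators such as pyspark.ml.Pipeline expose fit(); an already
    fitted PipelineModel does not, so it is returned unchanged.
    """
    if hasattr(pipeline_or_model, "fit"):
        return pipeline_or_model.fit(train_df)
    return pipeline_or_model


def export_pmml(sc, train_df, pipeline_or_model, path):
    """Hypothetical wrapper: fit if needed, then write the PMML file."""
    # Imported lazily so as_pipeline_model can be used on its own.
    from pyspark2pmml import PMMLBuilder

    pipeline_model = as_pipeline_model(pipeline_or_model, train_df)
    # PMMLBuilder needs the fitted PipelineModel, not the Pipeline itself;
    # passing the Pipeline produces the "Constructor ... does not exist" error.
    PMMLBuilder(sc, train_df, pipeline_model).buildFile(path)
    return pipeline_model
```

With the names from my script, the call would look like export_pmml(sc, train_df, pl, "enclosures_pyspark_model.pmml"). The JPMML-SparkML jar still has to be passed to spark-submit via --jars as before.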

aliware50 commented 4 years ago

Could you post the code you changed? I have the same question. Thanks.