jpmml / jpmml-evaluator-spark

PMML evaluator library for the Apache Spark cluster computing system (http://spark.apache.org/)
GNU Affero General Public License v3.0

java.lang.AbstractMethodError when calling pmmlTransformer.transform #10

Closed JuMan0603 closed 6 years ago

JuMan0603 commented 6 years ago

I get a java.lang.AbstractMethodError when testing jpmml-evaluator-spark locally. It is thrown by the call to pmmlTransformer.transform. Here is the code:

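// EvaluatorUtil and TransformerBuilder come from jpmml-evaluator-spark;
// their package depends on the library version, so that import is omitted here.
import java.io.File
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}
import scala.collection.mutable.ArrayBuffer

def pmmlPredict(spark: SparkSession, pmmlModelSavePath: String, predictData: String, predictResultSavePath: String): Unit = {
  val fs = FileSystem.get(new Configuration())
  // Load the PMML model and build a Spark ML transformer from it.
  val evaluator = EvaluatorUtil.createEvaluator(new File("E:/testModel/pmmlModel/pipelinePMMLModel.xml"))
  val pmmlTransformerBuilder = new TransformerBuilder(evaluator).withTargetCols().withOutputCols().exploded(true)
  val pmmlTransformer = pmmlTransformerBuilder.build()

  // Build the input schema from the model's active fields.
  val fields = new ArrayBuffer[StructField]
  val it = evaluator.getActiveFields.iterator()
  while (it.hasNext) {
    // += appends in place (:+ returns a new buffer and leaves `fields` empty).
    // DoubleType matches the Double values parsed from the input below.
    fields += StructField(it.next().getName.getValue, DoubleType, true)
  }
  val schema = StructType(fields)

  // Parse each comma-separated line into a Row of Doubles.
  val predictStringRDD = spark.sparkContext.textFile(predictData)
  val predictRowRDD = predictStringRDD.map(_.split(",").map(_.toDouble)).map(Row.fromSeq(_))
  val predictDF = spark.createDataFrame(predictRowRDD, schema)

  val predictResultDF = pmmlTransformer.transform(predictDF)

  predictResultDF.write.csv(predictResultSavePath)

  predictResultDF.show()
}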

The full stack trace is:

Exception in thread "main" java.lang.AbstractMethodError: org.apache.spark.ml.Transformer.transform(Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/Dataset;
    at org.apache.spark.ml.PipelineModel$$anonfun$transform$1.apply(Pipeline.scala:299)
    at org.apache.spark.ml.PipelineModel$$anonfun$transform$1.apply(Pipeline.scala:299)
    at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57)
    at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66)
    at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:186)
    at org.apache.spark.ml.PipelineModel.transform(Pipeline.scala:299)
    at com.myhexin.oryx.TestPMML$.pmmlPredict(TestPMML.scala:154)
    at com.myhexin.oryx.TestPMML$.main(TestPMML.scala:42)
    at com.myhexin.oryx.TestPMML.main(TestPMML.scala)

TestPMML.scala:154 is the line "val predictResultDF = pmmlTransformer.transform(predictDF)". I don't know why this error occurs.

vruusmann commented 6 years ago

The JPMML-Evaluator-Spark library currently exists in two configurations:

  1. The 1.0-SNAPSHOT development branch (and the 1.0.0 version), for Spark 1.5.X and 1.6.X. The argument type of Transformer.transform is DataFrame.
  2. The 1.1-SNAPSHOT development branch, for Spark 2.0 and newer. The argument type of Transformer.transform is Dataset.

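The method missing in your stack trace takes a Dataset, and your code uses SparkSession, so you are running on Spark 2.0 or newer. That means you are using the 1.0-SNAPSHOT development branch (or the 1.0.0 version) on Spark 2.X. Please switch to the 1.1-SNAPSHOT development branch.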
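
The two configurations listed above can be turned into a small sanity check that runs before the transformer is built. The following is only a sketch: BranchCheck and branchFor are hypothetical names and not part of JPMML-Evaluator-Spark, and the mapping simply restates the list above, so it may not hold for later releases of the library.

```scala
import scala.util.Try

// Hypothetical helper, not part of JPMML-Evaluator-Spark.
object BranchCheck {

  // Parses the leading digits of one version component, e.g. "0-SNAPSHOT" -> 0.
  private def leadingInt(s: String): Option[Int] =
    Try(s.takeWhile(_.isDigit).toInt).toOption

  /** Maps a Spark version string to the development branch named in the
    * list above, or None if the version is not covered by that list. */
  def branchFor(sparkVersion: String): Option[String] = {
    val parts = sparkVersion.split("\\.")
    val major = parts.headOption.flatMap(leadingInt)
    val minor = parts.lift(1).flatMap(leadingInt)
    (major, minor) match {
      case (Some(1), Some(5)) | (Some(1), Some(6)) => Some("1.0-SNAPSHOT") // DataFrame argument
      case (Some(m), _) if m >= 2                  => Some("1.1-SNAPSHOT") // Dataset argument
      case _                                       => None
    }
  }
}
```

With a running session, calling BranchCheck.branchFor(spark.version) (or sc.version on Spark 1.X) and comparing the result with the version of the library on the classpath should catch this kind of mismatch before it shows up as an AbstractMethodError.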