jpmml / jpmml-sparkml

Java library and command-line application for converting Apache Spark ML pipelines to PMML
GNU Affero General Public License v3.0

error with CategoricalLabel cast to ContinuousLabel #91

Closed Mantj closed 4 years ago

Mantj commented 4 years ago

Can someone help me with this error:

Exception in thread "main" java.lang.ClassCastException: org.jpmml.converter.CategoricalLabel cannot be cast to org.jpmml.converter.ContinuousLabel
    at org.jpmml.xgboost.ObjFunction.createMiningModel(ObjFunction.java:58)
    at org.jpmml.xgboost.LinearRegression.encodeMiningModel(LinearRegression.java:30)
    at org.jpmml.xgboost.GBTree.encodeMiningModel(GBTree.java:74)
    at org.jpmml.xgboost.Learner.encodeMiningModel(Learner.java:160)
    at org.jpmml.sparkml.xgboost.BoosterUtil.encodeBooster(BoosterUtil.java:81)
    at org.jpmml.sparkml.xgboost.XGBoostClassificationModelConverter.encodeModel(XGBoostClassificationModelConverter.java:39)
    at org.jpmml.sparkml.xgboost.XGBoostClassificationModelConverter.encodeModel(XGBoostClassificationModelConverter.java:27)
    at org.jpmml.sparkml.ModelConverter.registerModel(ModelConverter.java:172)
    at org.jpmml.sparkml.PMMLBuilder.build(PMMLBuilder.java:116)

This error occurs when I save my trained PipelineModel (VectorAssembler, XGBoostClassifier). The label of XGBoostClassifier is binary, and these are my dependency versions:

<dependency>
      <groupId>org.jpmml</groupId>
      <artifactId>jpmml-xgboost</artifactId>
      <version>1.3.11</version>
</dependency>
<dependency>
      <groupId>org.jpmml</groupId>
      <artifactId>jpmml-sparkml</artifactId>
      <version>1.4.11</version>
</dependency>

vruusmann commented 4 years ago

Are you using the JPMML-SparkML-XGBoost library?

> The label of XGBoostClassifier is binary,

For classification, the label should be a categorical binary field, but in your case it is a continuous binary field.

vruusmann commented 4 years ago

TL;DR: you have a mismatch between your label column type, the XGBoost objective function, and XGBoostClassifier.

Most probably, you're using a regression objective function with XGBoostClassifier.
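
To illustrate the mismatch, here is a minimal stand-alone sketch of a pre-flight check you could run on the objective string before building the PMML. It calls no XGBoost or JPMML API. The class `ObjectiveCheck` and its methods are hypothetical names made up for this example. The objective names are taken from the XGBoost parameter documentation, and the list is not exhaustive. The `LinearRegression` frame in the stack trace suggests a regression objective such as `reg:linear` or `reg:squarederror`, which is what the check rejects.

```java
import java.util.Set;

// Hypothetical helper for illustration only; not part of XGBoost,
// XGBoost4J-Spark, or any JPMML library.
final class ObjectiveCheck {

    // Classification objectives as named in the XGBoost parameter docs
    // (assumed subset; extend as needed).
    private static final Set<String> CLASSIFICATION_OBJECTIVES = Set.of(
        "binary:logistic",
        "binary:logitraw",
        "binary:hinge",
        "multi:softmax",
        "multi:softprob"
    );

    // Returns true if the objective string names a classification objective.
    static boolean isClassificationObjective(String objective) {
        return CLASSIFICATION_OBJECTIVES.contains(objective);
    }

    // Fails fast if a classifier is about to be trained or converted
    // with a non-classification objective.
    static void requireClassificationObjective(String objective) {
        if (!isClassificationObjective(objective)) {
            throw new IllegalArgumentException(
                "Objective '" + objective + "' is not a classification objective; "
                + "a classifier with a binary label would typically use 'binary:logistic'");
        }
    }

    public static void main(String[] args) {
        // Passes silently.
        requireClassificationObjective("binary:logistic");

        // A regression objective is rejected.
        try {
            requireClassificationObjective("reg:squarederror");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the check fails for your model, the likely fix is to retrain XGBoostClassifier with a classification objective such as `binary:logistic`, then rebuild the PMML.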