jpmml / jpmml-sparkml

Java library and command-line application for converting Apache Spark ML pipelines to PMML
GNU Affero General Public License v3.0

Add support for `LinearSVC` model type #55

Closed xixici closed 5 years ago

xixici commented 5 years ago

I am getting the following exceptions:

java.lang.IllegalArgumentException: Transformer class org.apache.spark.ml.classification.LinearSVCModel is not supported

java.lang.IllegalArgumentException: Transformer class org.apache.spark.ml.classification.OneVsRestModel is not supported.

I am trying to convert a LinearSVCModel into PMML. Is it not supported? Is there any way to solve this issue?

vruusmann commented 5 years ago

Class org.apache.spark.ml.classification.LinearSVC was added in Apache Spark 2.2.X: https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/ml/classification/LinearSVC.html

Its business logic is very simple:

private val margin: Vector => Double = (features) => {
  BLAS.dot(features, coefficients) + intercept
}

It should be representable as a (non-probabilistic) RegressionModel element.
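To make the margin function above concrete, here is a minimal plain-Java sketch of the same computation, with no Spark dependency. The class and method names are hypothetical. The thresholding rule (label 1.0 when the margin is strictly greater than the threshold, otherwise 0.0) is an assumption about how the margin is turned into a class label, not something stated in this thread.

```java
// Hypothetical helper, for illustration only; not part of Spark or JPMML-SparkML.
final class LinearSvcSketch {

    private LinearSvcSketch() {
    }

    /** Dot product of features and coefficients, plus the intercept. */
    static double margin(double[] features, double[] coefficients, double intercept) {
        if (features.length != coefficients.length) {
            throw new IllegalArgumentException(
                "Expected " + coefficients.length + " features, got " + features.length);
        }
        double sum = intercept;
        for (int i = 0; i < features.length; i++) {
            sum += features[i] * coefficients[i];
        }
        return sum;
    }

    /**
     * Assumption: the label is 1.0 when the margin is strictly above the
     * threshold, and 0.0 otherwise.
     */
    static double predict(double[] features, double[] coefficients,
                          double intercept, double threshold) {
        return margin(features, coefficients, intercept) > threshold ? 1.0 : 0.0;
    }
}
```

Nothing here involves support vectors or kernels. The whole model is one coefficient vector and one intercept, which is why a linear regression-style representation is enough.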

xixici commented 5 years ago

Thanks, @vruusmann. But I don't understand what you mean here. What should I do to solve my problem? I have no idea right now. From my research: 1: SVM models are supported in jpmml-evaluator, jpmml-sklearn... However, they are not supported in jpmml-sparkml. 2: The RDD-based Spark MLlib supports SVM well. Also, do you plan to address these? Please help and point me in the right direction. Thanks for your good work.

vruusmann commented 5 years ago

What should I do to solve my problem?

Two options:

  1. Write the LinearSVC converter yourself.
  2. Wait a little until I do it.

SVM models are supported in jpmml-evaluator, jpmml-sklearn... However, they are not supported in jpmml-sparkml.

From the PMML perspective, LinearSVC looks more like a RegressionModel element (http://dmg.org/pmml/v4-3/Regression.html), not a SupportVectorMachineModel element (http://dmg.org/pmml/v4-3/SupportVectorMachine.html). This is a good thing, because it means that it's really easy to build a converter for it - maybe ten to fifteen lines of Java code.
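As a rough illustration of why the RegressionModel mapping is simple, the sketch below writes a coefficient vector and an intercept out as a PMML RegressionTable fragment with one NumericPredictor per feature. This is not the JPMML-SparkML converter: the class name, method name, and field names are hypothetical, the field names are assumed to be XML-safe, and the enclosing RegressionModel element, the mining schema, and the step that turns the margin into a class label are deliberately left out.

```java
// Hypothetical helper, for illustration only; not the JPMML-SparkML converter.
final class RegressionTableSketch {

    private RegressionTableSketch() {
    }

    /**
     * Renders the linear margin as a PMML RegressionTable fragment.
     * Assumption: field names contain no characters that need XML escaping.
     */
    static String toRegressionTable(String[] fieldNames, double[] coefficients,
                                    double intercept) {
        if (fieldNames.length != coefficients.length) {
            throw new IllegalArgumentException(
                "Expected " + coefficients.length + " field names, got " + fieldNames.length);
        }
        StringBuilder sb = new StringBuilder();
        sb.append("<RegressionTable intercept=\"").append(intercept).append("\">\n");
        for (int i = 0; i < fieldNames.length; i++) {
            sb.append("  <NumericPredictor name=\"").append(fieldNames[i])
              .append("\" coefficient=\"").append(coefficients[i]).append("\"/>\n");
        }
        sb.append("</RegressionTable>");
        return sb.toString();
    }
}
```

For example, calling `RegressionTableSketch.toRegressionTable(new String[] {"x1", "x2"}, new double[] {2.0, -1.0}, 0.5)` yields a table with intercept 0.5 and two NumericPredictor entries. A SupportVectorMachineModel element, by contrast, would need kernel and support vector definitions that a linear model does not have.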

xixici commented 5 years ago

I'll go with option 2. I'm very happy to hear that. Looking forward to your commit. Thanks very much.

vruusmann commented 5 years ago

I'll go with option 2.

I'm currently working on other projects, so it could take up to two weeks before a new version of JPMML-SparkML is developed and released.