jpmml / jpmml-converter

Java library for authoring PMML
GNU Affero General Public License v3.0

SVM's classificationMethod is always "OneAgainstOne" #12

Closed rymmonlu closed 5 years ago

rymmonlu commented 5 years ago

Not sure this is the right place to ask this question. When I try to export an SVC model (a pipeline) with sklearn2pmml, I always get a PMML document with classificationMethod="OneAgainstOne", even though I explicitly specify decision_function_shape="ovr" in Python. The pipeline is defined as follows:

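from sklearn.svm import SVC
from sklearn_pandas import DataFrameMapper
from sklearn2pmml.decoration import ContinuousDomain
from sklearn2pmml.pipeline import PMMLPipeline

# create pipeline; feat_names (defined elsewhere) holds the input feature columns
model_pipeline = PMMLPipeline([
    ("mapper", DataFrameMapper([
        (feat_names, [ContinuousDomain(with_data=False)])
    ])),
    ("SVC", SVC(probability=True, random_state=2018, decision_function_shape="ovr"))
])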

After checking the source code at converter/support_vector_machine/LibSVMUtil.java:116, I found that SupportVectorMachineModel.ClassificationMethod is initialized to ONE_AGAINST_ONE and is never reset based on the decision_function_shape set in Python.

Please correct me if I missed anything. Many thanks for your help. You made a great project!

vruusmann commented 5 years ago

@rymmonlu What exactly is the problem? Is the generated PMML document making incorrect predictions?

What difference does it make how a Python data structure is translated to a PMML data structure, for as long as the translation preserves the properties of the original prediction algorithm?

vruusmann commented 5 years ago

I've now run my SVC integration tests with decision_function_shape="ovr" and decision_function_shape="ovo", and verified that the current JPMML-Converter/JPMML-SkLearn/SkLearn2PMML software stack is generating PMML documents that make correct predictions when evaluated with the JPMML-Evaluator library.
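A check of this kind can be sketched on the scikit-learn side alone. The snippet below is only an illustration, not the integration test mentioned above: the synthetic four-class dataset and the variable names are assumptions, and it does not involve PMML or JPMML-Evaluator. It fits the same SVC twice, once per decision_function_shape, and compares the predicted labels and the shape of the decision function output.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic 4-class dataset (an assumption for illustration only).
X, y = make_blobs(n_samples=200, centers=4, random_state=2018)

# Same estimator, differing only in decision_function_shape.
ovr = SVC(decision_function_shape="ovr", random_state=2018).fit(X, y)
ovo = SVC(decision_function_shape="ovo", random_state=2018).fit(X, y)

# Predicted labels should not depend on the setting (break_ties is left at its default).
same_predictions = np.array_equal(ovr.predict(X), ovo.predict(X))

# Only the shape of decision_function() changes:
# "ovr" yields one column per class, "ovo" one column per pair of classes.
ovr_shape = ovr.decision_function(X).shape
ovo_shape = ovo.decision_function(X).shape
```

With four classes, the "ovo" output has 4 * 3 / 2 = 6 columns and the "ovr" output has 4. The two fitted models should still predict the same labels, which is consistent with the converter not needing to distinguish the two settings.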

The JPMML family of libraries chooses to use the ONE_AGAINST_ONE encoding scheme for LibSVM-based SVM models, because it leads to the most compact/readable PMML representation.
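For readers unfamiliar with the scheme, here is a minimal sketch of how a one-against-one ensemble turns pairwise decisions into a single class label. It is plain Python, not JPMML or LibSVM code. The function name, the class labels, and the decision values are hypothetical. The sign convention (a positive value votes for the first class of the pair) and the tie-breaking rule (earliest class wins) are assumptions made for the illustration.

```python
from itertools import combinations

def one_against_one_predict(classes, pairwise_decisions):
    """Combine pairwise binary decisions into one label by majority vote.

    pairwise_decisions maps (first, second) class pairs to a decision value;
    a positive value is assumed to vote for `first`, otherwise for `second`.
    """
    votes = {label: 0 for label in classes}
    for first, second in combinations(classes, 2):
        value = pairwise_decisions[(first, second)]
        winner = first if value > 0 else second
        votes[winner] += 1
    # max() returns the first maximal element, so ties go to the earliest class.
    predicted = max(classes, key=lambda label: votes[label])
    return predicted, votes

# Hypothetical decision values for a three-class problem.
classes = ["a", "b", "c"]
decisions = {("a", "b"): -0.7, ("a", "c"): 0.4, ("b", "c"): 1.2}
predicted, votes = one_against_one_predict(classes, decisions)
```

In this example "b" wins both of its pairwise comparisons and is returned as the prediction. A model with k classes needs k * (k - 1) / 2 such binary machines.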

rymmonlu commented 5 years ago

Thanks for confirming the "ONE_AGAINST_ONE" behavior. BTW, my prediction results look OK, just as in your tests.