jpmml / sklearn2pmml

Python library for converting Scikit-Learn pipelines to PMML
GNU Affero General Public License v3.0

Support for `BernoulliNB` estimator type #81

Open CONANLMN opened 6 years ago

CONANLMN commented 6 years ago

Hello, I want to save a decision tree model in PMML format, but the conversion fails. Environment: Python 2.7.13, scikit-learn 0.19.1, sklearn2pmml 0.29.

code:

#! /usr/bin/env python
#coding=utf-8
import pandas

iris_df = pandas.read_csv("Iris.csv")

from sklearn2pmml import PMMLPipeline
from sklearn.tree import DecisionTreeClassifier

iris_pipeline = PMMLPipeline([
    ("classifier", DecisionTreeClassifier())
])
iris_pipeline.fit(iris_df[iris_df.columns.difference(["Species"])], iris_df["Species"])

from sklearn2pmml import sklearn2pmml

sklearn2pmml(iris_pipeline, "DecisionTreeIris.pmml", with_repr = True)

error log:

"C:\Python27\python.exe" "F:\worksspace\python\test\Untitled 1.py"


Jan 24, 2018 11:59:36 AM org.jpmml.sklearn.Main run
INFO: Parsing PKL..
Jan 24, 2018 11:59:37 AM org.jpmml.sklearn.Main run
SEVERE: Failed to parse PKL
net.razorvine.pickle.InvalidOpcodeException: invalid pickle opcode: 254
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:355)
at org.jpmml.sklearn.PickleUtil$1.dispatch(PickleUtil.java:77)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:122)
at org.jpmml.sklearn.PickleUtil.unpickle(PickleUtil.java:98)
at org.jpmml.sklearn.Main.run(Main.java:104)
at org.jpmml.sklearn.Main.main(Main.java:94)

Exception in thread "main" net.razorvine.pickle.InvalidOpcodeException: invalid pickle opcode: 254
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:355)
at org.jpmml.sklearn.PickleUtil$1.dispatch(PickleUtil.java:77)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:122)
at org.jpmml.sklearn.PickleUtil.unpickle(PickleUtil.java:98)
at org.jpmml.sklearn.Main.run(Main.java:104)
at org.jpmml.sklearn.Main.main(Main.java:94)

Traceback (most recent call last):
  File "F:\worksspace\python\test\Untitled 1.py", line 17, in <module>
    sklearn2pmml(iris_pipeline, "DecisionTreeIris.pmml", with_repr = True)
  File "C:\Python27\lib\site-packages\sklearn2pmml\__init__.py", line 272, in sklearn2pmml
    raise RuntimeError("The JPMML-SkLearn conversion application has failed. The Java process should have printed more information about the failure into its standard output and/or error streams")
RuntimeError: The JPMML-SkLearn conversion application has failed. The Java process should have printed more information about the failure into its standard output and/or error streams

CONANLMN commented 6 years ago

I am running on Windows 7.

CONANLMN commented 6 years ago

The same script runs fine under Ubuntu 16.04.

CONANLMN commented 6 years ago

However, running it with sklearn.naive_bayes.BernoulliNB fails.

error log:

Standard output is empty
Standard error:
Jan 24, 2018 2:58:11 PM org.jpmml.sklearn.Main run
INFO: Parsing PKL..
Jan 24, 2018 2:58:11 PM org.jpmml.sklearn.Main run
INFO: Parsed PKL in 29 ms.
Jan 24, 2018 2:58:11 PM org.jpmml.sklearn.Main run
INFO: Converting..
Jan 24, 2018 2:58:11 PM org.jpmml.sklearn.Main run
SEVERE: Failed to convert
java.lang.IllegalArgumentException: Tuple contains an unsupported value (Python class sklearn.naive_bayes.BernoulliNB)
at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:43)
at org.jpmml.sklearn.TupleUtil.extractElement(TupleUtil.java:48)
at sklearn2pmml.PMMLPipeline.getEstimator(PMMLPipeline.java:369)
at sklearn2pmml.PMMLPipeline.encodePMML(PMMLPipeline.java:85)
at org.jpmml.sklearn.Main.run(Main.java:145)
at org.jpmml.sklearn.Main.main(Main.java:94)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDict to sklearn.Estimator
at java.lang.Class.cast(Class.java:3369)
at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:41)
... 5 more

Exception in thread "main" java.lang.IllegalArgumentException: Tuple contains an unsupported value (Python class sklearn.naive_bayes.BernoulliNB)
at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:43)
at org.jpmml.sklearn.TupleUtil.extractElement(TupleUtil.java:48)
at sklearn2pmml.PMMLPipeline.getEstimator(PMMLPipeline.java:369)
at sklearn2pmml.PMMLPipeline.encodePMML(PMMLPipeline.java:85)
at org.jpmml.sklearn.Main.run(Main.java:145)
at org.jpmml.sklearn.Main.main(Main.java:94)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDict to sklearn.Estimator
at java.lang.Class.cast(Class.java:3369)
at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:41)
... 5 more

vruusmann commented 6 years ago

The pickle data format is OS/platform dependent. Let me guess, the sklearn2pmml package fails on 32-bit Windows, and succeeds on 64-bit Ubuntu?
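To answer the 32-bit versus 64-bit question when reporting such a failure, it can help to print the interpreter's platform details on both machines. The following is only a minimal sketch using the standard library; `describe_runtime` is a hypothetical helper name, not part of sklearn2pmml.

```python
import platform
import struct
import sys


def describe_runtime():
    """Return the OS name, the pointer width in bits, and the Python version."""
    return {
        "os": platform.system(),
        # Size of a C pointer in bytes times 8: 32 or 64 on common builds.
        "bits": struct.calcsize("P") * 8,
        "python": "%d.%d.%d" % sys.version_info[:3],
    }


info = describe_runtime()
print(info)
```

Running this on the Windows 7 and Ubuntu 16.04 machines and comparing the `bits` values would confirm or rule out the guess above.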

The unpickling is actually handled by the Pyrolite library. The first exception ("invalid pickle opcode: 254") should be directed to the Pyrolite project, not here.
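As a rough illustration of why the unpickler rejects this byte, the standard `pickletools` module lists every opcode that the pickle protocol defines, and 254 is not among them. This sketch only checks the opcode table of the Python interpreter it runs on; it does not reproduce how Pyrolite or the converter reads the file, and `is_valid_pickle_opcode` is a hypothetical helper name.

```python
import pickletools

# Byte values of all opcodes known to this interpreter's pickle implementation.
KNOWN_OPCODES = {ord(op.code) for op in pickletools.opcodes}


def is_valid_pickle_opcode(byte_value):
    """Return True if byte_value is a defined pickle opcode."""
    return byte_value in KNOWN_OPCODES


print(is_valid_pickle_opcode(254))   # the byte from the error message
print(is_valid_pickle_opcode(0x80))  # PROTO, the protocol-version marker
```

A byte outside the opcode table suggests the reader has ended up in the middle of non-pickle data, such as an embedded raw buffer, rather than at a genuine opcode; that interpretation is an assumption and would need to be confirmed against the actual file.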

As for the second exception ("Tuple contains an unsupported value (Python class sklearn.naive_bayes.BernoulliNB)"), the conversion of the sklearn.naive_bayes.BernoulliNB model type is currently not implemented.
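One way to catch this kind of failure before launching the Java converter is to compare each pipeline step's class name against a list of supported types. The sketch below is an assumption-heavy illustration: sklearn2pmml does not expose such a list through any API shown in this thread, so `supported_names` must be filled in by hand from the project documentation, and both helper functions and the stand-in classes are hypothetical.

```python
def qualified_name(obj):
    """Return 'module.ClassName' for an object, as in the converter's error message."""
    cls = type(obj)
    return "%s.%s" % (cls.__module__, cls.__name__)


def find_unsupported_steps(steps, supported_names):
    """Return (step_name, class_name) pairs whose class is not in supported_names."""
    return [
        (name, qualified_name(step))
        for name, step in steps
        if qualified_name(step) not in supported_names
    ]


# Stand-in classes so the sketch runs without scikit-learn installed.
class SupportedModel(object):
    pass


class UnsupportedModel(object):
    pass


# Assumption: this set is maintained by hand; it is not provided by sklearn2pmml.
supported = {qualified_name(SupportedModel())}
steps = [("tree", SupportedModel()), ("nb", UnsupportedModel())]

missing = find_unsupported_steps(steps, supported)
print(missing)
```

With a real pipeline, `steps` would be the pipeline's `steps` attribute, which holds `(name, estimator)` pairs.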

CONANLMN commented 6 years ago

Hi Villu,

Thank you for your reply.

pamdla commented 2 years ago

Similarly, MultinomialNB and GaussianNB are not implemented either. Could you guide us on how to implement converters for scikit-learn classifiers that exist but are not yet supported?