Code to reproduce:
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris
# Note: sklearn.externals.joblib was removed in scikit-learn 0.23; use "import joblib" there.
from sklearn.externals import joblib
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml import sklearn2pmml

iris_data = load_iris()
X, y = iris_data.data, iris_data.target

# Passing n_neighbors as a numpy scalar is what triggers the conversion failure.
knn = KNeighborsClassifier(n_neighbors=np.int64(5))
knn.fit(X, y)

pipeline = PMMLPipeline([("knn", knn)])
pipeline.active_fields = np.array(iris_data.feature_names)
joblib.dump(pipeline, 'pipeline.pkl.z', compress=9)
sklearn2pmml(pipeline, "KNNFit_py.pmml", debug=True)
Expectation:
The PMML file is saved successfully, without any error.
Actual Output:
Error
RuntimeError: The JPMML-SkLearn conversion application has failed.
The Java executable should have printed more information about the failure into
its standard output and/or standard error streams
When running the Java version:
Apr 19, 2018 12:30:22 PM org.jpmml.sklearn.Main run
SEVERE: Failed to convert
java.lang.ClassCastException: numpy.core.Scalar cannot be cast to java.lang.Number
at sklearn.neighbors.KNeighborsClassifier.getNumberOfNeighbors(KNeighborsClassifier.java:70)
at sklearn.neighbors.KNeighborsUtil.encodeNeighbors(KNeighborsUtil.java:130)
at sklearn.neighbors.KNeighborsClassifier.encodeModel(KNeighborsClassifier.java:57)
at sklearn.neighbors.KNeighborsClassifier.encodeModel(KNeighborsClassifier.java:32)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:161)
at org.jpmml.sklearn.Main.run(Main.java:145)
at org.jpmml.sklearn.Main.main(Main.java:94)
Workaround:
Replace the np.int64 value with a plain Python int when constructing the estimator:
knn = KNeighborsClassifier(n_neighbors=5)
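When the parameter values come from somewhere else (for example, the result of a search over numpy ranges), the same workaround can be applied programmatically. The following is only a minimal sketch: `to_native`, `params_to_native`, and the `raw_params` dict are hypothetical names introduced for illustration, and are not part of sklearn2pmml or scikit-learn. It relies on `np.generic` being the base class of numpy scalar types and on `.item()` returning the equivalent Python scalar.

```python
import numpy as np


def to_native(value):
    """Return a plain Python scalar for numpy scalars; pass other values through."""
    # np.generic is the common base class of numpy scalar types such as np.int64.
    if isinstance(value, np.generic):
        return value.item()
    return value


def params_to_native(params):
    """Apply to_native to every value of a flat parameter dict (hypothetical helper)."""
    return {name: to_native(value) for name, value in params.items()}


# Hypothetical parameters, e.g. as they might come out of a search over numpy ranges.
raw_params = {"n_neighbors": np.int64(5), "weights": "uniform", "p": np.int32(2)}
clean_params = params_to_native(raw_params)

# Show the resulting Python types of each parameter value.
print({name: type(value).__name__ for name, value in clean_params.items()})
```

The cleaned dict can then be passed to the estimator, e.g. `KNeighborsClassifier(**clean_params)`, so that only plain Python values end up in the pickled pipeline.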
Issue:
I think jpmml-sklearn should handle numpy scalars by converting them to the appropriate Java number type based on their dtype (at least for simple cases such as int32 and int64). In examples like the one in the Stack Overflow question, it is common to use numpy to generate the ranges, intervals, etc. to search over, and the resulting numpy scalar parameter values trigger this error.
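Until the converter handles numpy scalars, one way to avoid the problem at the source is to build the search ranges from native Python values. This is a sketch under assumptions: the range `1, 10, 2` and the `param_grid` dict are hypothetical and only illustrate the idea; `ndarray.tolist()` is the numpy call that converts elements to Python scalars.

```python
import numpy as np

# Iterating over a numpy array yields numpy scalars (np.int64 on most platforms).
numpy_candidates = list(np.arange(1, 10, 2))

# tolist() converts the elements to plain Python ints instead.
native_candidates = np.arange(1, 10, 2).tolist()

# Hypothetical search grid built from native values only.
param_grid = {"n_neighbors": native_candidates}
print(param_grid)
```

A grid defined this way should hand plain ints to the estimator during a parameter search, so the selected value would not need converting before export.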
Hi, looking at here, I have compiled a list of supported estimators and transformers which don't work when the following parameters are wrapped with a numpy scalar:
Related to this Stack Overflow question: https://stackoverflow.com/q/49913330/3374996
Looks like this is the inverse case of the [issue discussed here](https://github.com/jpmml/jpmml-sklearn/issues/61).