jpmml / jpmml-sklearn

Java library and command-line application for converting Scikit-Learn pipelines to PMML
GNU Affero General Public License v3.0

Error casting numpy.core.Scalar to java.lang.Number in gradient boosting regressor learning rate #96

Closed moteleolu closed 5 years ago

moteleolu commented 5 years ago

@vruusmann This is related to what you said here, and to the fix made here for nearest neighbors. When I use gradient boosted trees (GradientBoostingRegressor) with BayesSearchCV, the learning rate gets wrapped as a NumPy scalar and encoding the model fails:

SEVERE: Failed to convert
java.lang.ClassCastException: numpy.core.Scalar cannot be cast to java.lang.Number
    at sklearn.ensemble.gradient_boosting.GradientBoostingRegressor.getLearningRate(GradientBoostingRegressor.java:67)
    at sklearn.ensemble.gradient_boosting.GradientBoostingRegressor.encodeModel(GradientBoostingRegressor.java:59)
    at sklearn.ensemble.gradient_boosting.GradientBoostingRegressor.encodeModel(GradientBoostingRegressor.java:32)
    at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:215)
    at org.jpmml.sklearn.Main.run(Main.java:145)
    at org.jpmml.sklearn.Main.main(Main.java:94)

vruusmann commented 5 years ago

My integration testing code uses GradientBoostingRegressor directly, and the value of the learning_rate attribute shows up on the Java side as java.lang.Number.

The conversion from a plain Number to a NumPy scalar must be happening because of BayesSearchCV.

Needs more systematic investigation and fixing.
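
Until the converter handles this case, one possible Python-side workaround is to convert NumPy scalars back to built-in Python numbers before the model is pickled. The sketch below is only an illustration under assumptions: `to_builtin_params` and the `tuned` dictionary are hypothetical names and values, not part of sklearn2pmml or scikit-optimize, and the hyperparameters stand in for what a search might return. It also shows that a NumPy scalar pickles as a NumPy object rather than as a plain float, which is presumably why the Java side sees numpy.core.Scalar.

```python
import pickle

import numpy as np


def to_builtin_params(params):
    """Return a copy of params with NumPy scalars replaced by Python builtins.

    Hypothetical helper for illustration; np.generic is the base class of
    all NumPy scalar types, and .item() returns the equivalent Python value.
    """
    return {
        name: value.item() if isinstance(value, np.generic) else value
        for name, value in params.items()
    }


# Hypothetical hyperparameters, wrapped as NumPy scalars the way a
# hyperparameter search might hand them back (assumed values).
tuned = {
    "learning_rate": np.float64(0.05),
    "n_estimators": np.int64(200),
    "max_depth": 3,
}

clean = to_builtin_params(tuned)

# After conversion the values are plain Python float and int objects.
print(type(clean["learning_rate"]), type(clean["n_estimators"]))

# The NumPy scalar's pickle references NumPy; the plain float's does not.
print(b"numpy" in pickle.dumps(tuned["learning_rate"]))
print(b"numpy" in pickle.dumps(clean["learning_rate"]))
```

With a fitted scikit-learn estimator, the same idea could be applied as `estimator.set_params(**to_builtin_params(estimator.get_params(deep=False)))` before constructing the pipeline for export. Whether that is enough to avoid the ClassCastException in this particular setup is an assumption, not something confirmed in this thread.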