iamDecode / sklearn-pmml-model

A library to parse and convert PMML models into Scikit-learn estimators.
BSD 2-Clause "Simplified" License

fix threshold vs split_value in getter #13

Closed: kodonnell closed this 4 years ago

kodonnell commented 4 years ago

Fixes #12. Note that the internal use of split_value should probably be removed completely to match sklearn, but that can be a separate PR.

iamDecode commented 4 years ago

Hey @kodonnell, thanks for your contribution!

I use split_value because the library supports categorical variables, whereas scikit-learn does not. This code is based on https://github.com/scikit-learn/scikit-learn/pull/12866.

I agree threshold should return something more sensible, but casting to a float will not show the right value for categorical variables.

I suggest returning a structure instead that, when cast to a string, yields either the threshold or the indexes of the categories.
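
A minimal sketch of what such a structure could look like. The class name `SplitDescription` and its fields are hypothetical and are not taken from sklearn-pmml-model or scikit-learn; the only assumption is that a split is either a numeric threshold or a set of category indexes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SplitDescription:
    """Hypothetical wrapper holding either a numeric threshold or category indexes."""

    threshold: Optional[float] = None
    categories: Optional[FrozenSet[int]] = None

    def __post_init__(self):
        # Exactly one of the two fields must be set.
        if (self.threshold is None) == (self.categories is None):
            raise ValueError("set exactly one of threshold or categories")

    def __str__(self):
        # Numeric split: show the threshold itself.
        if self.threshold is not None:
            return str(self.threshold)
        # Categorical split: show the category indexes in sorted order.
        return str(sorted(self.categories))


numeric = SplitDescription(threshold=0.5)
categorical = SplitDescription(categories=frozenset({2, 0, 3}))
print(numeric)
print(categorical)
```

With something like this, the getter can return one object for both kinds of split, and callers that only need a readable value can call `str()` on it without losing the category information.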

iamDecode commented 4 years ago

Superseded by 3214b291410acbb5e0abf44dbd289670ea674caa