guillermo-navas-palencia / optbinning

Optimal binning: monotonic binning with constraints. Support batch & stream optimal binning. Scorecard modelling and counterfactual explanations.
http://gnpalencia.org/optbinning/
Apache License 2.0

Negative values can lead to Scorecard failure #284

Closed dslove closed 7 months ago

dslove commented 7 months ago

TBH, I am not sure I am using BinningProcess and Scorecard correctly, but I have run into a problem:

If my training dataset contains many -1 values, scorecard.fit() fails. More specifically, it fails at this point:

        X_t = self.binning_process_.fit_transform(
            X[self.binning_process.variable_names], y, sample_weight, metric,
            metric_special, metric_missing, show_digits, check_input)

in scorecard.py. The failure is not a Python exception or any other error raised by this call. Instead, the call above returns an empty X_t, which then causes an exception in the code that follows.

With the same training dataset, if I replace every -1 with 0, both the binning and the scoring run successfully.

I am not sure whether this is a known issue or whether I am missing something. Any advice would be appreciated.

I am using version 0.18.0, BTW.

bmreiniger commented 7 months ago

Can you provide a minimal reproducible example?

dslove commented 7 months ago

Closing this ticket because the cause has been found: the selection criteria I set were too strict.
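
For anyone who hits the same symptom, here is a rough sketch of how an overly strict selection threshold can leave no variables at all, which would explain an empty X_t. This is not optbinning's code: `information_value`, `select_variables`, the bin counts, and the thresholds are all hypothetical, and the filter is just a plain minimum-IV cut standing in for whatever criteria were actually configured.

```python
import math


def information_value(bins):
    """Information value of one variable from per-bin (non_event, event) counts.

    Assumes every bin has at least one event and one non-event,
    so no ratio is zero or undefined.
    """
    total_non_event = sum(non_event for non_event, _ in bins)
    total_event = sum(event for _, event in bins)
    iv = 0.0
    for non_event, event in bins:
        p_non_event = non_event / total_non_event
        p_event = event / total_event
        iv += (p_non_event - p_event) * math.log(p_non_event / p_event)
    return iv


def select_variables(binned, min_iv):
    """Keep only the variables whose IV reaches the threshold (hypothetical filter)."""
    return [name for name, bins in binned.items()
            if information_value(bins) >= min_iv]


# Made-up bin counts for two illustrative features.
binned = {
    "feature_a": [(400, 100), (300, 200)],  # some predictive power
    "feature_b": [(350, 150), (350, 150)],  # identical bins, IV is 0
}

loose = select_variables(binned, min_iv=0.02)
strict = select_variables(binned, min_iv=0.5)

print("loose threshold keeps:", loose)
print("strict threshold keeps:", strict)

if not strict:
    # Nothing survived the filter, so any transformed matrix built from
    # the selected variables would have no columns.
    print("No variables selected; relax the selection criteria.")
```

Checking which variables survive selection before fitting the downstream estimator turns the confusing late exception into an obvious early message.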