guillermo-navas-palencia / optbinning

Optimal binning: monotonic binning with constraints. Support batch & stream optimal binning. Scorecard modelling and counterfactual explanations.
http://gnpalencia.org/optbinning/
Apache License 2.0
435 stars 98 forks

Binning with Equal Counts #238

Closed rakshitrao99 closed 1 year ago

rakshitrao99 commented 1 year ago

Hi!

Is there any way to fix the number of bins so that each bin contains the same number of observations (commonly known as equal-frequency binning)? The number of bins should be user-defined, the same as what we normally do in pandas with the cut and qcut functions.

guillermo-navas-palencia commented 1 year ago

Hi @rakshitrao99.

You should be able to produce such a binning by setting prebinning_method="quantile" and monotonic_trend=None.
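For reference, here is a minimal sketch of what equal-frequency cut points look like for a user-defined number of bins. It uses only the Python standard library and is not optbinning's implementation; the helper names equal_frequency_edges and assign_bins and the synthetic data are assumptions made for illustration. In optbinning, the number of quantile pre-bins is bounded by the max_n_prebins parameter, and the optimizer may still merge pre-bins afterwards, so the final bin count is not guaranteed to match.

```python
import bisect
import random
import statistics


def equal_frequency_edges(values, n_bins):
    """Return the n_bins - 1 interior cut points that split `values`
    into bins holding approximately the same number of observations."""
    return statistics.quantiles(values, n=n_bins, method="inclusive")


def assign_bins(values, edges):
    """Map each value to a bin index in 0..len(edges).

    Bin 0 is (-inf, edges[0]), bin i is [edges[i-1], edges[i]),
    and the last bin is [edges[-1], inf).
    """
    return [bisect.bisect_right(edges, v) for v in values]


# Synthetic data, for illustration only.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(1000)]

n_bins = 10
edges = equal_frequency_edges(x, n_bins)
bins = assign_bins(x, edges)
counts = [bins.count(i) for i in range(n_bins)]

print("cut points:", [round(e, 3) for e in edges])
print("counts per bin:", counts)
```

With continuous data and few ties, the counts per bin should be nearly identical. Heavy ties or a large share of missing values can make some cut points coincide, which reduces the number of distinct bins.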

rakshitrao99 commented 1 year ago

Hi! Is there any way to define the number of bins? I am applying the strategy you mentioned, but I still get only one bin, (-inf, inf), whereas I want 10 bins. This is the binning table I get:

| Bin | Count | Count (%) | Non-event | Event | Event rate | WoE | IV | JS |
| -- | -- | -- | -- | -- | -- | -- | -- | -- |
| (-inf, inf) | 244484 | 0.337092 | 116218 | 128266 | 0.524640 | -0.802711 | 0.236761 | 0.028825 |
| Special | 0 | 0.000000 | 0 | 0 | 0.000000 | 0.0 | 0.000000 | 0.000000 |
| Missing | 480789 | 0.662908 | 369055 | 111734 | 0.232397 | 0.490752 | 0.144748 | 0.017914 |
|  | 725273 | 1.000000 | 485273 | 240000 | 0.330910 |  | 0.381509 | 0.046739 |

Thanks for the help!!!

guillermo-navas-palencia commented 1 year ago

Could you provide data or code to reproduce it?

guillermo-navas-palencia commented 1 year ago

Re-open if you can provide a reproducible example. Thanks.