guillermo-navas-palencia / optbinning

Optimal binning: monotonic binning with constraints. Support batch & stream optimal binning. Scorecard modelling and counterfactual explanations.
http://gnpalencia.org/optbinning/
Apache License 2.0
434 stars, 98 forks

Feature Request: 2D binning when one of the features is missing #269

Open phruekc opened 9 months ago

phruekc commented 9 months ago

Hi there,

As far as I've tested, if we fit OptimalBinning2D and one of the features (say, feat_x) is missing, the sample currently goes straight to the {"Bin x": "Missing", "Bin y": "Missing"} row of the binning table. However, such observations could carry additional information if BinningTable2D supported a layout like the following:

|       Bin x |       Bin y | ...
|-------------|-------------|----
| (-inf, 100) | (-inf, 100) | ...
| [ 100, 200) | (-inf, 100) | ...
| [ 200, inf) | (-inf, 100) | ...
|     Missing | (-inf, 100) | ...
| (-inf, 100) | [ 100, 200) | ...
| [ 100, 200) | [ 100, 200) | ...
| [ 200, inf) | [ 100, 200) | ...
|     Missing | [ 100, 200) | ...
| (-inf, 100) | [ 200, inf) | ...
| [ 100, 200) | [ 200, inf) | ...
| [ 200, inf) | [ 200, inf) | ...
|     Missing | [ 200, inf) | ...
| (-inf, 100) |     Missing | ...
| [ 100, 200) |     Missing | ...
| [ 200, inf) |     Missing | ...
|     Special |     Special | ...
|     Missing |     Missing | ...
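
To make the requested layout concrete, here is a minimal sketch in plain Python that assigns each sample to a (Bin x, Bin y) pair and keeps "Missing" separate per axis. It does not use or modify any optbinning API. The split points (100 and 200) are simply taken from the table above, and the helper names `bin_label` and `binning_table_2d` are hypothetical. Special values are left out for brevity.

```python
import math
from bisect import bisect_right
from collections import defaultdict

# Hypothetical split points, taken from the example table above.
SPLITS_X = [100, 200]
SPLITS_Y = [100, 200]


def bin_label(value, splits):
    """Return the interval label for value, or "Missing" for None/NaN."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "Missing"
    edges = [-math.inf] + list(splits) + [math.inf]
    # Intervals are left-closed, so a value equal to a split goes to the right bin.
    i = bisect_right(splits, value)
    left = "(" if i == 0 else "["
    return f"{left}{edges[i]}, {edges[i + 1]})"


def binning_table_2d(xs, ys, targets):
    """Count records and events per (Bin x, Bin y) pair, with per-axis Missing."""
    table = defaultdict(lambda: {"count": 0, "event": 0})
    for x, y, t in zip(xs, ys, targets):
        key = (bin_label(x, SPLITS_X), bin_label(y, SPLITS_Y))
        table[key]["count"] += 1
        table[key]["event"] += t
    return dict(table)


# Toy data: some samples miss only x, some only y, one misses both.
xs = [50, None, 150, float("nan"), None]
ys = [250, 50, None, 120, None]
targets = [0, 1, 1, 0, 1]

for (bx, by), stats in binning_table_2d(xs, ys, targets).items():
    print(f"| {bx:>11} | {by:>11} | {stats['count']} | {stats['event']} |")
```

With this grouping, a sample that is missing only feat_x still lands in the row for its actual Bin y, and only samples missing both features end up in the Missing/Missing row.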

I'm not sure whether this can already be done.

Big thanks,

P.S. I love OptBinning very much; it really makes my life easier. :)

guillermo-navas-palencia commented 9 months ago

Hi @phruekc. Thanks for your proposal. Feel free to develop such a feature if you find the time :). Happy to review a PR.