guillermo-navas-palencia / optbinning

Optimal binning: monotonic binning with constraints. Supports batch and stream optimal binning, scorecard modelling, and counterfactual explanations.
http://gnpalencia.org/optbinning/
Apache License 2.0

Get IV of each feature after applying MulticlassOptimalBinning in multiclass dataset (>2 labels) #290

Closed juyjuyy closed 5 months ago

juyjuyy commented 7 months ago

My dataset has three labels, so I applied MulticlassOptimalBinning for the binning step and then built the binning_table. The analysis report of the binning_table contains many values, but I could not understand the quality_score value: the formula in the paper "Optimal binning: mathematical programming formulation" defines it with constraints involving the IV variable, yet the MulticlassOptimalBinning class has no property or method for accessing the IV in a multiclass classification problem.

Here is a screenshot of my code:

[Screenshot 2023-12-10 161739]

And screenshots of the formula in the paper:

[Screenshot 2023-12-10 153721]

[Screenshot 2023-12-10 153732]

guillermo-navas-palencia commented 7 months ago

Hi @juyjuyy.

Note that the binning quality score for a multiclass target is slightly different: https://github.com/guillermo-navas-palencia/optbinning/blob/master/optbinning/binning/metrics.py#L347. It replaces the IV with the normalized Jensen–Shannon divergence. The js property can be retrieved from the multiclass binning table: https://github.com/guillermo-navas-palencia/optbinning/blob/master/optbinning/binning/metrics.py#L347.

juyjuyy commented 6 months ago

Thank you for your response @guillermo-navas-palencia. After running the analysis, I can see the properties of binning_table, but I still don't get it. I'm not very good at math, so I don't understand the difference between the normalized Jensen–Shannon divergence used for js and the Jeffrey divergence used for iv.

juyjuyy commented 6 months ago

Can you suggest some documents related to the problem I have? @guillermo-navas-palencia

guillermo-navas-palencia commented 6 months ago

I use the Jensen–Shannon divergence for both binary and multiclass targets. See https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence.

Yes, you can rank features by JS. Both are divergence measures.

juyjuyy commented 5 months ago

Thank you for your response @guillermo-navas-palencia ,

  1. I wonder why both the IV and the JS values exist in the binning table for binary classification, while for multiclass classification only the JS exists. Could you explain the theory behind that? Is the IV not used for multiclass problems?
  2. For the IV I can use the rule of thumb for feature selection, e.g. an IV above 0.1 indicates a strong predictor. If the JS is the value used for ranking instead, how do I know which threshold to use for feature selection? Please help me, I really need your reply.
guillermo-navas-palencia commented 5 months ago
  1. The IV is a divergence measure suitable only for a binary target. The JS divergence generalizes the IV, allowing multiple categories (the multiclass problem).
  2. The IV, unlike the JS, is unbounded. I have found experimentally that the IV is commonly 5-10 times larger than the JS for a binary target. Therefore, a value above 0.02 might work, although I suppose that depends on the number of classes. This is a problem I haven't investigated.