Closed combiz closed 5 months ago
Hi @combiz.
The transform method has a couple of parameters you can have a look at, see: https://gnpalencia.org/optbinning/binning_binary.html#optbinning.OptimalBinning.transform. The parameters are metric_special and metric_missing; by default both are set to 0. To use the actual WoE values for special codes, just set metric_special="empirical". The default value is not automatically set to "empirical" because it might produce infinite IV if there are no special or missing values.
I'm closing this issue; please re-open it if the explanation was unclear.
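One way to read the remark about non-finite values is to compute WoE for a single bin by hand. The sketch below is not optbinning code: the helper name woe and all counts are hypothetical, and it assumes the common convention WoE = ln(share of non-events / share of events). It only illustrates that a populated special bin has an ordinary finite WoE, while a bin with no events or no records at all does not.

```python
import math


def woe(n_nonevent_bin, n_event_bin, n_nonevent_total, n_event_total):
    """WoE of one bin, assumed here to be ln(share of non-events / share of events)."""
    if n_nonevent_bin == 0 and n_event_bin == 0:
        return float("nan")  # empty bin: the ratio 0/0 is undefined
    if n_event_bin == 0:
        return math.inf      # no events in the bin: the log diverges upwards
    if n_nonevent_bin == 0:
        return -math.inf     # no non-events in the bin: the log diverges downwards
    share_nonevent = n_nonevent_bin / n_nonevent_total
    share_event = n_event_bin / n_event_total
    return math.log(share_nonevent / share_event)


# Hypothetical totals: 270 non-events and 330 events overall.
populated = woe(20, 80, 270, 330)  # special bin holding 100 records
one_sided = woe(5, 0, 270, 330)    # special bin with non-events only
empty = woe(0, 0, 270, 330)        # no special values in the data

print(populated, one_sided, empty)
```

A fixed default of 0 sidesteps the last two cases, at the cost of hiding the finite value in the first case unless "empirical" is requested explicitly.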
Firstly, thank you for the fantastic package.
I've noticed a bug with special codes: binner.transform returns a WoE value of 0 for special-code values when metric = "woe", while the non-special-code bins are transformed correctly. This can go undetected because the special code is handled correctly elsewhere. For example, with metric = "bins" the transformation returns the correct bins, including those for special codes; binner.binning_table.build() correctly shows the bins with their correct (non-zero) WoE values; and binning_table.plot() shows correct bin assignments and WoE values. E.g.
pd.DataFrame(binners["FEATURE"].transform(df["FEATURE"], metric = "woe")).value_counts()
This is despite the rows that receive WoE = 0 being correctly assigned to the special-code bin by the equivalent command
pd.DataFrame(binners["FEATURE"].transform(df["FEATURE"], metric = "bins")).value_counts()
and despite the binning_table showing that the correct WoE for this special code is non-zero. Apologies, I don't currently have the bandwidth for a full reprex, but hopefully this helps.