Closed by dmitry-lesnik 7 months ago
Hi @dmitry-lesnik.
First, this seems to be a scikit-learn issue (KBinsDiscretizer is used under the hood).
Second, the default prebinning_method is more robust and convenient, and I recommend using "quantile" only when the number of distinct values in x is large.
If X is integer-valued, the "quantile" method fails to identify the upper bin.
In the code below there should be 3 perfect bins, corresponding to X values 0, 1 and 2. However, the method merges bins "1" and "2". Making X float and shifting one of the largest values by 1e-7 fixes the issue (but this is a hack, not a solution).
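
For illustration, here is a minimal NumPy-only sketch of one mechanism that could produce the behaviour described above. It is neither the original reproduction nor the library's actual code. It assumes that the quantile edges are computed with `np.percentile`, that duplicated edges are dropped, and that only the interior edges are used as split points. The helper name `quantile_splits`, the value counts and the choice of 20 quantile bins are all made up for this example.

```python
import numpy as np


def quantile_splits(x, n_bins=20):
    """Hypothetical helper: interior quantile edges used as split points."""
    edges = np.percentile(x, np.linspace(0, 100, n_bins + 1))
    edges = np.unique(edges)  # collapse duplicated edges
    return edges[1:-1]        # the min and max edges are not split points


# Integer-valued feature with three distinct values (made-up counts).
x_int = np.repeat([0, 1, 2], [32, 31, 38])

# With integer data the largest value coincides with the last edge,
# so no split point separates 1 from 2.
splits_int = quantile_splits(x_int)
bins_int = np.digitize(np.array([0, 1, 2]), splits_int)

# The workaround from the report: cast to float and shift one of the
# largest values by 1e-7, so that 2 becomes an interior edge.
x_shifted = x_int.astype(float)
x_shifted[-1] += 1e-7
splits_shifted = quantile_splits(x_shifted)
bins_shifted = np.digitize(np.array([0.0, 1.0, 2.0]), splits_shifted)

print("integer X splits:", splits_int, "bin of 0, 1, 2:", bins_int)
print("shifted X splits:", splits_shifted, "bin of 0, 1, 2:", bins_shifted)
```

Under these assumptions the integer data yields a single split point, so values 1 and 2 share a bin, while the shifted data yields two split points and three bins. This is consistent with the report, but it does not confirm that the library follows exactly this path.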