arogozhnikov / hep_ml

Machine Learning for High Energy Physics.
https://arogozhnikov.github.io/hep_ml/

Strange values of bin edges #45

Open adendek opened 7 years ago

adendek commented 7 years ago

I found very strange behaviour in how the `LookupClassifier` computes bin edges. The issue occurs only for integer-typed features.
Here are example feature edges: {"seed_nbIT", {0.0,0.0,0.0,}}, {"seed_nLayers", {11.0,12.0,12.0,}}. I attached the feature distribution plots.

arogozhnikov commented 7 years ago

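Well, it is bad, but expected behavior. At the moment the bin edges are computed as weighted quantiles, which does not work well for non-continuous columns: when many events share the same value, several quantiles land on that value and the edges coincide.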
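
To illustrate why quantile-based edges collapse on an integer column, here is a minimal pure-Python sketch. It is not the hep_ml implementation: the helper `weighted_quantile_edges` and the sample data are hypothetical, chosen only to reproduce the kind of repeated edges reported above.

```python
def weighted_quantile_edges(values, weights, n_bins):
    """Return n_bins - 1 interior edges at equally spaced weighted quantiles.

    Hypothetical helper for illustration; not the code used inside hep_ml.
    """
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    edges = []
    cumulative = 0.0
    i = 0
    for k in range(1, n_bins):
        target = total * k / n_bins
        # Advance until the cumulative weight reaches the target quantile.
        while i < len(pairs) - 1 and cumulative + pairs[i][1] < target:
            cumulative += pairs[i][1]
            i += 1
        edges.append(float(pairs[i][0]))
    return edges


# Hypothetical integer feature: 95% of the events have the value 0.
values = [0] * 95 + [1] * 3 + [2] * 2
weights = [1] * len(values)

edges = weighted_quantile_edges(values, weights, n_bins=4)
print(edges)  # [0.0, 0.0, 0.0]
```

The 25%, 50% and 75% quantiles all fall inside the block of zeros, so every edge is 0.0 and the binning carries no information about the feature.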

You can explicitly pass the edges to use for this variable in n_bins.
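
A minimal sketch of building such edges by hand, assuming `n_bins` accepts a dict that maps a feature name to a list of edges (please check the `LookupClassifier` docstring for the exact accepted forms). The helper `integer_bin_edges` and the sample values are hypothetical.

```python
def integer_bin_edges(values):
    """Place one edge halfway between each pair of neighbouring distinct values.

    Hypothetical helper for illustration; not part of hep_ml.
    """
    distinct = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]


# Hypothetical sample of an integer-valued feature.
seed_nLayers = [10, 11, 11, 12, 12, 12, 12]

n_bins = {"seed_nLayers": integer_bin_edges(seed_nLayers)}
print(n_bins)  # {'seed_nLayers': [10.5, 11.5]}

# Assumed usage (not executed here); other features may need entries too:
# from hep_ml.speedup import LookupClassifier
# classifier = LookupClassifier(base_estimator=..., n_bins=n_bins)
```

With half-integer edges every distinct value gets its own bin, regardless of how unevenly the values are populated.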