scikit-hep / histbook

Versatile, high-performance histogram toolkit for Numpy.
BSD 3-Clause "New" or "Revised" License

User defined binning #40

Closed marinang closed 6 years ago

marinang commented 6 years ago

Hi Jim, is custom binning on the list for the next releases? Matt

jpivarski commented 6 years ago

If you mean irregular binning (an increasing but otherwise unconstrained sequence of bin edges), that's the split axis type. Replace bin with split and pass the appropriate arguments. There are examples in the README tutorial.

If you actually meant something else, reopen this issue and explain in more detail. Thanks!
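To make "irregular binning" concrete, here is a minimal sketch that uses plain NumPy rather than histbook's own API (the README tutorial remains the reference for the split axis). The edge values and the sample data are assumptions chosen only for illustration.

```python
import numpy as np

# Irregular bin edges: strictly increasing, otherwise unconstrained.
# These particular values are illustrative only.
edges = np.array([0.0, 1.0, 2.0, 5.0, 10.0, 50.0])

# Hypothetical sample data.
rng = np.random.default_rng(seed=0)
data = rng.exponential(scale=5.0, size=1000)

# np.histogram accepts an explicit edge sequence, so the bins
# may have different widths.
counts, _ = np.histogram(data, bins=edges)

# Values outside [edges[0], edges[-1]] are silently dropped;
# the last bin includes its right edge.
in_range = np.count_nonzero((data >= edges[0]) & (data <= edges[-1]))

for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"[{lo:5.1f}, {hi:5.1f}): {n}")
```

With five bins of unequal width, the counts sum to the number of entries that fall inside the overall range.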

eduardo-rodrigues commented 6 years ago

BTW, @marinang, if you need an algorithm to find the best binning for you, then try https://github.com/scikit-hep/scikit-hep/blob/master/skhep/modeling/bayesian_blocks.py.

marinang commented 6 years ago

Thanks @jpivarski. @eduardo-rodrigues yes, this is why I asked, actually. I wanted to use it :D

jpivarski commented 6 years ago

I see. Adaptive binning is not on the histbook roadmap because I don't know of any adaptive-binning algorithms that are associative: that is, algorithms for which you can divide the input data up arbitrarily, adaptively bin and fill each subsample, and then combine the results so that it does not matter how the input was divided into subsamples.

This bias of mine is inherited from Histogrammar, which focused on parallel processing first; small, sequentially filled histograms were an afterthought. With histbook, the latter may be more relevant. Would it be useful to have histbook integrate with the Bayesian Blocks package for strictly local, sequential filling? I'm not sure how that would interact with the n-dimensional nature of the Hist class.
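The associativity concern above can be illustrated with a small NumPy sketch. It is not histbook or Bayesian Blocks code: quantile-based edges are used here only as a simple stand-in for an adaptive binning algorithm, and the helper name quantile_edges, the sample data, and the four-way split are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
data = rng.normal(size=10_000)

# Divide the input arbitrarily into subsamples, as a parallel job would.
parts = np.array_split(data, 4)

# Fixed edges: fill each subsample separately, then add the counts.
fixed_edges = np.linspace(-5.0, 5.0, 21)
whole_fixed, _ = np.histogram(data, bins=fixed_edges)
merged_fixed = sum(np.histogram(p, bins=fixed_edges)[0] for p in parts)
fixed_is_associative = np.array_equal(whole_fixed, merged_fixed)

# Stand-in for adaptive binning (hypothetical helper): edges are
# chosen from the data itself, here as equal-population quantiles.
def quantile_edges(x, nbins=10):
    return np.quantile(x, np.linspace(0.0, 1.0, nbins + 1))

whole_edges = quantile_edges(data)
part_edges = [quantile_edges(p) for p in parts]

# Each subsample picks its own edges, so the per-part histograms
# do not share a binning and cannot simply be summed.
edges_agree = all(np.allclose(e, whole_edges) for e in part_edges)

print("fixed edges combine exactly:", fixed_is_associative)
print("adaptive edges agree across subsamples:", edges_agree)
```

With fixed edges the merged counts match the single-pass result regardless of how the data was split. With data-driven edges each subsample ends up with a different binning, which is why combining such results would need extra machinery such as rebinning or a second pass over the data.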