Open LovelyBuggies opened 4 years ago
SciPy has few dependencies and could provide analysis methods other than fitting, such as integration (though I am not sure whether they are of use for HEP). The caveats are: 1) it might not be as specialized as GooFit; 2) using a Scikit-HEP package might fit the HEP ecosystem better.
hi @LovelyBuggies if you would like to have a histogram-based statistics model, https://github.com/scikit-hep/pyhf might be interesting and only depends on scipy + numpy
@lukasheinrich Thanks for your suggestion, I will dive into pyhf and see whether it suits the functionality in hist.
These are two separate issues: shortcuts for easy interaction, and adaptors/integration into other packages (which could also be called shortcuts). In general, we should be able to implement some or many of them without adding a dependency on the package, though we will have to be careful when we do.
I think we should focus on how to "feed" our histograms to these other packages. Maybe come up with a standard histogram API? Then boost-histogram (and maybe others, like Physt) could also support it.
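To make the "standard histogram API" idea concrete, here is a minimal sketch of what such a protocol could look like. The names `HistogramLike`, `counts`, `edges`, and `ToyHistogram` are hypothetical and not taken from any existing package; the point is only that a consumer can accept any object with this shape without importing the library that produced it.

```python
from typing import List, Protocol, Sequence, runtime_checkable


@runtime_checkable
class HistogramLike(Protocol):
    """Hypothetical minimal interface a histogram could expose to other packages."""

    def counts(self) -> Sequence[float]:
        """Bin contents, one value per bin."""
        ...

    def edges(self) -> Sequence[float]:
        """Bin edges, one more entry than there are bins."""
        ...


class ToyHistogram:
    """Stand-in 1D histogram used only to exercise the protocol."""

    def __init__(self, counts: List[float], edges: List[float]) -> None:
        if len(edges) != len(counts) + 1:
            raise ValueError("need exactly one more edge than bins")
        self._counts = counts
        self._edges = edges

    def counts(self) -> List[float]:
        return list(self._counts)

    def edges(self) -> List[float]:
        return list(self._edges)


def integral(h: HistogramLike) -> float:
    """Example consumer: sum of counts times bin widths, using only the protocol."""
    edges = h.edges()
    return sum(c * (hi - lo) for c, lo, hi in zip(h.counts(), edges[:-1], edges[1:]))
```

Any library whose histogram class provides these two methods would work with `integral` unchanged, which is the property a shared API would be meant to guarantee.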
One thing that might be important for all but the most simple clients is feeding a structured set of histograms. I started some work along those lines with @jpivarski with histbook and the idea of a "book" / nest-able structure of histograms would be useful. cc @matthewfeickert @kratsg
@lukasheinrich An initiative concerning 'nest' was put forward here.
What exactly is the problem with iminuit's interface? What is not pythonic enough about it? iminuit has little in common with the interface of C++ MINUIT, it is pretty pythonic already.
Besides, if you like scipy.optimize.minimize, you may also like https://iminuit.readthedocs.io/en/latest/reference.html#iminuit.minimize
@lukasheinrich boost-histogram supports integer and category axes, which can be used to bundle histograms together. I use these axes to have a common histogram with signal, background, different data subsets, etc. What can histbook do that boost-histogram with these axes cannot do?
@LovelyBuggies I disagree with your initial list of "shortcomings". GPU support is not a problem, it is a feature. Any package that supports the GPU should also fall back to CPU computing when GPUs are not available, of course, like numba and jax.
I hope you got from my previous comment that we cannot replace iminuit with scipy.optimize.
"We expect a less dependent, more pythonic solution for common use." Having well-justified dependencies is ok if they can be loaded from PyPI and installed automatically. jax and jupyter are high-quality software and they depend on a gazillion other packages.
@HDembinski Thanks for the correction! Looks like I misunderstood them: integrating iminuit into Hist is feasible and reasonable.
@HDembinski Yes, some of these axis types are perfectly suitable. Would 'jagged' data work as well? Consider this case: two phase-space regions, where one has [data, bkg] histograms with 10 bins and the other has [data, signal, bkg] histograms with 5 bins:
                2 event categories
                 /              \
                /                \
  2 samples   / |              / | \   3 samples
             /  |             /  |  \
  10 bins   |   |            |   |   |   5 bins
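The structure in this diagram can be written down as a nested mapping. This is only a sketch with hypothetical region names and plain lists standing in for histograms; it shows why the case is 'jagged': the two regions differ both in the number of samples and in the number of bins, so the whole thing does not fit into one rectangular array.

```python
# Hypothetical "book": region -> sample -> bin counts (plain lists).
# Regions may differ both in which samples they hold and in how many bins.
book = {
    "region_A": {  # 2 samples, 10 bins each
        "data": [0] * 10,
        "bkg": [0] * 10,
    },
    "region_B": {  # 3 samples, 5 bins each
        "data": [0] * 5,
        "signal": [0] * 5,
        "bkg": [0] * 5,
    },
}


def shape(book):
    """Return {region: (number of samples, number of bins)}."""
    out = {}
    for region, samples in book.items():
        n_bins = {len(counts) for counts in samples.values()}
        if len(n_bins) != 1:
            raise ValueError(f"samples in {region!r} disagree on binning")
        out[region] = (len(samples), n_bins.pop())
    return out


print(shape(book))
```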
@henryiii I made a few attempts and put together a new demo on this topic HERE :)
We can encapsulate the work into functions like h.to_numpy(), e.g., h.to_aghast(), h.to_mplhep(), h.to_root(), etc.
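One way such `to_*` shortcuts could avoid adding hard dependencies is to import the target package only inside the method that needs it. The sketch below uses a hypothetical `ToyHist` class and a hypothetical `to_plain()` helper; it is not the actual hist or boost-histogram implementation.

```python
import importlib


class ToyHist:
    """Hypothetical 1D histogram; not the real hist or boost-histogram class."""

    def __init__(self, counts, edges):
        self.counts = list(counts)
        self.edges = list(edges)

    def to_plain(self):
        """Dependency-free export: (counts, edges) as Python lists."""
        return list(self.counts), list(self.edges)

    def to_numpy(self):
        """Export as NumPy arrays; numpy is imported only when this is called."""
        try:
            np = importlib.import_module("numpy")
        except ImportError as err:
            raise ImportError("to_numpy() requires numpy to be installed") from err
        return np.asarray(self.counts), np.asarray(self.edges)
```

With this layout, a user who never calls `to_numpy()` never needs numpy, and the same pattern would apply to any other optional target package.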
@henryiii We are going to add some shortcuts for analysis. Could you please specify which kinds of analysis are needed? And which tools or packages do you think are suitable?
The fitting tools above have drawbacks: they are GPU-oriented, C++-based, and rely on external dependencies. We expect a less dependent, more pythonic solution for common use. I recommend SciPy. Its optimize module gives us the flexibility to handle fitting and other data-analysis tasks (though it may not perform as well as more specialized solutions for tasks like maximum-likelihood fits).
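As a rough illustration of what a SciPy-based fit of binned data could look like, here is a sketch using `scipy.optimize.curve_fit` on NumPy histogram output. The toy data, the Gaussian model, and the starting values are assumptions made for the example. Note that this is a plain unweighted least-squares fit on the bin counts, not a likelihood fit.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amplitude, mu, sigma):
    """Unnormalized Gaussian evaluated at the bin centers."""
    return amplitude * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Toy data: seeded so the example is reproducible
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.5, scale=1.2, size=10_000)

counts, edges = np.histogram(samples, bins=40, range=(-5.0, 5.0))
centers = 0.5 * (edges[:-1] + edges[1:])

# Start from the tallest bin; unweighted least squares on the counts
p0 = [counts.max(), centers[counts.argmax()], 1.0]
popt, pcov = curve_fit(gaussian, centers, counts, p0=p0)
amplitude, mu, sigma = popt
print(f"mu = {mu:.2f}, sigma = {abs(sigma):.2f}")
```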
In addition to this, it is not clear whether our shortcuts should include classification, regression, clustering, etc. (I did not find any questions on the channel.) If yes, scikit-learn could be a wonderful solution.