There are four different use cases that need to be covered, namely the combinations of:
binned pdfs (e.g. from a template) and unbinned (continuous) pdfs
binned data and unbinned data
The focus is on two of these use cases:
unbinned data with unbinned pdfs (what is currently in zfit): this is the classical use case and is preferred for any analysis with an analytic model that is not high-statistics.
binned data with binned pdfs: the typical use case for large experiments such as ATLAS and CMS. This can be implemented highly efficiently if a binned pdf and a binned dataset (i.e. two arrays of counts) can be assumed. Any conversion from a continuous pdf to a binned representation would significantly slow down the computation.
Therefore, two kinds of models and datasets are needed: unbinned and binned. This allows the above cases to be implemented independently and highly efficiently. Conversion methods should be provided to convert e.g. a binned to an unbinned pdf, but their efficiency may be comparatively low.
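To make the "two arrays of counts" picture concrete, here is a minimal sketch in plain numpy. It is not zfit API: the function names `binned_poisson_nll` and `bin_gaussian`, the Gaussian model, and the example counts are all assumptions chosen for illustration. The first function is the cheap binned case (one pass over two arrays); the second shows the extra integration step that a conversion from a continuous pdf requires for every evaluation.

```python
import math

import numpy as np


def binned_poisson_nll(observed, expected):
    """Poisson negative log-likelihood summed over bins.

    The log(n!) term is dropped because it does not depend on the model.
    """
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return float(np.sum(expected - observed * np.log(expected)))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bin_gaussian(edges, mu, sigma, total_yield):
    """Convert a continuous Gaussian pdf to expected counts per bin.

    Each bin content is the integral of the pdf over the bin, obtained
    here from differences of the CDF at the bin edges.
    """
    cdf = np.array([normal_cdf(edge, mu, sigma) for edge in edges])
    return total_yield * np.diff(cdf)


# Illustrative inputs (hypothetical numbers, not from a real analysis).
edges = np.linspace(-3.0, 3.0, 7)  # 6 bins of width 1
observed = np.array([20, 140, 340, 345, 135, 20])

expected = bin_gaussian(edges, mu=0.0, sigma=1.0, total_yield=1000.0)
nll_true = binned_poisson_nll(observed, expected)
nll_shifted = binned_poisson_nll(
    observed, bin_gaussian(edges, mu=1.0, sigma=1.0, total_yield=1000.0)
)
```

In a pure binned-binned fit only `binned_poisson_nll` runs inside the minimization loop. With a converted continuous pdf, the integration in `bin_gaussian` is repeated at every parameter change, which is where the slowdown comes from.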
Vector Parameters
Furthermore, unbinned fits with parametric models typically contain at most about 100 parameters, so a single object for each parameter is reasonable. However, for the binned case, a vectorized parameter (as other libraries use for this case) is necessary; this is discussed in https://github.com/zfit/zfit-development/issues/44
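The difference between the two parameter styles can be sketched as follows. The classes `ScalarParam` and `VectorParam` are hypothetical stand-ins, not zfit classes and not the design from the linked issue; the per-bin scale factors and template counts are likewise invented for illustration.

```python
import numpy as np


class ScalarParam:
    """Hypothetical parameter object holding a single value."""

    def __init__(self, name, value):
        self.name = name
        self.value = float(value)


class VectorParam:
    """Hypothetical parameter object holding one value per bin."""

    def __init__(self, name, values):
        self.name = name
        self.values = np.asarray(values, dtype=float)


# Template counts for a binned model (made-up numbers).
template = np.array([120.0, 340.0, 210.0, 80.0])

# One object per bin: a Python-level loop over parameter objects.
per_bin = [ScalarParam(f"gamma_{i}", 1.1) for i in range(len(template))]
expected_loop = np.array([p.value * t for p, t in zip(per_bin, template)])

# One vectorized parameter: a single array operation.
gamma = VectorParam("gamma", np.full_like(template, 1.1))
expected_vec = gamma.values * template
```

Both give the same expected counts, but the vectorized form keeps the number of Python objects constant as the number of bins grows and maps directly onto array operations.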
Limitations/Assumptions
The binning is fixed and defined with numpy. Adaptive binning (with TF) can be added later and used wherever any binning is accepted.
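As an illustration of what a fixed numpy binning could look like, the sketch below defines the bin edges once as a numpy array and uses them to convert an unbinned sample into counts. The edges, the sample, and the variable names are assumptions for this example only.

```python
import numpy as np

# Fixed binning: edges are defined once and never change during the fit.
edges = np.linspace(0.0, 10.0, 6)  # 5 equal-width bins on [0, 10]

# Hypothetical unbinned dataset.
rng = np.random.default_rng(42)
unbinned = rng.uniform(0.0, 10.0, size=500)

# Unbinned -> binned conversion with the fixed edges.
counts, used_edges = np.histogram(unbinned, bins=edges)
```

Because the edges are a plain array fixed up front, the same `edges` object can be shared between the dataset and the model, so both are guaranteed to use identical bins.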
The strategy for binned fits is discussed here.
Goals and reasoning