Open jonas-eschle opened 5 years ago
About the conversion of float to ComposedParameter: wouldn't it be confusing for the user? A ComposedParameter is basically any kind of tensor (whereas floats are just a "special kind" of tensor). But we could have a FixedParameter that better reflects that behavior. Or a flag: guarantee_const or similar.
Some kind of conversion should happen though, since it makes dependency handling much easier: we can then rely on the fact that any parameter is a ZfitParameter object and has e.g. a name, dependents, etc.
Would you prefer a pure number? Or how would you suggest making the autoconversion clearer to the user? A well-named class like FixedParameter or AutoParameter, or similar?
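To illustrate, a minimal sketch of how such an autoconversion could look. All names here (ZfitParameter as a plain base class, FixedParameter, convert_to_parameter) are hypothetical stand-ins for the discussion, not zfit's actual API:

```python
# Hypothetical sketch: floats given in place of a parameter are wrapped
# in a FixedParameter, so every parameter is a ZfitParameter-like object
# with a name and dependents. Not zfit's real implementation.

class ZfitParameter:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def get_dependents(self):
        return {self}

class Parameter(ZfitParameter):
    floating = True   # can be varied by a minimizer

class FixedParameter(ZfitParameter):
    floating = False  # guaranteed constant, cheap for dependency handling

def convert_to_parameter(value, name="autoparam"):
    """Return parameters unchanged; wrap bare floats in a FixedParameter."""
    if isinstance(value, ZfitParameter):
        return value
    return FixedParameter(name, float(value))

mu = convert_to_parameter(1.2)
print(type(mu).__name__, mu.floating)  # FixedParameter False
```

The point of the fixed/guaranteed-const variant is that dependency handling can rely on a uniform interface without paying the cost of a floatable variable.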
I am not against a conversion. ComposedParameter might be worrisome ("composed with what is my parameter?!"), while FixedParameter or AutoParameter are clear names.
Otherwise, for norm range as part of a pdf, Strong distinction between space and norm_range, and SumPDF does not extend automatically, I agree with the proposed changes.
For SumPDF does not extend automatically: if the user provides n fractions, they have to make sure these sum up to 1.0.
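A small sketch of that rule, with a hypothetical check_fracs helper (not zfit's actual validation): with n fracs the user guarantees the sum, with n-1 fracs the last one is implied:

```python
# Hypothetical validation helper for the "n or n-1 fracs" rule.
import math

def check_fracs(fracs, n_pdfs):
    if len(fracs) == n_pdfs:
        # n fracs given: the user is responsible for normalization,
        # so only validate that they sum to 1.0.
        if not math.isclose(sum(fracs), 1.0, rel_tol=1e-9):
            raise ValueError("n fracs given, but they do not sum to 1.0")
        return list(fracs)
    if len(fracs) == n_pdfs - 1:
        # n-1 fracs given: the last fraction is implied.
        return list(fracs) + [1.0 - sum(fracs)]
    raise ValueError("need n or n-1 fracs")

print(check_fracs([0.2, 0.3], 3))  # [0.2, 0.3, 0.5]
```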
Perfect, fully agree!
Following is a proposal for breaking changes. Mostly thoughts, but realistic ones. Feel free to comment, but it's not urgent.
Regarding the API, restrictions and so on, my general idea is to be a little bit more pythonic. Or: be less restrictive in what you can do (e.g. different norm ranges for different pdfs inside a sum), but be more consistent with the examples, so that there is one way to do things (e.g. always create a Space with obs and limits and use it for both pdf and data).
This means that more advanced use cases can be covered easily, while simple things will still be done right (as copy-pasting and writing similar code is expected to cover a lot of the simpler cases anyway).
"chances" are my view of how likely it is that something really gets implemented (how much I am personally convinced).
Implemented:
norm range as part of a pdf
chances: 98% idea: treat norm_range as a "fixed parameter" of a pdf (which it actually is, the normalization factor).
consequences:
- no use of norm_range in pdf.pdf(x) anymore; exclusively use set_norm_range instead.
- A SumPDF's daughters can have different norm_ranges. Using set_norm_range(norm_range, propagate=True) allows (by default True?) to also set the norm_ranges of the daughters.
- in the loss, data_range and norm_range don't have to agree. fit_range could be used to force the same range for both (if that is even still needed).
- norm_range is an additional argument to __init__ (though it defaults to space).
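A toy sketch of how set_norm_range with a propagate flag could behave (simplified stand-in classes, not zfit's implementation):

```python
# Hypothetical sketch: daughters of a SumPDF may keep their own
# norm_ranges; propagation to the daughters is controlled by a flag.

class PDF:
    def __init__(self, norm_range=None):
        self.norm_range = norm_range

    def set_norm_range(self, norm_range, propagate=True):
        self.norm_range = norm_range

class SumPDF(PDF):
    def __init__(self, pdfs, norm_range=None):
        super().__init__(norm_range)
        self.pdfs = pdfs

    def set_norm_range(self, norm_range, propagate=True):
        self.norm_range = norm_range
        if propagate:  # also set the daughters' norm_ranges
            for pdf in self.pdfs:
                pdf.set_norm_range(norm_range, propagate=True)

a, b = PDF((-1, 1)), PDF((0, 5))
s = SumPDF([a, b])
s.set_norm_range((-5, 5), propagate=False)  # daughters keep their ranges
assert (a.norm_range, b.norm_range) == ((-1, 1), (0, 5))
s.set_norm_range((-5, 5))  # propagate defaults to True here
assert a.norm_range == b.norm_range == (-5, 5)
```

Whether propagation should default to True is exactly the open question raised above.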
Strong distinction between space and norm_range
chances: 90% idea: each model lives in a space; pdfs additionally have a norm_range
Every model lives in a space with limits (not enforced?). This makes plotting, sampling etc. easier, because every model has limits it is supposed to live in. And these do not have to coincide with the norm_range. This also allows setting everything outside the space to 0 (not as default behavior! But a pdf could implement itself like that if it wants to).
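As a toy illustration of the space/norm_range split (assumed behaviour with a hypothetical class; returning 0 outside the space is opt-in here, not a default):

```python
# Hypothetical sketch: the space is where the model lives, the
# norm_range is what it normalizes over, and they may differ.
import math

class UniformPDF:
    def __init__(self, space, norm_range=None):
        self.space = space                     # where the model lives
        self.norm_range = norm_range or space  # defaults to the space

    def pdf(self, x):
        lo, hi = self.space
        if not (lo <= x <= hi):
            return 0.0  # this pdf opts in to "zero outside the space"
        nlo, nhi = self.norm_range
        return 1.0 / (nhi - nlo)  # normalized over norm_range, not space

p = UniformPDF(space=(0, 10), norm_range=(0, 5))
assert p.pdf(12) == 0.0             # outside the space
assert math.isclose(p.pdf(3), 0.2)  # 1 / (5 - 0)
```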
Model vs pdf vs func
chances: 70% idea: make clearer what is what, and maybe use func in pdf as well / model as the base
It is probably cleaner to create a Model for a func that is going to be used as a pdf as well. Or have a ConvertToPDF class that explicitly converts (yes, similar to what we have with as_pdf; more of an internal change). Another idea is to add the func method to pdfs as well, returning the unnormalized_pdf in the right format, and let the caller decide whether to use it as a pdf or as a func. Concern: is this always clear? (It often is: the loss knows whether it has a pdf or a func, SumPDF resp. SumFunc as well...)
Difference from func to pdf: a pdf returns a 1-D array (nevents,), while a func returns a 2-D array (nevents, n_obs) or equivalent (zfit.Data -> behaves the same).
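A shape-only sketch of that distinction using plain Python lists (the arithmetic is a placeholder; only the output shapes matter):

```python
# Hypothetical sketch of the pdf vs func shape convention:
# func: (nevents, n_obs) -> (nevents, n_obs)
# pdf:  (nevents, n_obs) -> (nevents,)

def unnormalized_func(data):
    # a func keeps the obs axis: shape (nevents, n_obs)
    return [[v * 2.0 for v in event] for event in data]

def pdf(data):
    # a pdf reduces over the obs axis: shape (nevents,)
    # (summing is just a stand-in for the real per-event density)
    return [sum(event) for event in unnormalized_func(data)]

data = [[1.0, 2.0], [3.0, 4.0]]  # nevents=2, n_obs=2
assert pdf(data) == [6.0, 14.0]                  # 1-D, (nevents,)
assert unnormalized_func(data)[0] == [2.0, 4.0]  # 2-D, (nevents, n_obs)
```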
More class interface usage for pdfs
chances: 60% idea: propagate usage of classes instead of Python operators
Also with the model vs pdf vs func discussion, it may be even clearer what the intentions are when e.g. doing frac1 * pdf1 + frac2 * pdf2. Doing a bigger sum this way e.g. leads to odd structures (recursive sums: a SumPDF of pdf1 and a SumPDF of pdf2 and a SumPDF etc.). To be cleaner, SumPDF should be used throughout.
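A toy example of why the operator form produces recursive sums while an explicit SumPDF stays flat (simplified classes, not zfit's implementation):

```python
# Hypothetical sketch: "+" can only build binary sums, so chaining it
# nests SumPDFs; the class form takes all components at once.

class PDF:
    def __add__(self, other):
        return SumPDF([self, other])  # operator: always a binary sum

class SumPDF(PDF):
    def __init__(self, pdfs):
        self.pdfs = pdfs

    def depth(self):
        # nesting depth of SumPDFs below (and including) this one
        return 1 + max((p.depth() for p in self.pdfs
                        if isinstance(p, SumPDF)), default=0)

p1, p2, p3, p4 = PDF(), PDF(), PDF(), PDF()
nested = p1 + p2 + p3 + p4       # SumPDF(SumPDF(SumPDF(p1, p2), p3), p4)
flat = SumPDF([p1, p2, p3, p4])  # one SumPDF over all four
assert nested.depth() == 3
assert flat.depth() == 1
```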
Behavior of extended pdf
chances: 90% idea: reduce the difference between an "extended" and a non-extended pdf to a mere attribute change (it has a yield or not, but the integral etc. stays the same)
In order to make things less complicated, e.g. the integral should be the same whether a pdf is extended or not. The difference is only whether a pdf has a yield or not.
Also the loss may only make a difference via a flag extended=True/False?
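Sketched with a toy class (not zfit's API), the proposal means the yield is only an attribute and the normalized integral is untouched:

```python
# Hypothetical sketch: "extended" is just "has a yield"; the normalized
# integral is identical for extended and non-extended pdfs, and only a
# separate extended integral multiplies in the yield.

class ToyPDF:
    def __init__(self, yield_=None):
        self.yield_ = yield_  # None -> non-extended

    @property
    def is_extended(self):
        return self.yield_ is not None

    def integrate(self, limits=None):
        # normalized integral over the full range: always 1,
        # whether the pdf is extended or not
        return 1.0

    def ext_integrate(self, limits=None):
        if not self.is_extended:
            raise RuntimeError("pdf has no yield")
        return self.yield_ * self.integrate(limits)

p = ToyPDF()
p_ext = ToyPDF(yield_=100.0)
assert p.integrate() == p_ext.integrate() == 1.0  # same integral
assert p_ext.ext_integrate() == 100.0             # yield only enters here
```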
SumPDF does not extend automatically
chances: 80% idea: whether n or n-1 coeffs are given, the SumPDF does not extend but will use them as fracs.
If a SumPDF sums extended pdfs, the (normalized) yields are used as fracs. If non-extended pdfs are summed, the given fracs are used and the pdfs remain unextended. If there are n fracs, maybe we automatically scale them by sum(fracs)? I suggest leaving that to the user (more freedom); the latter is also supported by Matthieu.
Stricter shapes
chances: 90% idea: define consistent, stricter shapes for data
To be able to use partial integrals, do batched integrals etc., we need to have the shape well defined. The idea is to stick to the same convention as TensorFlow Probability (because it works well): (n_events, (batch_shape), n_obs), where batch_shape can be left out.
This allows to always access an event on the 0th axis and a given obs on the last axis (-1). With tf.gather it is easy to access them, and in general to use these as arguments (e.g. for tf.reduce_sum). The axes between 0 and -1 can be used as batch axes; more explicitly, an axis at -2 should be added. zfit will maybe provide a convenient slicer to treat this as an implementation detail (e.g. zfit.get_axis(tensor, int) with int the number of the axis to access). And/or a zfit.obs_axis, zfit.event_axis returning simply -1 or 0.
Additionally, a mechanism to specify the out_obs in a function could allow it to return Data, to be even clearer with obs and axes.
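A minimal illustration of the convention with plain shape tuples; zfit.obs_axis / zfit.event_axis are the hypothetical helpers mentioned above, sketched here as module-level constants:

```python
# Hypothetical sketch of the (n_events, (batch_shape), n_obs) convention:
# events always on axis 0, observables always on axis -1, so the same
# accessors work whether or not a batch axis sits in between.

event_axis = 0   # stand-in for the proposed zfit.event_axis
obs_axis = -1    # stand-in for the proposed zfit.obs_axis

def n_events(shape):
    return shape[event_axis]

def n_obs(shape):
    return shape[obs_axis]

shape_plain = (1000, 3)       # (n_events, n_obs)
shape_batched = (1000, 4, 3)  # (n_events, batch, n_obs)
assert (n_events(shape_plain), n_obs(shape_plain)) == (1000, 3)
assert (n_events(shape_batched), n_obs(shape_batched)) == (1000, 3)
```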
Flexible, heavier Space
Chances: 60% (but later) idea: allow arbitrarily shaped limits
Rectangular limits are fine, but not complete. E.g. in amplitude analyses we have a phasespace limit for our pdf, so a differently shaped Space would make sense in order to normalize over the right area (currently we assume the rectangular area for normalization, which is not actually true AFAIK). Maybe have a sampler attached there, e.g. a grid_sampler (for a grid), a random_sampler and an importance_sampler (for integration).
This would require:
- inside or similar, telling whether points are inside the limits
- rect_limits, returning the min and max in each axis (e.g. for plotting)
Why: for flexibility. Not an urgent feature, but probably what completes the spaces next to Space (and VectorSpace). It is more in the philosophy of: an average user doesn't create one, but a user (or we) still has the possibility to do it.
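A toy non-rectangular Space showing the two required pieces (the names inside and rect_limits follow the proposal above; the triangle is just an example):

```python
# Hypothetical sketch of a non-rectangular Space: points are classified
# by inside(), while rect_limits exposes the bounding box per axis.

class TriangleSpace:
    """Points with 0 <= x, 0 <= y, x + y <= 1."""

    def inside(self, point):
        x, y = point
        return x >= 0 and y >= 0 and x + y <= 1

    @property
    def rect_limits(self):
        # bounding box (min per axis, max per axis), e.g. for plotting
        return ((0.0, 0.0), (1.0, 1.0))

space = TriangleSpace()
assert space.inside((0.2, 0.3))
assert not space.inside((0.8, 0.8))  # inside the box, outside the space
assert space.rect_limits == ((0.0, 0.0), (1.0, 1.0))
```

The last assertion is the key point: the bounding box alone over-covers the space, which is exactly why normalizing over the rectangular area is not quite right.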
Autoconvert float to ComposedParameter
Chances: 99% idea: create a FixedParameter (or alternative name?) by default from floats
Currently, if a float is given in place of a parameter for a pdf, it is converted to a zfit.Parameter but fixed, so it can still be varied later on. This, however, slows things down, and I don't think there is a use case for it. So the behavior does not change a lot. Except: if you want a parameter to be floating at some point, you need to create a zfit.Parameter.
Flexibility and Debugging vs Performance
Chances: 80% idea: as a design decision, prefer flexibility and debugging over performance.
We can run a minimization either by calling sess.run after each step (as done with Minuit and needed for any external minimizer) or, in principle, in pure TF with a tf.while loop. TensorFlow is very efficient inside a single session run: e.g. it finds common subexpressions, caches, etc. With control flow such as tf.while and tf.cond (if-else), a lot can be expressed as a single graph (and therefore run in a single sess.run). While this has the potential of being more performant (3-4x at most, by my estimation), it comes at a huge cost: anything "dynamic" has to be implemented in the graph.
We cannot have both. The dynamic choice is as fast as it is now and is the more flexible and debuggable one. The sole disadvantage is performance: caching has to be implemented completely manually and may not be as efficient. Having both options won't work; I strongly tend towards the more user-friendly, flexible version.
Remove enforced limits in parameters
Chances: unknown idea: don't push limits into parameters, but just use them for validation
Currently we have enforced limits in parameters (clipped) and pass them directly to the minimizers. E.g. Minuit then basically alters their values with a transformation. The question here is: should we really do that? Isn't it cleaner to have limits only tell whether a parameter is in the valid range or not? Or even to throw an error if an invalid value (outside the limits) is set?
I question the use case. But I clearly lack the experience and specific knowledge on minimizer strategies and the usefulness of limits; it would be good to talk with an expert about this at some point (or here, if anyone knows it well).
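A sketch of the validation-only alternative (hypothetical behaviour under discussion, not zfit's current clipping):

```python
# Hypothetical sketch: limits are used purely for validation, so setting
# a value outside them raises instead of the value being clipped or
# transformed for the minimizer.

class Parameter:
    def __init__(self, name, value, lower=None, upper=None):
        self.name, self.lower, self.upper = name, lower, upper
        self.value = value  # goes through the validating setter below

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, val):
        if self.lower is not None and val < self.lower:
            raise ValueError(f"{self.name}: {val} below lower limit {self.lower}")
        if self.upper is not None and val > self.upper:
            raise ValueError(f"{self.name}: {val} above upper limit {self.upper}")
        self._value = val

mu = Parameter("mu", 1.0, lower=0.0, upper=5.0)
mu.value = 4.0  # fine, inside the limits
try:
    mu.value = 7.0
except ValueError:
    print("rejected, not clipped")
```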