Open jonas-eschle opened 5 years ago
About the conversion of float to ComposedParameter: wouldn't it be confusing for the user? A ComposedParameter is basically any kind of tensor (whereas floats are just a "special kind" of tensor). But we could have a FixedParameter that better reflects that behavior. Or a flag: guarantee_const or similar.
Some kind of conversion should happen though, since it makes dependency handling much easier: we can then rely on the fact that any parameter is a ZfitParameter object and has e.g. a name, dependents, etc.
Would you prefer a pure number? Or how would you suggest making the autoconversion clearer to the user? A well-named class like FixedParameter or AutoParameter, or similar?
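To illustrate, a minimal sketch of how such an autoconversion could look. All names here (ZfitParameter as a plain base class, FixedParameter, convert_to_parameter) are hypothetical stand-ins for the discussion, not zfit's actual API:

```python
# Hypothetical sketch: floats given in place of a parameter are wrapped
# in a FixedParameter, so every parameter is a ZfitParameter-like object
# with a name and dependents. Not zfit's real implementation.

class ZfitParameter:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def get_dependents(self):
        return {self}

class Parameter(ZfitParameter):
    floating = True   # can be varied by a minimizer

class FixedParameter(ZfitParameter):
    floating = False  # guaranteed constant, cheap for dependency handling

def convert_to_parameter(value, name="autoparam"):
    """Return parameters unchanged; wrap bare floats in a FixedParameter."""
    if isinstance(value, ZfitParameter):
        return value
    return FixedParameter(name, float(value))

mu = convert_to_parameter(1.2)
print(type(mu).__name__, mu.floating)  # FixedParameter False
```

The point of the fixed/guaranteed-const variant is that dependency handling can rely on a uniform interface without paying the cost of a floatable variable.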
I am not against a conversion. ComposedParameter might be worrisome ("composed with what is my parameter?!"), while FixedParameter or AutoParameter are clear names.
Otherwise, for norm range as part of a pdf, Strong distinction between space and norm_range, and SumPDF does not extend automatically, I agree with the proposed changes.
For SumPDF does not extend automatically: if the user provides n fractions, they have to make sure these sum up to 1.0.
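A small sketch of that rule, with a hypothetical check_fracs helper (not zfit's actual validation): with n fracs the user guarantees the sum, with n-1 fracs the last one is implied:

```python
# Hypothetical validation helper for the "n or n-1 fracs" rule.
import math

def check_fracs(fracs, n_pdfs):
    if len(fracs) == n_pdfs:
        # n fracs given: the user is responsible for normalization,
        # so only validate that they sum to 1.0.
        if not math.isclose(sum(fracs), 1.0, rel_tol=1e-9):
            raise ValueError("n fracs given, but they do not sum to 1.0")
        return list(fracs)
    if len(fracs) == n_pdfs - 1:
        # n-1 fracs given: the last fraction is implied.
        return list(fracs) + [1.0 - sum(fracs)]
    raise ValueError("need n or n-1 fracs")

print(check_fracs([0.2, 0.3], 3))  # [0.2, 0.3, 0.5]
```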
Perfect, fully agree!
Following is a proposal for breaking changes. Mostly thoughts, but realistic ones. Feel free to comment, but it's not urgent.
Regarding the API, restrictions and so on, my general idea is to be a little bit more pythonic. Or: be less restrictive in what you can do (e.g. different norm ranges for different pdfs inside a sum), but be more consistent with the examples, so that there is one way to do things (e.g. always create a Space with obs and limits and use it for both pdf and data).
This means that more advanced use cases can be covered easily, while simple things will still be done right (as copy-pasting and writing similar code is expected to cover a lot of the simpler cases anyway).
"chances" are my view of how likely it is that something really gets implemented (how much I am personally convinced).
Implemented:
norm range as part of a pdf
chances: 98% idea: treat norm_range as a "fixed parameter" of a pdf (which it actually is, the normalization factor).
consequences:
- no use of norm_range in pdf.pdf(x) anymore; exclusively use set_norm_range instead.
- A SumPDF's daughters can have different norm_ranges. Using set_norm_range(norm_range, propagate=True) allows (by default True?) to also set the norm_ranges of the daughters.
- in the loss, data_range and norm_range don't have to agree. fit_range could be used to force the same range for both (if that is even still needed).
- norm_range is an additional argument to __init__ (though it defaults to space).
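A toy sketch of how set_norm_range with a propagate flag could behave (simplified stand-in classes, not zfit's implementation):

```python
# Hypothetical sketch: daughters of a SumPDF may keep their own
# norm_ranges; propagation to the daughters is controlled by a flag.

class PDF:
    def __init__(self, norm_range=None):
        self.norm_range = norm_range

    def set_norm_range(self, norm_range, propagate=True):
        self.norm_range = norm_range

class SumPDF(PDF):
    def __init__(self, pdfs, norm_range=None):
        super().__init__(norm_range)
        self.pdfs = pdfs

    def set_norm_range(self, norm_range, propagate=True):
        self.norm_range = norm_range
        if propagate:  # also set the daughters' norm_ranges
            for pdf in self.pdfs:
                pdf.set_norm_range(norm_range, propagate=True)

a, b = PDF((-1, 1)), PDF((0, 5))
s = SumPDF([a, b])
s.set_norm_range((-5, 5), propagate=False)  # daughters keep their ranges
assert (a.norm_range, b.norm_range) == ((-1, 1), (0, 5))
s.set_norm_range((-5, 5))  # propagate defaults to True here
assert a.norm_range == b.norm_range == (-5, 5)
```

Whether propagation should default to True is exactly the open question raised above.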
Strong distinction between space and norm_range
chances: 90% idea: each model lives in a space; pdfs additionally have a norm_range
Every model lives in a space with limits (not enforced?). This makes plotting, sampling etc. easier, because every model has limits it is supposed to live in. And these do not have to coincide with the norm_range. This also allows setting everything outside the space to 0 (not as default behavior! But a pdf could implement itself like that if it wants to).
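As a toy illustration of the space/norm_range split (assumed behaviour with a hypothetical class; returning 0 outside the space is opt-in here, not a default):

```python
# Hypothetical sketch: the space is where the model lives, the
# norm_range is what it normalizes over, and they may differ.
import math

class UniformPDF:
    def __init__(self, space, norm_range=None):
        self.space = space                     # where the model lives
        self.norm_range = norm_range or space  # defaults to the space

    def pdf(self, x):
        lo, hi = self.space
        if not (lo <= x <= hi):
            return 0.0  # this pdf opts in to "zero outside the space"
        nlo, nhi = self.norm_range
        return 1.0 / (nhi - nlo)  # normalized over norm_range, not space

p = UniformPDF(space=(0, 10), norm_range=(0, 5))
assert p.pdf(12) == 0.0             # outside the space
assert math.isclose(p.pdf(3), 0.2)  # 1 / (5 - 0)
```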
Model vs pdf vs func
chances: 70% idea: make clearer what is what, and maybe use func in pdf as well / model as the base
It is probably cleaner to create a Model for a func that is going to be used as a pdf as well. Or have a ConvertToPDF class that explicitly converts (yes, similar to what we have with as_pdf; more of an internal change). Another idea is to add the func method to pdfs as well, returning the unnormalized_pdf in the right format, and let the caller decide whether to use it as a pdf or as a func. Concern: is this always clear? (It often is: the loss knows whether it has a pdf or a func, SumPDF resp. SumFunc as well...)
Difference from func to pdf: a pdf returns a 1-D array (nevents,), while a func returns a 2-D array (nevents, n_obs) or equivalent (zfit.Data -> behaves the same).
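A shape-only sketch of that distinction using plain Python lists (the arithmetic is a placeholder; only the output shapes matter):

```python
# Hypothetical sketch of the pdf vs func shape convention:
# func: (nevents, n_obs) -> (nevents, n_obs)
# pdf:  (nevents, n_obs) -> (nevents,)

def unnormalized_func(data):
    # a func keeps the obs axis: shape (nevents, n_obs)
    return [[v * 2.0 for v in event] for event in data]

def pdf(data):
    # a pdf reduces over the obs axis: shape (nevents,)
    # (summing is just a stand-in for the real per-event density)
    return [sum(event) for event in unnormalized_func(data)]

data = [[1.0, 2.0], [3.0, 4.0]]  # nevents=2, n_obs=2
assert pdf(data) == [6.0, 14.0]                  # 1-D, (nevents,)
assert unnormalized_func(data)[0] == [2.0, 4.0]  # 2-D, (nevents, n_obs)
```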
More class interface usage for pdfs
chances: 60% idea: propagate usage of classes instead of Python operators
Also with the model vs pdf vs func discussion, it may be even clearer what the intentions are when e.g. doing frac1 * pdf1 + frac2 * pdf2. Doing a bigger sum this way e.g. leads to odd structures (recursive sums: a SumPDF of pdf1 and a SumPDF of pdf2 and a SumPDF etc.). To be cleaner, SumPDF should be used throughout.
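A toy example of why the operator form produces recursive sums while an explicit SumPDF stays flat (simplified classes, not zfit's implementation):

```python
# Hypothetical sketch: "+" can only build binary sums, so chaining it
# nests SumPDFs; the class form takes all components at once.

class PDF:
    def __add__(self, other):
        return SumPDF([self, other])  # operator: always a binary sum

class SumPDF(PDF):
    def __init__(self, pdfs):
        self.pdfs = pdfs

    def depth(self):
        # nesting depth of SumPDFs below (and including) this one
        return 1 + max((p.depth() for p in self.pdfs
                        if isinstance(p, SumPDF)), default=0)

p1, p2, p3, p4 = PDF(), PDF(), PDF(), PDF()
nested = p1 + p2 + p3 + p4       # SumPDF(SumPDF(SumPDF(p1, p2), p3), p4)
flat = SumPDF([p1, p2, p3, p4])  # one SumPDF over all four
assert nested.depth() == 3
assert flat.depth() == 1
```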
Behavior of extended pdf
chances: 90% idea: reduce the difference between an "extended" and a non-extended pdf to a mere attribute change (it has a yield or not, but the integral etc. stays the same)
In order to make things less complicated, e.g. the integral should be the same whether a pdf is extended or not. The difference is only whether a pdf has a yield or not.
Also the loss may only make a difference via a flag extended=True/False?
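Sketched with a toy class (not zfit's API), the proposal means the yield is only an attribute and the normalized integral is untouched:

```python
# Hypothetical sketch: "extended" is just "has a yield"; the normalized
# integral is identical for extended and non-extended pdfs, and only a
# separate extended integral multiplies in the yield.

class ToyPDF:
    def __init__(self, yield_=None):
        self.yield_ = yield_  # None -> non-extended

    @property
    def is_extended(self):
        return self.yield_ is not None

    def integrate(self, limits=None):
        # normalized integral over the full range: always 1,
        # whether the pdf is extended or not
        return 1.0

    def ext_integrate(self, limits=None):
        if not self.is_extended:
            raise RuntimeError("pdf has no yield")
        return self.yield_ * self.integrate(limits)

p = ToyPDF()
p_ext = ToyPDF(yield_=100.0)
assert p.integrate() == p_ext.integrate() == 1.0  # same integral
assert p_ext.ext_integrate() == 100.0             # yield only enters here
```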
SumPDF does not extend automatically
chances: 80% idea: whether n or n-1 coeffs are given, the SumPDF does not extend but will use them as fracs.
If a SumPDF sums extended pdfs, the (normalized) yields are used as fracs. If non-extended pdfs are summed, the given fracs are used and the pdfs remain unextended. If there are n fracs, maybe we automatically scale them by sum(fracs)? I suggest leaving that to the user (more freedom); the latter is also supported by Matthieu.
Stricter shapes
chances: 90% idea: define consistent, stricter shapes for data
To be able to use partial integrals, do batched integrals etc., we need to have the shape well defined. The idea is to stick to the same convention as TensorFlow Probability (because it works well): (n_events, (batch_shape), n_obs), where batch_shape can be left out.
This allows to always access an event on the 0th axis and a given obs on the last axis (-1). With tf.gather it is easy to access them, and in general to use these as arguments (e.g. for tf.reduce_sum). The axes between 0 and -1 can be used as batch axes; more explicitly, an axis at -2 should be added. zfit will maybe provide a convenient slicer to treat this as an implementation detail (e.g. zfit.get_axis(tensor, int) with int the number of the axis to access). And/or a zfit.obs_axis, zfit.event_axis returning simply -1 or 0.
Additionally, a mechanism to specify the out_obs in a function could allow it to return Data, to be even clearer with obs and axes.
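A minimal illustration of the convention with plain shape tuples; zfit.obs_axis / zfit.event_axis are the hypothetical helpers mentioned above, sketched here as module-level constants:

```python
# Hypothetical sketch of the (n_events, (batch_shape), n_obs) convention:
# events always on axis 0, observables always on axis -1, so the same
# accessors work whether or not a batch axis sits in between.

event_axis = 0   # stand-in for the proposed zfit.event_axis
obs_axis = -1    # stand-in for the proposed zfit.obs_axis

def n_events(shape):
    return shape[event_axis]

def n_obs(shape):
    return shape[obs_axis]

shape_plain = (1000, 3)       # (n_events, n_obs)
shape_batched = (1000, 4, 3)  # (n_events, batch, n_obs)
assert (n_events(shape_plain), n_obs(shape_plain)) == (1000, 3)
assert (n_events(shape_batched), n_obs(shape_batched)) == (1000, 3)
```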
Flexible, heavier Space
Chances: 60% (but later) idea: allow arbitrarily shaped limits
Rectangular limits are fine, but not complete. E.g. in amplitude analyses we have a phasespace limit for our pdf, so a differently shaped Space would make sense in order to normalize over the right area (currently we assume the rectangular area for normalization, which is not actually true AFAIK). Maybe have a sampler attached there, e.g. a grid_sampler (for a grid), a random_sampler and an importance_sampler (for integration).
This would require:
- inside or similar, telling whether points are inside the limits
- rect_limits, returning the min and max in each axis (e.g. for plotting)
Why: for flexibility. Not an urgent feature, but probably what completes the spaces next to Space (and VectorSpace). It is more in the philosophy of: an average user doesn't create one, but a user (or we) still has the possibility to do it.
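A toy non-rectangular Space showing the two required pieces (the names inside and rect_limits follow the proposal above; the triangle is just an example):

```python
# Hypothetical sketch of a non-rectangular Space: points are classified
# by inside(), while rect_limits exposes the bounding box per axis.

class TriangleSpace:
    """Points with 0 <= x, 0 <= y, x + y <= 1."""

    def inside(self, point):
        x, y = point
        return x >= 0 and y >= 0 and x + y <= 1

    @property
    def rect_limits(self):
        # bounding box (min per axis, max per axis), e.g. for plotting
        return ((0.0, 0.0), (1.0, 1.0))

space = TriangleSpace()
assert space.inside((0.2, 0.3))
assert not space.inside((0.8, 0.8))  # inside the box, outside the space
assert space.rect_limits == ((0.0, 0.0), (1.0, 1.0))
```

The last assertion is the key point: the bounding box alone over-covers the space, which is exactly why normalizing over the rectangular area is not quite right.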
Autoconvert float to ComposedParameter
Chances: 99% idea: create a FixedParameter (or alternative name?) by default from floats
Currently, if a float is given in place of a parameter for a pdf, it is converted to a zfit.Parameter but fixed, so it can still be varied later on. This, however, slows things down, and I don't think there is a use case for it. So the behavior does not change a lot. Except: if you want a parameter to be floating at some point, you need to create a zfit.Parameter.
Flexibility and Debugging vs Performance
Chances: 80% idea: as a design decision, prefer flexibility and debugging over performance.
We can run a minimization either by calling sess.run after each step (as done with Minuit and needed for any external minimizer) or, in principle, in pure TF with a tf.while loop. TensorFlow is very efficient inside a single session run: e.g. it finds common subexpressions, caches, etc. With control flow such as tf.while and tf.cond (if-else), a lot can be expressed as a single graph (and therefore run in a single sess.run). While this has the potential of being more performant (3-4x at most, by my estimation), it comes at a huge cost: anything "dynamic" has to be implemented in the graph.
We cannot have both. The dynamic choice is as fast as it is now and is the more flexible and debuggable one. The sole disadvantage is performance: caching has to be implemented completely manually and may not be as efficient. Having both options won't work; I strongly tend towards the more user-friendly, flexible version.
Remove enforced limits in parameters
Chances: unknown idea: don't push limits into parameters, but just use them for validation
Currently we have enforced limits in parameters (clipped) and pass them directly to the minimizers. E.g. Minuit then basically alters their values with a transformation. The question here is: should we really do that? Isn't it cleaner to have limits only tell whether a parameter is in the valid range or not? Or even to throw an error if an invalid value (outside the limits) is set?
I question the use case. But I clearly lack the experience and specific knowledge on minimizer strategies and the usefulness of limits; it would be good to talk with an expert about this at some point (or here, if anyone knows it well).
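A sketch of the validation-only alternative (hypothetical behaviour under discussion, not zfit's current clipping):

```python
# Hypothetical sketch: limits are used purely for validation, so setting
# a value outside them raises instead of the value being clipped or
# transformed for the minimizer.

class Parameter:
    def __init__(self, name, value, lower=None, upper=None):
        self.name, self.lower, self.upper = name, lower, upper
        self.value = value  # goes through the validating setter below

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, val):
        if self.lower is not None and val < self.lower:
            raise ValueError(f"{self.name}: {val} below lower limit {self.lower}")
        if self.upper is not None and val > self.upper:
            raise ValueError(f"{self.name}: {val} above upper limit {self.upper}")
        self._value = val

mu = Parameter("mu", 1.0, lower=0.0, upper=5.0)
mu.value = 4.0  # fine, inside the limits
try:
    mu.value = 7.0
except ValueError:
    print("rejected, not clipped")
```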