JelleAalbers / blueice

Build Likelihoods Using Efficient Interpolations and Monte-Carlo Generated Events
BSD 3-Clause "New" or "Revised" License

Source-wise interpolation #46

Open hammannr opened 2 weeks ago

hammannr commented 2 weeks ago

If I have two sources a and b in a model, each with two shape parameters s_a0, s_a1 and s_b0, s_b1, the interpolation of pdfs and expectation values currently happens on a 4D grid (s_a0, s_a1, s_b0, s_b1). This can quickly and unnecessarily blow up the required memory.

This PR adds source-wise interpolation. For this, we keep track of which shape parameters each source listens to and can thus interpolate on two 2D grids, (s_a0, s_a1) and (s_b0, s_b1), instead of one 4D grid.
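The memory saving can be illustrated with a back-of-the-envelope count of anchor points. This is a hypothetical sketch (the grid sizes and parameter names are illustrative, not taken from blueice's actual config): with 10 anchor values per shape parameter, the joint 4D grid needs 10^4 anchor models, while two per-source 2D grids need only 2 × 10^2.

```python
# Hypothetical illustration of the memory argument: anchor-model counts
# for joint vs. source-wise interpolation. The number of anchors per
# parameter (10) is an assumption for the sake of the example.
n_anchors = 10
params_a = ["s_a0", "s_a1"]  # shape parameters source a listens to
params_b = ["s_b0", "s_b1"]  # shape parameters source b listens to

# Joint interpolation: one grid spanning all four shape parameters.
joint_grid_points = n_anchors ** (len(params_a) + len(params_b))

# Source-wise interpolation: one grid per source, sized by the
# parameters that source actually listens to.
sourcewise_grid_points = n_anchors ** len(params_a) + n_anchors ** len(params_b)

print(joint_grid_points)       # 10000 anchor models
print(sourcewise_grid_points)  # 200 anchor models
```

The saving grows exponentially with the number of sources, since each source's grid only spans its own parameters.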

For now, I added this as an option, so the default behaviour is unchanged. So far, everything I tested seems to work, but I'll do more thorough checks over the next few days.

hammannr commented 2 weeks ago

OK, it seems like I broke something -- very good to have unit tests! I'll fix it tomorrow.

hammannr commented 2 weeks ago

The reason for the now-failing tests is that I moved the expected_events property to the source, which I think makes sense. There, I compute the number similarly to before, but using the source's config entries livetime_days and rate_multiplier instead of the model's. In any case, I don't think changing the config values after init should be recommended. Instead, I'd understand the source.expected_events property, as well as the model.expected_events() method, as default values -- this is also how the likelihood treats them: if you call it with a different value of livetime_days, the expectation is simply scaled by that value.

So my suggestion would be to adapt the unit test to match the new definition. Another option would be to make source.expected_events a method with arguments livetime_days and rate_multiplier -- that way we could keep the current behaviour. What do you think @JelleAalbers @kdund ?
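The semantics proposed above can be sketched as follows. This is a minimal, hypothetical illustration, not blueice's actual code: the class, the config keys (`events_per_day` in particular), and the helper function are invented for the example; only the idea (a default expectation computed from the source's own config, linearly rescaled for a different livetime) comes from the comment.

```python
# Hypothetical sketch of the proposed semantics: source.expected_events
# is a default value computed from the source's own config entries, and
# a caller asking for a different livetime gets a simple linear rescaling.
class Source:
    def __init__(self, config):
        # Assumed config keys for this illustration; real blueice
        # configs differ.
        self.config = config

    @property
    def expected_events(self):
        # Default expectation from the source's own config entries
        # (rather than the model's), as described in the comment.
        c = self.config
        return c["events_per_day"] * c["livetime_days"] * c["rate_multiplier"]


def expected_events_at(source, livetime_days):
    # Linear rescaling relative to the default livetime -- one possible
    # reading of "simply scaled by this value".
    return source.expected_events * livetime_days / source.config["livetime_days"]
```

For example, a source with 2 events/day, 100 days of livetime, and a rate multiplier of 1.5 has a default expectation of 300 events, and asking for 200 days doubles it to 600.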

hammannr commented 1 week ago

So far, I mainly focused on the time and memory needed to initialize the likelihood. I now noticed that running a fit with source-wise interpolation is unfortunately even slightly slower than the previous implementation. Given the way I wrote the ps_interpolator, I shouldn't be too surprised -- I think every Python developer would punch me for this code. I tried improving it this afternoon but haven't succeeded yet. I'll keep trying tomorrow, but if anyone has a good suggestion, that would be highly appreciated! 😊
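One common way to keep a per-source pdf interpolation fast is to evaluate it for all events in a single vectorized call rather than looping in Python. The sketch below uses scipy's `RegularGridInterpolator` purely as an illustration of that pattern; it is not blueice's actual ps_interpolator, and all array names and shapes are invented for the example.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical setup: a source's pdf evaluated at each anchor point of
# its own 2D shape-parameter grid (s_0, s_1), for every observed event.
s0_anchors = np.linspace(0.0, 1.0, 5)
s1_anchors = np.linspace(0.0, 1.0, 5)
n_events = 1000

# pdf values at the anchors, shape (n_s0, n_s1, n_events); random here
# just to have something to interpolate.
rng = np.random.default_rng(0)
pdf_at_anchors = rng.random((len(s0_anchors), len(s1_anchors), n_events))

# RegularGridInterpolator supports trailing dimensions in the values
# array, so one interpolator handles all events at once.
itp = RegularGridInterpolator((s0_anchors, s1_anchors), pdf_at_anchors)

# A single call returns the interpolated pdf for every event at the
# requested shape-parameter point (s_0=0.3, s_1=0.7):
pdfs = itp([0.3, 0.7])[0]  # shape (n_events,)
```

The key point is that the per-event loop lives inside NumPy/SciPy, so each likelihood call costs one vectorized interpolation per source instead of one Python-level operation per event.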

hammannr commented 1 week ago

After some more profiling, I now think the likelihood calls are not the bottleneck (I also couldn't find a way to speed them up meaningfully). Instead, I suspect Minuit just doesn't love having many of these shape parameters, which make the likelihood non-smooth.