MESMER-group / mesmer

spatially-resolved ESM-specific multi-scenario initial-condition ensemble emulator
https://mesmer-emulator.readthedocs.io/en/latest/
GNU General Public License v3.0
23 stars 18 forks source link

frozen distribution considerably slower #532

Open mathause opened 2 weeks ago

mathause commented 2 weeks ago

Expression.evaluate(...) creates a frozen scipy distribution (frozen means that the params, e.g. loc are fixed). Using a frozen instance instead of a non-frozen one is considerably slower. It's the setup of the frozen instance which is costly and takes about 501 μs (on my laptop). So if we create one instance and call it repeatedly this does not matter much, but if we set it up every time it adds up. The setup time seems relatively independent of the number of elements.

This could also offer some speed ups - especially in optimization routines - but will require a change in how the functions are called.

import numpy as np
import scipy as sp

loc = np.random.randn(100)
scale = np.abs(np.random.randn(100))
x = np.random.randn(100)
dist = sp.stats.norm(loc=loc, scale=scale)

%timeit sp.stats.norm(loc=loc, scale=scale).rvs()
%timeit sp.stats.norm.rvs(loc=loc, scale=scale)

%timeit dist.rvs()
%timeit dist = sp.stats.norm(loc=loc, scale=scale)
mathause commented 1 week ago

I think the way to go is to try and avoid using evaluate if possible and move the dist methods to the Expression class. So for example, ppf, isf, loglike etc... After #539 this should be relatively straightforward. Especially in the fitting routines, where we call evaluate repeatedly this should speed up everything considerably. I also think that logically these functions belong to the Expression class (which, however, may need a rename to include "distribution").

So where we would currently do:

distrib = self.expr_fit.evaluate(x, self.data_pred)
support = distrib.support()

We would define:

class Expression:

    ...

    def support(self, coefficients_values, inputs_values):

        params = self.get_params(coefficients_values, inputs_values)
        return self.distrib().support(**params)

and can then use:

support = self.expr_fit.support(x, self.data_pred)

[!IMPORTANT] This is secondary stuff

mathause commented 1 week ago

I just saw that I did not mention that sp.stats.norm(loc=loc, scale=scale).rvs() is 25 times slower than sp.stats.norm.rvs(loc=loc, scale=scale) - this is an easy win to speed up numerical optimization.

495 μs ± 1.5 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
20.1 μs ± 216 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
21.1 μs ± 101 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
446 μs ± 2.85 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Of course evaluate_params does add some overhead as well, so the speed-up is not quite as dramatic but still considerable.