**mathause** opened 1 month ago
I think the way to go is to try to avoid using `evaluate` where possible and to move the dist methods (e.g. `ppf`, `isf`, `loglike`) to the `Expression` class. After #539 this should be relatively straightforward. Especially in the fitting routines, where we call `evaluate` repeatedly, this should speed everything up considerably. I also think that these functions logically belong to the `Expression` class (which, however, may need a rename to include "distribution").
So where we would currently do:
```python
distrib = self.expr_fit.evaluate(x, self.data_pred)
support = distrib.support()
```
We would define:
```python
class Expression:
    ...

    def support(self, coefficients_values, inputs_values):
        params = self.get_params(coefficients_values, inputs_values)
        # call the method on the non-frozen distribution and pass the params directly
        return self.distrib.support(**params)
```
and can then use:
```python
support = self.expr_fit.support(x, self.data_pred)
```
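A minimal, self-contained sketch of how the same pattern could extend to other dist methods such as `ppf`. The `get_params` body here is a stand-in (in the real class it would evaluate the expression from coefficients and inputs); the key point is that the method is called on the non-frozen scipy distribution with the params passed directly:

```python
import scipy.stats as sp_stats


class Expression:
    """Sketch only: wraps a non-frozen scipy distribution (e.g. sp_stats.norm)."""

    def __init__(self, distrib):
        # the distribution object itself, NOT a frozen instance
        self.distrib = distrib

    def get_params(self, coefficients_values, inputs_values):
        # stand-in: assume the coefficients already map to scipy params;
        # the real implementation evaluates the expression here
        return coefficients_values

    def ppf(self, q, coefficients_values, inputs_values):
        params = self.get_params(coefficients_values, inputs_values)
        # non-frozen call: no frozen-instance setup cost
        return self.distrib.ppf(q, **params)


expr = Expression(sp_stats.norm)
# median of a normal distribution is its loc
expr.ppf(0.5, {"loc": 1.0, "scale": 2.0}, None)  # → 1.0
```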
> [!IMPORTANT]
> This is secondary stuff:
>
> ```python
> self.expr_fit.support.from_params(**params)
> self.expr_fit.support(coefficients_values, inputs_values, *, sanity_check=False)
> ```
I just saw that I did not mention that `sp.stats.norm(loc=loc, scale=scale).rvs()` is 25 times slower than `sp.stats.norm.rvs(loc=loc, scale=scale)`; this is an easy win to speed up the numerical optimization.
```
495 μs ± 1.5 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
20.1 μs ± 216 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
21.1 μs ± 101 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
446 μs ± 2.85 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```
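The two call patterns above produce identical samples for the same seed, so switching is purely a performance change, not a behavioral one (a quick check, assuming `scipy` is available):

```python
import numpy as np
import scipy.stats as sp_stats

loc, scale = np.zeros(100), np.ones(100)

# frozen: sets up an rv_frozen instance on every call (the costly part)
frozen = sp_stats.norm(loc=loc, scale=scale).rvs(random_state=0)

# non-frozen: passes the params directly, skipping the setup
direct = sp_stats.norm.rvs(loc=loc, scale=scale, random_state=0)

# same seed, same samples: only the speed differs
np.testing.assert_allclose(frozen, direct)
```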
Of course `evaluate_params` does add some overhead as well, so the speed-up is not quite as dramatic, but it is still considerable.
`Expression.evaluate(...)` creates a frozen scipy distribution (frozen means the params, e.g. `loc`, are fixed). Using a frozen instance instead of a non-frozen one is considerably slower: it is the setup of the frozen instance that is costly, taking about 501 μs (on my laptop). So if we create one instance and call it repeatedly this does not matter much, but if we set it up every time it adds up. The setup time seems relatively independent of the number of elements.

This could also offer some speed-ups, especially in optimization routines, but will require a change in how the functions are called.