SyneRBI / SIRF

Main repository for the CCP SyneRBI software
http://www.ccpsynerbi.ac.uk

Caching mechanism for `AcquisitionModel` norm #1243

Open paskino opened 5 months ago

paskino commented 5 months ago

When the norm of an `AcquisitionModel` is called it gets calculated and returned. However, when it is called again the norm is recalculated, as the value is not cached. This can lead to a considerable increase in computation time.

I made a thin wrapper for the SIRF MR `AcquisitionModel`, linked below.

https://github.com/paskino/SIRF-Contribs/blob/b8e542d4f7af1ed1e47eaed8517d4be0517aca17/src/notebooks/utils.py#L147-L162

Python already provides a caching mechanism: https://docs.python.org/3/library/functools.html#functools.lru_cache
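
For illustration, a minimal sketch of such a wrapper, in the spirit of the linked code (the class name is hypothetical): it assumes the wrapped model exposes a `norm()` method and simply stores the first computed value.

```python
class CachedNormModel:
    """Thin wrapper caching the norm of an acquisition model (sketch).

    Assumes `model` exposes a `norm()` method, as the SIRF acquisition
    models do; all other attributes are forwarded unchanged.
    """

    def __init__(self, model):
        self._model = model
        self._norm = None  # not computed yet

    def norm(self):
        # compute once, then return the cached value on every later call
        if self._norm is None:
            self._norm = self._model.norm()
        return self._norm

    def __getattr__(self, name):
        # delegate everything else to the wrapped model
        return getattr(self._model, name)
```

Alternatively, `functools.lru_cache` can wrap the bound method directly, e.g. `cached_norm = functools.lru_cache(maxsize=None)(model.norm)`, which also keys the cache on any arguments passed to `norm`.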

KrisThielemans commented 5 months ago

Ideally, this caching happens at the C++ level. However, the cache should be invalidated when a set_ function is called. Possibly the easiest thing to check is whether set_up() has been called. (In STIR, we have an already_set_up member for this.)

Unfortunately, anything with caching gets complicated in a multi-threaded context. We don't seem to have any multi-threading in SIRF though, so we probably don't care yet.

So, something like

```cpp
// set by set_up(); any set_* call that changes the model should reset it
// (and the cached norm)
bool already_setup = false;
// if negative, the norm hasn't been computed yet
mutable double cached_norm = -1.;

double norm() const
{
  if (cached_norm < 0)
    cached_norm = norm_calculation(); // placeholder for the actual computation
  return cached_norm;
}
```

evgueni-ovtchinnikov commented 5 months ago

I do not like the idea, I am afraid: we would need to invalidate the cache in every non-const method.

KrisThielemans commented 5 months ago

On the C++ side, we know what the non-const methods do, so we know when we have to invalidate (and that would just be when the underlying model changes). A bit painful, but not too much, I think.

On the Python side, const doesn't exist. People can modify member variables as much as they want (even though they are supposed to be well-behaved!). One more reason not to do this in Python...

KrisThielemans commented 5 months ago

For PET, there are maybe 8 methods that would have to invalidate the cache (changing the additive/background term doesn't change the norm, but fine).

So, the strategy would be to

Not painless :-( but far safer for the set_up part anyway (i.e. we need to do that to prevent crashes I think).
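
As a rough illustration of that pairing, here is a sketch at the Python level (even though the above argues for doing it in C++); `model` stands for an existing acquisition model, and `set_acquisition_sensitivity` stands in for any of the norm-affecting setters.

```python
import functools

# wrap the bound norm method; results are cached per call arguments
cached_norm = functools.lru_cache(maxsize=None)(model.norm)

n = cached_norm()                       # computed and cached
model.set_acquisition_sensitivity(asm)  # the underlying model has changed...
cached_norm.cache_clear()               # ...so the cached norm must be dropped
n = cached_norm()                       # recomputed for the modified model
```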

evgueni-ovtchinnikov commented 5 months ago

What if the user is after a more accurate computation of the norm, and just wants to re-run it with a larger number of iterations? (Or a different subset, or number of subsets.)

I believe it is the responsibility of the user's code to decide whether to use an already computed norm or to compute it again.
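
For what it's worth, an argument-keyed cache (such as the `functools.lru_cache` mentioned above) is compatible with that: a call with different arguments is a cache miss and triggers a fresh computation. The `num_iter` parameter below is only illustrative; the exact signature of `norm` differs between the PET and MR models.

```python
import functools

cached_norm = functools.lru_cache(maxsize=None)(model.norm)

n_quick = cached_norm(num_iter=2)   # computed and cached under these arguments
n_again = cached_norm(num_iter=2)   # same arguments: returned from the cache
n_exact = cached_norm(num_iter=10)  # different arguments: recomputed
```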