homotopy-io / homotopy-rs

A Rust/WASM implementation of homotopy.io
https://homotopy.io
BSD 3-Clause "New" or "Revised" License

LRU cache #67

Open jamievicary opened 3 years ago

jamievicary commented 3 years ago

An idea that Lukas and I discussed yesterday: use an LRU ("least recently used") cache to store the results of all core function calls (get slices, normalization, restriction, etc.). Given that we already hash the core structures themselves, this should be relatively easy to add.

My understanding of what this means: the first time a function is called on a particular input, we cache the result and use it as the return value for future calls on the same input. The LRU feature trims the cache occasionally, removing the least recently used entries.

This lightweight crate looks relevant:

https://docs.rs/lru/0.6.5/lru/
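
To make the idea concrete, here is a minimal sketch of wrapping a core call with the `lru` crate; `compute`, the key type, and the capacity of 128 are placeholders for illustration, not names from our codebase:

```rust
use lru::LruCache;

// Placeholder for an expensive core call (get slices, normalization, ...),
// keyed here by the hash we already compute for core structures.
fn compute(input: u64) -> u64 {
    input * 2
}

fn main() {
    // Fixed-capacity cache; inserting beyond capacity evicts the
    // least recently used entry.
    let mut cache: LruCache<u64, u64> = LruCache::new(128);

    let input = 42;
    let result = match cache.get(&input).copied() {
        Some(cached) => cached, // cache hit: skip the computation
        None => {
            let value = compute(input);
            cache.put(input, value); // cache miss: compute and store
            value
        }
    };
    println!("{}", result);
}
```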

NickHu commented 3 years ago

I think the idea is more that the cache is a fixed size and least recently used entries get evicted to make room for new ones, rather than being trimmed periodically. The more recently a particular function call has occurred, the higher the probability that its result is still cached. In practice, the user triggers the same function calls over and over at roughly the same stage of whatever they're proving, so the results shouldn't need to be calculated multiple times.
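
A small demonstration of those eviction semantics with the `lru` crate (capacity and keys are arbitrary):

```rust
use lru::LruCache;

fn main() {
    let mut cache = LruCache::new(2); // fixed capacity of 2
    cache.put("a", 1);
    cache.put("b", 2);
    cache.get(&"a"); // touching "a" makes "b" the least recently used
    cache.put("c", 3); // at capacity, so "b" is evicted, not "a"
    assert!(cache.get(&"b").is_none());
    assert_eq!(cache.get(&"a"), Some(&1));
}
```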

NickHu commented 3 years ago

This crate looks like it does pretty much exactly this: https://crates.io/crates/memoize (under "Further functionality").
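
If I'm reading the docs right, the `Capacity` option gives a bounded, LRU-evicting memo cache from a single attribute. A sketch, with `fib` standing in for one of our core functions:

```rust
use memoize::memoize;

// `#[memoize]` caches results keyed by the arguments; `Capacity`
// bounds the cache with LRU eviction (per the crate's
// "Further functionality" section). Recursive calls go through
// the cache too, since the macro wraps the function itself.
#[memoize(Capacity: 1024)]
fn fib(n: u64) -> u64 {
    if n < 2 { n } else { fib(n - 1) + fib(n - 2) }
}

fn main() {
    println!("{}", fib(40)); // near-instant with memoization
}
```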

jamievicary commented 3 years ago

@NickHu, thanks, that makes sense.

jamievicary commented 3 years ago

An important point: if we'd had something like this in place, Lukas's recent optimization for caching slices would have been unnecessary. I bet a caching system like this would yield further performance increases for us across the board.

zrho commented 3 years ago

While I agree that something like this can be useful, I think we should be more selective than applying it to each and every function, lest we cache functions that are about as fast to compute as a hash lookup and waste memory on top. In the case of the performance fix for type checking, we were calling the restrict_rewrite function literally hundreds of thousands of times to type check a single higher-dimensional contraction, so we saw a massive performance increase when memoization reduced that to only a few thousand calls. That is quite a different scenario from shaving off microseconds by caching some non-recursive function.
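
To illustrate the distinction with a toy example (not our actual restrict_rewrite): for a recursive function, each cache hit prunes an entire subtree of calls, which is where the hundreds-of-thousands-to-thousands reduction comes from; for a cheap leaf function, the hash lookup costs about as much as the call it saves.

```rust
use std::collections::HashMap;

// Toy stand-in for an expensive recursive core call. Each cache hit
// cuts off a whole recursive subtree, so memoization turns an
// exponential number of calls into a linear one.
fn expensive(n: u64, cache: &mut HashMap<u64, u64>) -> u64 {
    if let Some(&v) = cache.get(&n) {
        return v;
    }
    let result = if n < 2 {
        n
    } else {
        expensive(n - 1, cache) + expensive(n - 2, cache)
    };
    cache.insert(n, result);
    result
}

fn main() {
    let mut cache = HashMap::new();
    // Uncached this would take trillions of calls; cached, about 60.
    println!("{}", expensive(60, &mut cache));
}
```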

jamievicary commented 3 years ago

Agreed this should be applied selectively.

doctorn commented 3 years ago

This framework was designed by the Rust compiler team lead to support this kind of operation inside the Rust compiler:

https://github.com/salsa-rs/salsa

Might be worth having a look at! (Also, the DPhil I'm about to accept is all about categorical models of incremental computation, so I'd love to look at this aspect of the implementation!)
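
For anyone who wants a feel for it, here's a condensed version of the hello-world example from the salsa book (salsa 0.x API; details may differ across versions). Derived queries are memoized and only re-executed when the inputs they depend on change:

```rust
use std::sync::Arc;

// Queries are declared as a trait; salsa generates the plumbing.
#[salsa::query_group(HelloWorldStorage)]
trait HelloWorld {
    #[salsa::input]
    fn input_string(&self, key: ()) -> Arc<String>;

    // Derived query: memoized, and re-executed only when its
    // dependencies (here, input_string) change.
    fn length(&self, key: ()) -> usize;
}

fn length(db: &dyn HelloWorld, (): ()) -> usize {
    db.input_string(()).len()
}

#[salsa::database(HelloWorldStorage)]
#[derive(Default)]
struct Database {
    storage: salsa::Storage<Self>,
}

impl salsa::Database for Database {}

fn main() {
    let mut db = Database::default();
    db.set_input_string((), Arc::new("Hello, world".to_string()));
    println!("{}", db.length(())); // computed once
    println!("{}", db.length(())); // served from the memo table
}
```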