jamadden opened this issue 8 years ago
Seems reasonable to me to experiment. Maybe distributing it as a separate package would keep the focus tight. We might need to add a hook (an environment variable, maybe?) to let people configure which cache implementation to use, at least for benchmarking purposes.
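To make the hook idea concrete, here's a minimal sketch of env-var-based selection. The variable name `PERSISTENT_CACHE_IMPL` and the default dotted path are assumptions for illustration, not existing persistent APIs:

```python
# Hypothetical sketch: pick a cache implementation via an environment
# variable. Variable name and default path are assumptions, not real
# persistent configuration knobs.
import importlib
import os

def load_cache_class(default="persistent.picklecache.PickleCache"):
    """Return the cache class named by PERSISTENT_CACHE_IMPL, else the default."""
    path = os.environ.get("PERSISTENT_CACHE_IMPL", default)
    module_name, _, class_name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

A benchmark harness could then instantiate whichever implementation the variable names, without either package importing the other at module level.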
Distributing it (where "it" is an implementation of the PickleCache) as a separate package might be difficult, at least as far as the C version goes.
The CFFI version could be built and distributed separately, but it will always have overhead that a pure C implementation doesn't (although in RelStorage, the CFFI implementation is quite a bit faster than the cache implementation currently shipping with persistent).
Distributing a C version could be possible, but because it couldn't use the CPersistentRing struct that's embedded in a persistent object (the struct definition is quite different), we'd lose quite a lot of the benefit of that, and it would complicate memory management 😢 (CPython memory management is something I know relatively little about).
I could probably implement it here and run zodbshootout, but unfortunately that's not a very realistic workload (although RelStorage's cache did show notable improvements).
Drive-by :) comments:
Over in RelStorage we've been having a discussion based on the observation that strict LRU isn't necessarily a good cache policy for varying workloads.
Basically, if most queries/transactions/requests use a certain set of objects, those objects shouldn't be evicted from the cache just because an outlier request comes in and scans a different BTree. LRU is prone to that problem; more adaptive approaches aren't.
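The scan-pollution problem is easy to show with a toy LRU (not the real PickleCache): a hot working set that's used on every request is wiped out by a single outlier scan over cold keys.

```python
# Toy demonstration of LRU scan pollution. A minimal LRU cache built
# on OrderedDict; not the real PickleCache.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def access(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark as most recently used
        else:
            if len(self.data) >= self.capacity:
                self.data.popitem(last=False)  # evict least recently used
            self.data[key] = None

cache = LRUCache(capacity=100)
hot = [f"hot-{i}" for i in range(50)]
for _ in range(20):            # the hot set is used repeatedly
    for k in hot:
        cache.access(k)
for i in range(200):           # one outlier scan of 200 cold objects
    cache.access(f"scan-{i}")

# Every hot object has been evicted, despite being the most
# frequently used data in the workload.
print(sum(k in cache.data for k in hot))  # → 0
```

Strict recency is the only signal here, so the 200 once-touched scan keys displace objects that were touched 20 times each.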
@ben-manes pointed us to a policy that can be near optimal for a variety of workloads.
I think that would be a good fit for the PickleCache.
I have a C and a CFFI implementation that I'm using in RelStorage. We may or may not be able to share the code directly, I don't know (it'd be cool to split this off as its own library on PyPI and distribute binary wheels just for it, so I don't have to do that for RelStorage), but we could at least start with it.
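For a feel of the idea behind frequency-aware policies like the one referenced above, here's a rough Python sketch of frequency-based admission: a new entry may displace the LRU victim only if it has been seen more often. This is illustrative only (a plain `Counter` stands in for an approximate frequency sketch), not the C/CFFI code used in RelStorage:

```python
# Rough sketch of frequency-based admission on top of LRU. A Counter
# stands in for an approximate frequency sketch; illustrative only.
from collections import Counter, OrderedDict

class AdmissionLRU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.freq = Counter()       # approximate "how often was this seen"
        self.data = OrderedDict()

    def access(self, key):
        self.freq[key] += 1
        if key in self.data:
            self.data.move_to_end(key)  # usual LRU recency update
            return
        if len(self.data) < self.capacity:
            self.data[key] = None
            return
        victim = next(iter(self.data))  # the LRU eviction candidate
        if self.freq[key] > self.freq[victim]:
            del self.data[victim]       # newcomer wins admission
            self.data[key] = None
        # otherwise the newcomer is rejected and the victim stays

cache = AdmissionLRU(capacity=100)
for _ in range(20):
    for i in range(50):
        cache.access(f"hot-{i}")
for i in range(200):
    cache.access(f"scan-{i}")

# The hot set survives the scan: each scan key was seen once, so it
# cannot displace entries that were seen 20 times.
print(sum(f"hot-{i}" in cache.data for i in range(50)))  # → 50
```

Compare with the plain-LRU behavior: the same outlier scan that would flush a strict LRU cache leaves the hot working set intact here.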
Other notes:
Incremental garbage collection (`incrgc`) and tuning become simpler, as the cache always stays at its preferred size, automatically, based on the frequency and LRU-ness of the data. We know exactly which items are good to evict from the cache. The `drain_resistance` setting goes away.