Closed luser closed 7 years ago
Thanks for the pull request, and welcome! The contain-rs team is excited to review your changes, and you should hear from @bluss (or someone else) soon.
If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.
The Measure
associated type is a nice touch, I like that the heapsize
meter doesn't recalculate the size on each insertion.
Concerns:
insert
is only declared for M: CountableMeter
then how are you expected to use the cache when it is uncountable? What are some examples of countable meters -- shouldn't you always be able to tell what the sum or difference of two Measure
s is?capacity
should really just set a maximum Measure
which is tolerable -- in the case of Count
this could preallocate but other viable use-cases will likely deal with variable-size items where this matters less.Alternatively we could declare
pub trait Meter<K: Hash + Eq, V> {
type Measure: Ord + Copy;
/// Accumulate a key-value pair into the current measure.
fn add(&mut self, key: &K, val: &V);
/// Remove a key-value pair from the current measure.
fn remove(&mut self, key: &K, val: &V);
/// Remove value measure (for when an old value exists)
fn remove_val(&mut self, val: &V);
/// Fetch the current measure.
fn current(&self) -> Self::Measure;
}
And then also a
// maybe doc(hidden)?
/// Meter where the current measure can be derived from the cache.
pub trait DerivableMeter<K: Hash + Eq, V> {
// ...all same as `Meter` except for this:
fn current<S: BuildHasher>(&self, cache: &LruCache<K, V, S, Self>) -> Self::Measure;
}
// trivial implementation
impl<K: Hash + Eq, V, M> DerivableMeter<K, V> for M where M: Meter<K, V> { ... }
Then the cache is defined as
pub struct LruCache<K: Hash + Eq, V, S: BuildHasher, M: DerivableMeter<K, V>> {
map: ...,
max_measure: M::Measure,
meter: M,
}
and insert
might look like
fn insert(&mut self, key: K, val: V) -> Option<V> {
self.meter.add(&key, &val);
let old_val = self.map.insert(key, val);
if let Some(ref val) = old_val {
self.meter.remove_val(val);
}
while self.meter.current(&*self) > self.max_measure {
self.remove_lru()
}
old_val
}
Count
's Measure
could still be ()
but the current
method would fetch from the underlying map.
edit: miswrote; this whole idea was designed around using usize
for Count::Measure
That Travis failure looks like a transient network issue: "Unable to update registry https://github.com/rust-lang/crates.io-index"
If insert is only declared for M: CountableMeter then how are you expected to use the cache when it is uncountable? What are some examples of countable meters -- shouldn't you always be able to tell what the sum or difference of two Measures is?
CountableMeter
is a bit of a strange name, I worked through several designs before winding up on that. Basically, I wanted Count
to not keep an extraneous counter on the cache, so I couldn't rely on having the Measure
type implement Add + Sub
, or be convertible to usize
. I could have put all the methods of CountableMeter
on Meter
itself, but then users implementing Meter
would have to implement them, which seems silly for the general case. I could rename CountableMeter
to something else, it really just exists to paper over the quirk of Count
not storing a count (which is pretty ironic in hindsight).
The intention was for LruCache
to only work with types implementing CountableMeter
, so if I left a M: Meter
on it somewhere that's a mistake!
I don't really see how your DerivableMeter
proposal differs much from my existing CountableMeter
implementation. Is there something there I'm missing?
max_measure: M::Measure,
FYI, I don't see how that could be made to work with Count
having the unit type. It still needs to store that as a usize
.
@luser yeah, in the suggestion above Count::Measure
== usize
, but it can fetch it by directly implementing DerivableMeter
and calling cache.map.len()
Can I reassign this to you @rphmeier?
Sure thing @bluss
@rphmeier There was some back and forth there, but are there specific changes you'd like to see to this PR? I'm happy to fix things, I just don't think I understand what you're looking for.
Ping @rphmeier
@luser has forked the repo for their purposes, and we don't seem to have the bandwidth for this kind of work anymore.
Closing.
This commit allows creating LruCaches that are limited in size by something other than a count of entries. It does this by providing a new trait,
Meter
, where implementations of the trait provide ameasure
method to do the measurement. The default behavior of measuring by the count of entries is provided by aCount
type.Due to difficulties with mutating entries making it possible to change an item's measure, the
get_mut
anditer_mut
methods are provided only on LruCaches usingCount
for sizing.A bit of optimization is provided for the default case--to avoid having to redundantly store the item count the
Meter
trait allows implementations to provide a type for storing the measurement, and theCount
implementation uses()
. TheCountableMeter
trait allowsLruCache
to work with this or anyMeter
implementation that usesusize
for measurement.Finally an optional feature is added,
heapsize
, which uses the heapsize crate to provide HeapSizeOf, a measurement based on the heap memory usage of values stored in the cache.Thanks to @rphmeier for the design suggestions!
Since this makes
get_mut
not available on caches that aren't using the defaultCount
metric, you will probably want to get non-mutable get merged as well to mitigate that.One issue that's not addressed by this change is providing the list of removed items on an
insert
orset_capacity
that requires removing multiple items. This would make my use case simpler, so I may implement that as a follow-up once this PR is finished.