metrics-rs / metrics

A metrics ecosystem for Rust.
MIT License
1.1k stars 152 forks source link

Find a better design for `Registry` to support generic keys and memoized hashing. #272

Closed tobz closed 2 years ago

tobz commented 2 years ago

As mentioned in #271, the new Registry design we implemented to support memoized hashing of keys (among other things) switched from allowing a generic key K to specifically requiring metrics::Key.

For certain use cases, this is suboptimal because end users may specifically want to use a custom key type after translating from metrics::Key on the Recorder frontend to whatever they want to use on the backend.

We should be able to find a design that allows us to constrain K, as well as sufficiently describe it, to allow Registry to understand what hasher it needs to use when rehashing. Practically speaking, Hashable gets us halfway there, we just need a little more information to let the key type also tell us what hasher to use if we need to rehash.

loyd commented 2 years ago

Thanks for addressing the problem.

we just need a little more information to let the key type also tell us what hasher to use if we need to rehash

Do you have an example of why a hasher should be customized instead of just exporting the used one by metrics?

loyd commented 2 years ago

Also, I've found that the previous implementation is not composable enough with additional information:

struct SomeKey {
    key: Key,
    additional_info: X,
}

impl Hashable for ExtKey {
    fn hashable(&self) -> u64 {
        // want to get a hash of `key`
        // and add hash of additional information
        // to get a new hash value
    }
}

Maybe it's better to use a hasher with the commutative property, or force hashing Key firstly. For instance, by getting incomplete hasher from Key:

self.key.hash()
    .append(self.additional_info)
    .finish()

Previously, I solved this by using double hashing, which increases the probability of collisions and still requires hashing on the hot path.

tobz commented 2 years ago

Do you have an example of why a hasher should be customized instead of just exporting the used one by metrics?

As you pointed out in #271, part of the value of Registry being generic over the key type was that it gave you the ability to use the code in a way that didn't fit my idea/metrics idea of how metrics should be referenced.. so my thought is that if there's an easy way to do so -- maybe even with a default value that uses the hasher from metrics -- that it would be nice to give users the ability to specify a custom hasher, as well.

Maybe they want to use a hash algorithm that has better performance on a specific CPU architecture, or they have a specific need for a DoS-resistant hashing algorithm, etc.

Also, I've found that the previous implementation is not composable enough with additional information:

Yeah, that's a fair point. Hashable is tricky because the whole point is that it's meant to be used for returning a memoized hash, so having to pass a hasher to it, or having it pass back the hasher, involves extra steps that the Registry really shouldn't need to care about.

Maybe there should just be a default implementation that uses Hash under-the-hood, so that derived implementations of Hash let it work automatically. Almost the same as deriving PartialEq but then doing an empty impl Eq for MyType {} block.

loyd commented 2 years ago

I've totally forgotten to thank you for the change, it seems to be much more flexible now!