allegro / bigcache

Efficient cache for gigabytes of data written in Go.
http://allegro.tech/2016/03/writing-fast-cache-service-in-go.html
Apache License 2.0

feature: Support uint64 keys without converting to string then back again #380

Open · coxley opened 10 months ago

coxley commented 10 months ago

Hey folks!

I have a use case where we already use xxh3 for hashing various things, including the keys for a previous cache. We're using bigcache now, but it's a bit awkward to call strconv.FormatUint on the key and then configure a dummy Hasher that does strconv.ParseUint to get it back.

Is there a reason this is omitted?

janisz commented 10 months ago

Hey, could you post some sample code? I don't get the question.

coxley commented 10 months ago

@janisz Sure thing.

We're already hashing obj in different places, and we use the same hashing approach for other use cases as well. So I'm opting to format/parse instead of adding a new, unexpected hasher. But it would be nice to just reuse the uint64 I already have.

package main

import (
    "context"
    "fmt"
    "strconv"
    "time"

    "github.com/allegro/bigcache/v3"
    "google.golang.org/protobuf/proto"

    pb "github.com/myco/protos/foobarbaz"
)

func main() {
    cache := NewBigCache()

    obj := getObject()
    key := hashObject(obj)

    // Because cache.Set() takes a string key, now I have to convert.
    // (Set also wants []byte for the value, hence the marshal.)
    data, _ := proto.Marshal(obj)
    cache.Set(strconv.FormatUint(key, 10), data)
}

func hashObject(obj *pb.Object) uint64 {
    // Internal xxh3 sync.Pool for buffer reuse
    hasher := Hasher()
    defer ReleaseHasher(hasher)

    hashString(hasher, obj.Foo)
    hashBar(hasher, obj.Bar)
    return hasher.Sum64()
}

func hashBar(hasher Hasher, bar *pb.Bar) {
    // Long switch statement to deal with one-of values
}

func NewBigCache() *bigcache.BigCache {
    cache, err := bigcache.New(context.TODO(), bigcache.Config{
        Shards:             1024,
        HardMaxCacheSize:   1024,
        LifeWindow:         time.Minute * 5,
        CleanWindow:        time.Second * 150,
        MaxEntriesInWindow: 150000,
        MaxEntrySize:       2048,
        // Relevant bit
        Hasher: dumbHasher{},
    })
    if err != nil {
        panic(err)
    }
    return cache
}

type dumbHasher struct{}

// Sum64 assumes that 's' is a string-formatted uint64 and panics if not
func (p dumbHasher) Sum64(s string) uint64 {
    u, err := strconv.ParseUint(s, 10, 64)
    if err != nil {
        panic(fmt.Sprintf("invalid argument: cannot convert %q to a uint64", s))
    }
    return u
}

janisz commented 10 months ago

Thank you. Now I understand. We use string as a key just because that was our historical use case. Now that Go supports generics, maybe we can make it more liberal in what we accept (e.g. accept comparable and also generify the hasher interface). What do you think? This change would then be backward compatible (except for requiring a newer Go version).
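
For illustration, a minimal sketch of that idea, assuming the hasher interface is generified over any comparable key type (these names are hypothetical, not the current bigcache API):

package sketch

// Hasher maps a key of type K onto the uint64 the shards use internally.
type Hasher[K comparable] interface {
    Sum64(key K) uint64
}

// Uint64Hasher is the trivial hasher for callers that already hold a uint64,
// e.g. a precomputed xxh3 sum: the key is used as-is.
type Uint64Hasher struct{}

func (Uint64Hasher) Sum64(key uint64) uint64 { return key }

// StringHasher mirrors the current default (FNV-1a 64) so existing
// string-keyed callers could keep working unchanged.
type StringHasher struct{}

func (StringHasher) Sum64(key string) uint64 {
    var h uint64 = 14695981039346656037 // FNV-1a 64 offset basis
    for i := 0; i < len(key); i++ {
        h ^= uint64(key[i])
        h *= 1099511628211 // FNV-1a 64 prime
    }
    return h
}

With that shape, BigCache could stay string-keyed by default (Hasher[string] backed by the current FNV implementation), while callers that already have an xxh3 sum would instantiate it with a Hasher[uint64] and skip the format/parse round trip entirely.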

coxley commented 10 months ago

We use string as a key just because that was our historical use case

But it's uint64 internally, isn't it? Or was that a change at some point?

Either way, @janisz, it sounds great to me. :)

janisz commented 10 months ago

But it's uint64 internally, isn't it? Or was that a change at some point?

Correct, and we should keep it that way. So the only change would be in the public methods and the hasher: the hasher needs to convert a comparable T into a uint64.

I'm not sure we can then provide a default hasher easily. Maybe we need to cut a v4 and have BigCache and GenericBigCache.
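
As a point of reference, a rough sketch of what a GenericBigCache wrapper over today's v3 API could look like (NewGeneric and the field names are assumptions, not an existing API). Note that a wrapper living outside the library still has to round-trip the hash through a string, and the default hasher then hashes that string again, which is exactly the overhead this issue is about:

package sketch

import (
    "context"
    "strconv"

    "github.com/allegro/bigcache/v3"
)

// GenericBigCache hashes keys of type K itself and feeds the resulting
// uint64 through the existing string-keyed API.
type GenericBigCache[K comparable] struct {
    cache *bigcache.BigCache
    hash  func(K) uint64 // no obvious default for an arbitrary comparable K
}

func NewGeneric[K comparable](ctx context.Context, cfg bigcache.Config, hash func(K) uint64) (*GenericBigCache[K], error) {
    c, err := bigcache.New(ctx, cfg)
    if err != nil {
        return nil, err
    }
    return &GenericBigCache[K]{cache: c, hash: hash}, nil
}

func (g *GenericBigCache[K]) Set(key K, entry []byte) error {
    // Formats to a string because that is all the v3 API accepts; an
    // in-library version would hand the uint64 straight to the shards.
    return g.cache.Set(strconv.FormatUint(g.hash(key), 10), entry)
}

func (g *GenericBigCache[K]) Get(key K) ([]byte, error) {
    return g.cache.Get(strconv.FormatUint(g.hash(key), 10))
}

A caller like the snippet above would then pass hashObject(obj) directly as the key with an identity func(uint64) uint64 as the hasher; doing the same thing inside the library (or a v4) is what would remove both the string round trip and the double hashing.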

coxley commented 10 months ago

Or put it in a different package so that the name isn't compromised. We did this with lazylru: