dgraph-io / ristretto

A high performance memory-bound Go cache
https://dgraph.io/blog/post/introducing-ristretto-high-perf-go-cache/
Apache License 2.0
5.53k stars 364 forks source link

[BUG]: ristretto used about 2 x MaxCost memory?? #320

Closed RyouZhang closed 1 month ago

RyouZhang commented 1 year ago

What version of Ristretto are you using?

0.11

What version of Go are you using?

go 1.19.3

Have you tried reproducing the issue with the latest release?

Yes

What is the hardware spec (RAM, CPU, OS)?

4c/8g Linux 5.4.219-126.411.amzn2.x86_64 #1

What steps will reproduce the bug?

package main

import ( "time" "fmt"

ro "github.com/dgraph-io/ristretto"
"github.com/google/uuid"

)

func main() {

ch := make(chan bool)

cache, err := ro.NewCache(&ro.Config{
    NumCounters: 1000000,    // number of keys to track frequency of (10M).
    MaxCost:     1024*1024*100, // maximum cost of cache (1GB).
    BufferItems: 64,         // number of keys per Get buffer.
    Metrics: true,
    IgnoreInternalCost:false,
})
if err != nil {
    panic(err)
}

for i:=0; i<10; i++ {
go func() {
    for {
        raw := make([]byte, 1024*16)
        cache.Set(uuid.NewString(), raw, int64(len(raw)))
        <-time.After(1 * time.Millisecond)
    }

}()
}
go func() {
    for {
        fmt.Println("cost:",(cache.Metrics.CostAdded()- cache.Metrics.CostEvicted())/1024.0/1024.0, "MB item:", cache.Metrics.KeysAdded() - cache.Metrics.KeysEvicted())
        <-time.After(1000 * time.Millisecond)
    }
}()
<-ch

}

Expected behavior and actual result.

in my demo, the max-cost is 100MB, but top your pid, you will find it's used 200MB memory

Additional information

No response

andys commented 1 year ago

You're calling Set() with a cost of 16, but actually storing more than double that, because you are converting UUID from raw bytes to a string ?:

fmt.Println("len = ", len(uuid.NewString()))

len = 36

RyouZhang commented 1 year ago

You're calling Set() with a cost of 16, but actually storing more than double that, because you are converting UUID from raw bytes to a string ?:

fmt.Println("len = ", len(uuid.NewString()))

len = 36

the UUID is key, the real value is 16K, and I don't is the reason, because memory cost 2x

andys commented 1 year ago

@RyouZhang sorry, my mistake. Have you tried going much bigger, like, at 1GB of byte values, does it use 2GB of memory or just 1.1GB?

RyouZhang commented 1 year ago

@RyouZhang sorry, my mistake. Have you tried going much bigger, like, at 1GB of byte values, does it use 2GB of memory or just 1.1GB?

yes, in my prod env, we set 3GB but real used 6+GB

nearsyh commented 1 year ago

After some reseach, I think the behaivor of ristretto is "correct". The in-use heap size is indeed ~100MB. I think the reason why the RES memory in top is higher is that the gc'ed memory is not returned to the OS. If we add debug.FreeOSMemory() in the cost printing loop (with a higher frequency, e.g. 1000ms -> 100ms), we can see a lower RES size.

github-actions[bot] commented 1 month ago

This issue has been stale for 60 days and will be closed automatically in 7 days. Comment to keep it open.