Closed homily707 closed 1 month ago
Before we discuss memory allocators, do you have a repro to examine? Tokenizer config is the biggest memory consumer, and you can release that memory with Tokenizer.Close(). Are you reusing the tokenizer struct, or are you creating multiple instances?
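To make the reuse question concrete, here is a minimal sketch of the two usage patterns. It does not call the real library: `fakeTokenizer`, `newFakeTokenizer`, and the `live` counter are hypothetical stand-ins that only mimic an object holding memory until `Close()` is called, so the numbers it prints illustrate the pattern and are not a measurement of the actual tokenizer.

```go
package main

import (
	"fmt"
	"strings"
)

// live counts hypothetical tokenizer instances that were opened but not closed.
// It stands in for native memory that is only released by Close().
var live int

// fakeTokenizer is a hypothetical stand-in, not the real tokenizer type.
type fakeTokenizer struct{ closed bool }

// newFakeTokenizer simulates loading a tokenizer config (the expensive part).
func newFakeTokenizer() *fakeTokenizer {
	live++
	return &fakeTokenizer{}
}

// Encode splits on whitespace; a real tokenizer would do far more.
func (t *fakeTokenizer) Encode(s string) []string { return strings.Fields(s) }

// Close releases the simulated memory; calling it twice is a no-op.
func (t *fakeTokenizer) Close() {
	if !t.closed {
		t.closed = true
		live--
	}
}

// encodeAllReusing creates one instance, reuses it for every input,
// and closes it exactly once when done.
func encodeAllReusing(inputs []string) int {
	tk := newFakeTokenizer()
	defer tk.Close()
	total := 0
	for _, in := range inputs {
		total += len(tk.Encode(in))
	}
	return total
}

// encodeAllLeaking creates a new instance per input and never closes it,
// so the simulated memory grows with every call.
func encodeAllLeaking(inputs []string) int {
	total := 0
	for _, in := range inputs {
		tk := newFakeTokenizer() // missing tk.Close()
		total += len(tk.Encode(in))
	}
	return total
}

func main() {
	inputs := []string{"hello world", "memory allocators in go"}
	fmt.Println(encodeAllReusing(inputs), live) // token count, instances still open
	fmt.Println(encodeAllLeaking(inputs), live)
}
```

If the repro looks like the second function, memory will keep rising regardless of which allocator is underneath, so that is worth ruling out first.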
While using the tokenizer, I've observed a continuous rise in memory consumption. Based on the discussion in https://github.com/golang/go/issues/53440 and the blog post at https://dgraph.io/blog/post/manual-memory-management-golang-jemalloc/, it appears that the issue stems from glibc's allocator not returning freed memory to the operating system promptly.
Considering this, I'm curious: is there any possibility that tokenizers might be adapted to utilize alternative memory allocators like jemalloc or tcmalloc in the future?