Open marselester opened 11 months ago
While we can certainly work towards improving the usage of strings, I don't see it being the highest contender for further improvements in Parca itself right now: https://pprof.me/a2522ef Having said that, if someone wants to work on this, I don't think anyone is going to be opposed to merging those changes!
Sounds good! It would be better to compare the memory footprint (resident set size) with and without string interning (i.e., storing only one copy of each distinct string) in production, where Parca receives lots of duplicate strings (labels, symbols). From what I understand, we won't see any difference in a heap profile; at least I didn't see one in benchmarks, but RSS and the size of the structs that held interned strings were smaller (measured with github.com/DmitriyVTitov/size).
There are some discussions about adding a package for this to the stdlib: https://github.com/golang/go/issues/62483
Have you already considered string interning to reduce memory consumption?
I am aware of several approaches:
- `map[string]string`
  - cons: it could grow forever
- `[]byte` slices, see https://github.com/go4org/intern/issues/18

It looks like Thanos got good results with `go4org/intern` (https://github.com/thanos-io/thanos/pull/5926), though I am not sure if it's OK to intern like that: `intern.GetByString(s).Get().(string)`, see https://github.com/go4org/intern/issues/19.

See also the "Optimizing string usage in Go programs" slides.
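To make the `map[string]string` approach concrete, here is a minimal sketch. It is not taken from Parca or Thanos: `Interner`, `NewInterner`, `Intern`, `Len`, and the `maxSize` cap are hypothetical names introduced for illustration. The cap is one crude way to address the "could grow forever" concern, assuming it is acceptable to drop the whole table and start over when it fills up.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Interner is a hypothetical helper (not part of Parca) that keeps one
// canonical copy of each distinct string it has seen.
type Interner struct {
	mu      sync.Mutex
	strs    map[string]string
	maxSize int // assumed cap on distinct strings; 0 means unbounded
}

// NewInterner returns an empty Interner with the given cap.
func NewInterner(maxSize int) *Interner {
	return &Interner{strs: make(map[string]string), maxSize: maxSize}
}

// Intern returns the canonical copy of s, storing s if it is new.
func (in *Interner) Intern(s string) string {
	in.mu.Lock()
	defer in.mu.Unlock()

	if c, ok := in.strs[s]; ok {
		return c
	}
	// Crude bound: once the cap is reached, drop the whole table.
	// Strings handed out earlier stay valid; they are just no longer shared
	// with strings interned after the reset.
	if in.maxSize > 0 && len(in.strs) >= in.maxSize {
		in.strs = make(map[string]string)
	}
	// Clone so the table does not pin a larger buffer that s may point into.
	c := strings.Clone(s)
	in.strs[c] = c
	return c
}

// Len reports how many distinct strings are currently stored.
func (in *Interner) Len() int {
	in.mu.Lock()
	defer in.mu.Unlock()
	return len(in.strs)
}

func main() {
	in := NewInterner(0)
	labels := []string{"job", "instance", "job", "job", "instance"}
	for _, l := range labels {
		_ = in.Intern(l)
	}
	fmt.Println(in.Len()) // prints 2: only "job" and "instance" are stored
}
```

Resetting the table is a blunt trade-off: it bounds memory, but duplicates can reappear after a reset. `go4org/intern` takes a different route and lets the garbage collector reclaim unused entries.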