parca-dev / parca

Continuous profiling for analysis of CPU and memory usage, down to the line number and throughout time. Saving infrastructure cost, improving performance, and increasing reliability.
https://parca.dev/
Apache License 2.0

String interning #3927

Open marselester opened 11 months ago

marselester commented 11 months ago

Have you already considered string interning to reduce memory consumption?

I am aware of several approaches:

It looks like Thanos got good results with go4org/intern (https://github.com/thanos-io/thanos/pull/5926), though I am not sure whether it's OK to intern like this: intern.GetByString(s).Get().(string) (see https://github.com/go4org/intern/issues/19).

See also the slides "Optimizing string usage in Go programs".

metalmatze commented 11 months ago

While we can certainly work on improving string usage, I don't see it as the top candidate for further improvements in Parca itself right now (see https://pprof.me/a2522ef). Having said that, if someone wants to work on this, I don't think anyone will be opposed to merging those changes!

marselester commented 11 months ago

Sounds good! It would be better to compare the memory footprint (resident set size) with and without string interning (i.e., storing only one copy of each string) in production, where Parca receives lots of duplicate strings (labels, symbols). From what I understand, we won't see any difference in a heap profile; at least, I didn't see one in benchmarks, but RSS and the size of the structs that held interned strings were smaller (measured with github.com/DmitriyVTitov/size).

kakkoyun commented 10 months ago

There is some discussion about adding an interning package to the standard library: https://github.com/golang/go/issues/62483