Open Thorsieger opened 4 days ago
and today :
Showing nodes accounting for 66.14GB, 94.50% of 70GB total
Dropped 165 nodes (cum <= 0.35GB)
Showing top 10 nodes out of 52
flat flat% sum% cum cum%
20.40GB 29.14% 29.14% 20.40GB 29.14% github.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).getExpandedGlobs
16.35GB 23.36% 52.50% 16.35GB 23.36% github.com/go-graphite/protocol/carbonapi_v3_pb.(*FetchRequest).UnmarshalVT
15.97GB 22.81% 75.31% 15.97GB 22.81% strings.(*Builder).grow
4.29GB 6.14% 81.45% 4.29GB 6.14% github.com/go-graphite/go-carbon/carbonserver.(*trieNode).fullPath
2.87GB 4.10% 85.55% 2.87GB 4.10% github.com/go-graphite/go-carbon/carbonserver.newFileNode (inline)
2.22GB 3.17% 88.72% 7.32GB 10.45% github.com/go-graphite/go-carbon/carbonserver.(*trieIndex).insert
1.44GB 2.06% 90.78% 1.44GB 2.06% github.com/go-graphite/go-carbon/carbonserver.(*trieNode).addChild (inline)
1.36GB 1.95% 92.73% 6.24GB 8.92% github.com/go-graphite/go-carbon/carbonserver.(*CarbonserverListener).expandGlobsTrie
0.78GB 1.12% 93.85% 0.78GB 1.12% github.com/go-graphite/go-carbon/carbonserver.(*trieIndex).newDir (inline)
0.45GB 0.65% 94.50% 4.88GB 6.97% github.com/go-graphite/go-carbon/carbonserver.(*trieIndex).query
If you need more information, please ask ;)
Hi @Thorsieger Yes, the pprof output is quite convincing: it looks like there's a memory leak in getExpandedGlobs and possibly in UnmarshalVT. This needs to be investigated. It may be less noticeable for us because we deploy at least once a month.
It also looks like a large "max-metrics-globbed" value is a main driver of the growth.
Describe the bug I am experiencing a slow but steady memory leak which forces a service restart every week or so.
Logs Memory usage over time on the physical server :
pprof (on one instance) :
Go-carbon Configuration:
Metric retention and aggregation schemas N/A
Simplified query (if applicable) N/A
Additional context I have a Graphite infrastructure that handles 2.4M metrics/minute. The storage part is composed of 4 go-carbon instances behind a carbon-c-relay. These 4 storage nodes are on a single physical server: 32 CPUs / 512GB RAM / NVMe storage.
go-carbon version :
ghcr.io/go-graphite/go-carbon:0.17.3
After checking existing issues, I tried the trie and trigram indexes, separately and together, with no effect. I enabled pprof; the output is above.
may be related to #579
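For anyone trying to reproduce this: one way to tell a real leak from allocation churn is to compare the live heap at two points in time, after a forced GC. Below is a minimal standalone sketch using only the Go standard library. It is not go-carbon code; `heapSnapshot` and `retained` are hypothetical names, and the retained slice is only a stand-in for whatever is actually being held.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// retained is a hypothetical stand-in for a leak: memory that stays
// reachable and therefore survives garbage collection.
var retained [][]byte

// heapSnapshot forces a GC so the numbers reflect live objects only, then
// returns the live heap size in bytes and a serialized heap profile.
func heapSnapshot() (uint64, []byte, error) {
	runtime.GC()

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return 0, nil, err
	}
	return ms.HeapAlloc, buf.Bytes(), nil
}

func main() {
	before, _, err := heapSnapshot()
	if err != nil {
		panic(err)
	}

	// Simulate growth: keep 64 MiB reachable through a global.
	for i := 0; i < 64; i++ {
		retained = append(retained, make([]byte, 1<<20))
	}

	after, profile, err := heapSnapshot()
	if err != nil {
		panic(err)
	}

	fmt.Printf("live heap before: %d bytes\n", before)
	fmt.Printf("live heap after:  %d bytes\n", after)
	fmt.Printf("heap profile size: %d bytes\n", len(profile))
}
```

On a running service the same comparison can be done by saving two heap profiles some days apart and diffing them with `go tool pprof -base old.pb.gz new.pb.gz`. Functions whose in-use space keeps growing between snapshots are the likely suspects.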