Open nsrip-dd opened 6 days ago
CC @golang/runtime @felixge
Thanks for the detailed issue! I'm glad this seems localized for now; I agree we should wait and see if this becomes more widespread.
Just wanted to confirm that the GODEBUG=profstackdepth=32 environment variable needs to be set at runtime, not at build time. Correct?
Yes, that's correct.
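For illustration, a minimal shell sketch of setting the variable at launch time rather than build time (the binary name ./myservice is a placeholder, not from this issue):

```shell
# GODEBUG is read when the process starts, so no rebuild is needed.
# Prefix the launch command with the variable (./myservice is a
# placeholder for the affected binary):
GODEBUG=profstackdepth=32 ./myservice

# Or export it for everything started from this shell:
export GODEBUG=profstackdepth=32
./myservice
```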
Copying from https://github.com/golang/go/issues/57175#issuecomment-2377500656:
This is theoretically a problem even with 32 frames: with left and right recursive frames, there are 2^32 combinations of those frames. Storing that many unique samples would likely use ~128GB of memory, assuming 1 byte per frame in a sample.
The reproducer above triggers extreme usage only with a higher stack limit because it has 16 constant leaf frames. That limits it to 2^16 combinations with a 32-frame limit, but 2^112 with a 128-frame limit. The production application that encountered this was presumably similar.
Go version
go1.23.1
Output of go env in your module/workspace:

What did you do?
We upgraded our Go services to Go 1.23.1. All of our services use continuous profiling and have the heap profiler enabled. Go 1.23 increased the default call stack depth for the heap profiler (and others) from 32 frames to 128 frames.
What did you see happen?
We saw a significant increase in memory usage for one of our services, in particular the
/memory/classes/profiling/buckets:bytes
runtime metric:

The maximum went from ~50MiB to almost 4GiB, an 80x increase. We also saw a significant increase in the time to serialize the heap profile, from <1 second to over 20 seconds.
We set the environment variable
GODEBUG=profstackdepth=32
to get the old limit, and the profiling bucket memory usage went back down.

What did you expect to see?
We were surprised at first to see such a significant memory usage increase. However, the affected program is doing just about the worst-case thing for the heap profiler. It parses complex, deeply-nested XML. This results in a massive number of unique, deep stack traces due to the mutual recursion in the XML parser. And the heap profiler never frees any stack trace it collects, so the cumulative size of the buckets becomes significant as more and more unique stack traces are observed.
See this gist for a (kind of kludgy) example program which sees a 100x increase in bucket size from Go 1.22 to Go 1.23.
I'm mainly filing this issue to document this behavior. Manually setting
GODEBUG=profstackdepth=32
mitigates the issue. I don't think anything necessarily needs to change in the runtime right now, unless this turns out to be a widespread problem.

cc @felixge