golang / go

The Go programming language
https://go.dev

runtime: pprof should report non-heap memory #15848

Open aclements opened 8 years ago

aclements commented 8 years ago

pprof currently only reports heap-allocated memory. However, pprof is meant to help with debugging memory footprint, and sometimes an application's memory footprint comes from non-heap sources, such as stacks or GC memory (the latter usually indicates a bug such as #15319, but such bugs are currently hard to track down).

Hence, pprof should report the other sources of memory footprint somehow, such as showing them as isolated boxes or simply showing them in the info box. This information is already recorded in profiles, though it's in "comments".

aclements commented 7 years ago

Over on #19324, @rsc pointed out:

Now that we can emit rich protos for the memory profile, we should make sure that the memory profile includes all memory taken from the system, not just heap memory.

pradeepsawlani commented 7 years ago

Currently I'm also seeing a similar issue where there is a possible leak in the stack memory of one particular thread. @aclements, you mention in the description, "This information is already recorded in profiles, though it's in 'comments'." Is there a way to extract this info manually until this is implemented?

aclements commented 7 years ago

Unfortunately, I think it's no longer recorded in the profiles since we switched to the proto format (@matloob, am I correct that the heap profile no longer records the memstats?). But you can always query that information by calling runtime.ReadMemStats yourself.

Though in your case I'm not sure this is exactly what you're looking for. This will tell you if you're using a lot of stack memory, but not why. I hadn't considered this before, but it's possible we could profile stack allocations much like we profile heap allocations, including the call stack that led to the allocation of a new/larger stack.
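
For reference, a minimal sketch of that runtime.ReadMemStats query, printing the non-heap fields alongside the heap for comparison (field names as documented on runtime.MemStats):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// Non-heap memory obtained from the OS, as reported by the runtime.
	fmt.Printf("StackSys:    %d bytes (goroutine stacks)\n", m.StackSys)
	fmt.Printf("MSpanSys:    %d bytes (mspan structures)\n", m.MSpanSys)
	fmt.Printf("MCacheSys:   %d bytes (mcache structures)\n", m.MCacheSys)
	fmt.Printf("BuckHashSys: %d bytes (profiling bucket hash table)\n", m.BuckHashSys)
	fmt.Printf("GCSys:       %d bytes (GC metadata)\n", m.GCSys)
	fmt.Printf("OtherSys:    %d bytes (other runtime allocations)\n", m.OtherSys)

	fmt.Printf("HeapSys:     %d bytes (heap, for comparison)\n", m.HeapSys)
	fmt.Printf("Sys:         %d bytes (total obtained from the OS)\n", m.Sys)
}
```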

rsc commented 7 years ago

We meant to add the MemStats to a comment in the proto. I checked with Raul that this was an appropriate use of the comment field. But I don't think we did it. Please file a bug if you'd like to get that added back.

matloob commented 7 years ago

Yes, the debug=0 proto format heap profiles no longer output memstats. But the debug=1 heap profiles should still output them.
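
For example, a minimal sketch of dumping the debug=1 heap profile from inside the program; with net/http/pprof imported, the same text is served at /debug/pprof/heap?debug=1:

```go
package main

import (
	"os"
	"runtime/pprof"
)

func dumpHeapProfile() error {
	// debug=1 writes the legacy text format, which ends with a
	// "# runtime.MemStats" section listing heap and non-heap stats.
	return pprof.Lookup("heap").WriteTo(os.Stdout, 1)
}

func main() {
	if err := dumpHeapProfile(); err != nil {
		panic(err)
	}
}
```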

pradeepsawlani commented 7 years ago

I have a Go binary (kubelet) running, and from smaps (/proc/<pid>/smaps) I can see stack memory increasing, but I could not map this back to which goroutine is leading to the increased stack usage. I'm a Go newbie; as far as I understand, the Go runtime maps multiple goroutines onto threads. I'm stuck on how to map a thread ID to the goroutine that is allocating memory on the stack. @aclements, I'm looking exactly for which call stack led to a large stack allocation; let me know if you need a separate issue filed for this. Until this is implemented, any pointers on how to debug? Which file should I be looking at for stack allocation?

aclements commented 7 years ago

@pradeepsawlani, I don't think the stack memory in smaps is particularly meaningful in Go. At most, that shows system stacks allocated for OS threads. It can't show goroutine stacks because the kernel can't distinguish those from any other heap memory. How large is the stack memory growing? How many OS threads does the process have?

pradeepsawlani commented 7 years ago

I see the number of OS threads (cat /proc/<pid>/smaps | grep -i stack | wc -l) as 110 after running the test case, with one OS thread allocating ~48MB (Private_Dirty) for its stack.

aclements commented 7 years ago

OS thread stacks are usually only on the order of 8MB, and the Go runtime uses (and hence faults in) very little of any OS stack. It may be that this is just an inaccurate way of counting the number of OS threads. Use ls -1 /proc/$PID/task | wc -l instead.
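
For a Go process inspecting itself, a rough in-process equivalent of that count on Linux (sketched here purely for illustration) is to count the entries under /proc/self/task:

```go
package main

import (
	"fmt"
	"os"
)

// countOSThreads counts this process's OS threads on Linux by listing
// /proc/self/task, mirroring `ls -1 /proc/$PID/task | wc -l`.
func countOSThreads() (int, error) {
	entries, err := os.ReadDir("/proc/self/task")
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	n, err := countOSThreads()
	if err != nil {
		panic(err)
	}
	fmt.Println("OS threads:", n)
}
```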

But this is all getting a bit off-topic for this issue. In a cgo-using binary, the Go runtime has very little control over OS thread allocation. It generally doesn't even know how big they are, so it wouldn't be able to reasonably profile them. It may be interesting for you to look at the existing threadcreate profile, which tells you where the OS threads are coming from.
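
A minimal sketch of dumping that profile from within the program; debug=1 gives readable text, and the same output is available at /debug/pprof/threadcreate?debug=1 when net/http/pprof is imported:

```go
package main

import (
	"os"
	"runtime/pprof"
)

func dumpThreadCreateProfile() error {
	// The "threadcreate" profile records the stack trace that led to the
	// creation of each OS thread; debug=1 writes it as readable text.
	return pprof.Lookup("threadcreate").WriteTo(os.Stdout, 1)
}

func main() {
	if err := dumpThreadCreateProfile(); err != nil {
		panic(err)
	}
}
```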

pradeepsawlani commented 7 years ago

@aclements Yup, I tried that before, but the threadcreate endpoint was only printing the number of OS threads and no stack dump.

ianlancetaylor commented 6 years ago

CC @hyangah

Mahdi-zarei commented 1 year ago

Hello, I have a server running a service with thousands of goroutines, and I am confident there are memory leaks, yet pprof does not give me any information, as it appears the memory is allocated on the stacks of some seemingly stuck goroutines. I really need this information to debug the issue. Could you please add the stack profiler to pprof so it can be of much greater help in apps with a great number of goroutines?