Multi-modal profiling - Githubissues

jcorbin commented 5 years ago

Problem Statement

From my viewpoint, the biggest thing stopping me from using github.com/pkg/profile is its hard stance on "no more than one kind of profiling at a time". Especially since my usual starting assumption is "show me both a CPU profile and a memory (allocs) profile".

Taking a CPU + memory profile, or a CPU + memory + goroutines + mutexes profile is actually quite a sane thing to do in my viewpoint. Further I do sometimes want to get both a trace and a cpu profile from the same long-running program, rather than go to all the trouble of running it twice.

So with that introduction, let me lay out my understanding of why this is an okay thing to do (conversely, why the "one at a time" restriction is actually unnecessary and overly restrictive).

Kinds of Profiling

For the sake of my argument I'm going to focus primarily on the distinction between active and passive profiles.

There are two kinds of active profiles: CPU profiling and execution tracing. Here the cost of instrumentation is so high that they only run for a set amount of time (say take a 10-second CPU profile from a server, or re-running a tool with trace profiling turned on for the duration).

On the other hand there are passive profiles including allocs, block, goroutine, heap, mutex, and threadcreate. Here the instrumentation cost is low enough that these profiles are latently always on; their counts are collected all the time, and can be collected at any point in time.
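As a rough illustration of "can be collected at any point in time", here is a minimal sketch that dumps every passive profile named above using only the standard `runtime/pprof` package. The helper `dumpPassive` and the `passiveProfiles` slice are names made up for this example, not part of pkg/profile. One caveat worth keeping in mind: the block and mutex profiles stay empty unless their rates are raised with `runtime.SetBlockProfileRate` and `runtime.SetMutexProfileFraction`.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// passiveProfiles lists the predefined profiles that runtime/pprof registers.
// The slice name is an assumption made for this sketch.
var passiveProfiles = []string{"allocs", "block", "goroutine", "heap", "mutex", "threadcreate"}

// dumpPassive (hypothetical helper) writes each named profile into its own
// buffer and returns the buffers keyed by profile name.
func dumpPassive(names []string) (map[string]*bytes.Buffer, error) {
	out := make(map[string]*bytes.Buffer, len(names))
	for _, name := range names {
		p := pprof.Lookup(name)
		if p == nil {
			return nil, fmt.Errorf("unknown profile %q", name)
		}
		buf := new(bytes.Buffer)
		// debug=0 selects the compressed protobuf format read by go tool pprof.
		if err := p.WriteTo(buf, 0); err != nil {
			return nil, fmt.Errorf("write %s profile: %w", name, err)
		}
		out[name] = buf
	}
	return out, nil
}

func main() {
	dumps, err := dumpPassive(passiveProfiles)
	if err != nil {
		panic(err)
	}
	for _, name := range passiveProfiles {
		fmt.Printf("%s: %d bytes\n", name, dumps[name].Len())
	}
}
```

No start or stop call is involved: the counts already exist in the runtime, and the dump only serializes them.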

NOTE: this model is of course complicated by the fact that there is a tuning knob for the passive/always-on memory profiler, which pkg/profile is fairly unique in exposing as a user-facing API (to say nothing of the basically unusable CPU profiler hz tuning knob...)

Common Combinations

Of all those lovely modes of profiling, CPU, memory, and tracing are the most generally known, used, and useful, as evidenced by the prevalent flag idioms -cpuprofile FILE, -memprofile FILE, and -trace FILE.

There's really no problem in combining CPU profiling and memory profiling (at normal rates). To the contrary: since the memory profiler is always on anyhow, you may as well dump it at the end (in the context of an offline tool, which seems to be the use case that pkg/profile is most suited for). The same goes for any or all of the other passive profilers: they already have their counts just sitting there, so you're only doing yourself harm by not dumping the data.

The biggest concern comes when combining CPU profiling and tracing. But even there, at least in my experience, any cross-chatter is fairly easy to pick out or compensate for:

- in the event trace, you can at least see that a CPU profile was going on, since there will be at least one dedicated goroutine writing it out over some time span (you should even be able to see the os-interrupt timing... but I've not actually tried to do that, and now I'm purely speculating in a parenthetical...)
- there is real concern when it comes to skewing the CPU profile however, since every trace instrumentation branch is now hot, further inflating the (probably already dominant) impact of runtime functions
- in practice, for online servers, I always sequence tracing and CPU profiling for this reason, rather than do them in parallel; however for an offline tool where you're profiling its entire lifecycle, there are times when you want to see both concurrently (even if you also re-run it to get a "pure" trace and CPU profile)

Why I Care

In my view pkg/profile is very close to fully solving the use case of "lifetime profiling for an offline tool / batch task". I'd like to wrap some flag.Value implementation around it and use it as a dependency (maybe even send a pull request for adding the flag convenience).

However in its current form, not being able to take several at once is a bit of a blocker for that use case.
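To make the flag.Value idea concrete, here is a minimal sketch of a flag that accepts several profile modes at once. Everything in it is hypothetical: the `modeList` type, the `knownModes` set, and the `-profile` flag name are assumptions for illustration and not anything pkg/profile provides.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// modeList is a hypothetical flag.Value that collects requested profile modes.
type modeList []string

// knownModes is an assumed set of mode names for this sketch.
var knownModes = map[string]bool{
	"cpu": true, "mem": true, "trace": true,
	"block": true, "mutex": true, "goroutine": true, "threadcreate": true,
}

// String implements flag.Value.
func (m *modeList) String() string {
	if m == nil {
		return ""
	}
	return strings.Join(*m, ",")
}

// Set implements flag.Value. It accepts a comma-separated list and may be
// called repeatedly, appending to the modes collected so far.
func (m *modeList) Set(s string) error {
	for _, name := range strings.Split(s, ",") {
		name = strings.TrimSpace(name)
		if !knownModes[name] {
			return fmt.Errorf("unknown profile mode %q", name)
		}
		*m = append(*m, name)
	}
	return nil
}

func main() {
	var modes modeList
	fs := flag.NewFlagSet("tool", flag.ContinueOnError)
	fs.Var(&modes, "profile", "comma-separated profile modes, e.g. cpu,mem")
	if err := fs.Parse([]string{"-profile", "cpu,mem"}); err != nil {
		panic(err)
	}
	fmt.Println(modes.String()) // prints "cpu,mem"
}
```

A wrapper along these lines would then map each collected mode onto whatever start and stop calls the underlying profiling package exposes.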

davecheney commented 5 years ago

Thank you for the detailed request. Sadly the answer is no. I encourage you to fork this project and make the changes you want.

On 11 Aug 2019, at 02:49, Joshua T Corbin <notifications@github.com> wrote:


jcorbin commented 5 years ago

Thank you for the detailed request. Sadly the answer is no. I encourage you to fork this project and make the changes you want.

Fair enough, and this was pretty much the answer I expected to get btw, but I thought I'd try the Talk option before the Code option ;-)

In the interest of posterity, could you please share your thoughts on why writing out already-taken profile counts (from passive profiles as I described above) after the active profile stops wouldn't be something that you'd be willing to accept for pkg/profile? I totally get wanting active profiling to be exclusive (e.g. CPU and Trace at least; someone cranking up the memory profiler rate could also count, especially since you have that path carved out as a separate functional option...)

bygui86 commented 4 years ago

@davecheney I'm working on an evolution of your profile package. If you have time, it would be great if you could have a look at it :) I'm still working on the testing part, but I committed the implementation today. Here is the link: https://github.com/bygui86/profile

Thanks

klauspost commented 4 years ago

@bygui86 Did you abandon it? I don't see any changes in your fork?

bygui86 commented 4 years ago

@klauspost as my changes are pretty extensive, I decided to create a completely new module. You can find it here: https://github.com/bygui86/multi-profile Looking forward to your feedback!!

If the maintainers of this amazing module want to reach me to discuss merging our projects, I would be more than happy :)

mmcloughlin commented 3 years ago

I've had a go at this and pushed up work-in-progress to my experimental repo:

https://github.com/mmcloughlin/x/tree/master/profile

I started from scratch but matched the pkg/profile API (nearly completely), but this also supports running multiple profiles at once, and configuration with the idiomatic flags like go test.

Feedback appreciated.

davecheney commented 3 years ago

Hi, please don't take this the wrong way but I'm going to close this issue. It was my mistake for not closing the issue when I made this comment, https://github.com/pkg/profile/issues/46#issuecomment-520180837. My position has not changed, I encourage you to use the source, luke.

mmcloughlin commented 3 years ago

Yes you were very clear on that 😂

I should have clarified my intention. I was considering cleaning up my experimental code and creating github.com/mmcloughlin/profile. But I thought I'd seek feedback and gauge interest first, and people on this issue seem like a good audience.

bygui86 commented 3 years ago

@mmcloughlin I already made a fork of this project

you can find it here https://github.com/bygui86/multi-profile

your feedback is more than welcome!

mmcloughlin commented 3 years ago

@mmcloughlin I already made a fork of this project

you can find it here https://github.com/bygui86/multi-profile

your feedback is more than welcome!

Yes, I saw your project. Two reasons I prefer my approach (other than not-invented-here syndrome):

- More closely matches the existing pkg/profile API. In particular, defer profile.Start(profile.CPUProfile, profile.MemProfile).Stop() would work rather than two defer lines.
- Configuration by flags

mmcloughlin commented 3 years ago

Published: https://github.com/mmcloughlin/profile
