dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Profiling API #4033

Closed gregoryyoung closed 4 years ago

gregoryyoung commented 9 years ago

Is there any timeframe on a profiling API? Are there any public design docs available at this point?

Greg

jkotas commented 9 years ago

ICorProfiler profiling APIs are available in CoreCLR (on Windows; not yet on Unix). They work the same as in the full .NET Framework. The main difference is that the environment variables used to set up the profiler have a CORECLR_ prefix to avoid colliding with the full .NET Framework (i.e. CORECLR_ENABLE_PROFILING and CORECLR_PROFILER_PATH for CoreCLR vs. COR_ENABLE_PROFILING and COR_PROFILER_PATH for the full .NET Framework).
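
To make the naming difference concrete, here is a minimal C++ sketch that picks the variable names by runtime flavor. Only the four variable names come from the comment above; `Runtime`, `ProfilerEnvNames` and `profilerEnvNames` are hypothetical helpers for illustration and are not part of any CoreCLR API.

```cpp
#include <string>

// Hypothetical helper types for illustration; not part of CoreCLR.
enum class Runtime { CoreClr, FullFramework };

struct ProfilerEnvNames {
    std::string enableProfiling;  // variable that switches profiling on
    std::string profilerPath;     // variable that points at the profiler library
};

// Returns the environment variable names quoted in the comment above:
// CORECLR_* for CoreCLR, COR_* for the full .NET Framework.
ProfilerEnvNames profilerEnvNames(Runtime runtime) {
    const std::string prefix =
        (runtime == Runtime::CoreClr) ? "CORECLR_" : "COR_";
    return { prefix + "ENABLE_PROFILING", prefix + "PROFILER_PATH" };
}
```

Calling `profilerEnvNames(Runtime::CoreClr)` yields `CORECLR_ENABLE_PROFILING` and `CORECLR_PROFILER_PATH`, so a launcher script or test harness can target either runtime without hard-coding both sets.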

Are you looking for anything particular?

sawilde commented 9 years ago

Do you know why *_PROFILER_PATH doesn't come in 32-bit and 64-bit versions? It limits this feature: the processes being profiled could be a combination of both, but you can only register one.

jkotas commented 9 years ago

32-bit vs. 64-bit profiler path: It is a good suggestion. Could you please open a separate issue for it?
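
One way the suggestion could look, sketched in C++: a bitness-specific variable is consulted first, falling back to the single existing variable. The `_32`/`_64` suffixes, the `Environment` alias and the `resolveProfilerPath` helper are assumptions made for illustration; nothing in this thread says the runtime implements them.

```cpp
#include <map>
#include <string>

// Stand-in for the process environment so the lookup is easy to exercise.
using Environment = std::map<std::string, std::string>;

// Hypothetical lookup order for the suggestion above: prefer a
// bitness-specific variable, then fall back to CORECLR_PROFILER_PATH.
// The "_32"/"_64" suffixes are an assumption for illustration only.
std::string resolveProfilerPath(const Environment& env, bool is64BitProcess) {
    const std::string specific =
        std::string("CORECLR_PROFILER_PATH_") + (is64BitProcess ? "64" : "32");

    auto it = env.find(specific);
    if (it != env.end() && !it->second.empty()) {
        return it->second;  // bitness-specific path wins
    }

    it = env.find("CORECLR_PROFILER_PATH");
    return it != env.end() ? it->second : std::string();  // empty if unset
}
```

With this ordering a machine could register both a 32-bit and a 64-bit profiler library, while existing setups that only define the single variable would keep working.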

MattWhilden commented 9 years ago

I've opened dotnet/coreclr#601 to track the suggestion by @sawilde.

@gregoryyoung do you have enough information, or are there other specifics you'd like?

MattWhilden commented 9 years ago

Alright, I'm closing this down. Let me know if that was in error.

schourode commented 9 years ago

@jkotas You mention that the Profiling API is already available on Windows, but not yet on Unix. Is it the plan to provide this API (or something similar) on all platforms in the future?

If I am not mistaken, this API is what the guys over at NewRelic are missing in order to instrument .NET apps on non-Windows hosts. For reference: https://docs.newrelic.com/docs/agents/net-agent/getting-started/new-relic-net#app-frameworks

gregoryyoung commented 9 years ago

We are basically in the same boat.

jkotas commented 9 years ago

This issue should not have been closed. Yes, we should enable the profiling APIs on Unix to support the ecosystem of tools built on top of them going cross-plat. We are aiming for as much parity as possible for the runtime capabilities across platforms.

cc @sergiy-k @mikem8361 @noahfalk

amanda-mitchell commented 9 years ago

I've begun to work on this a bit. You can track my progress (or tell me that I'm doing something terrible) at https://github.com/david-mitchell/coreclr/tree/enable-profiling

gregoryyoung commented 9 years ago

:+1:

sawilde commented 9 years ago

:+1:

noahfalk commented 9 years ago

Thanks @david-mitchell! Last I heard, a new coworker here at msft will be looking at this area soon (he is still in the process of transferring teams). In the meantime feel free to reach out to me for any questions, feedback, PRs, etc.

amanda-mitchell commented 9 years ago

I've got the Profiling API building on OS X (haven't tried Linux yet), and I've created a proof-of-concept profiler at https://github.com/david-mitchell/CoreCLRProfiler

gregoryyoung commented 9 years ago

Just my .02: I actually prefer the Mono model on this one. It is far easier to comprehend than transporting the Windows model.

amanda-mitchell commented 9 years ago

@gregoryyoung the MS Profiling API has a number of features that are not present in mono's profiler. The New Relic profiler, for example, could not be built on top of Mono's Profiling API as it exists today. (see https://github.com/david-mitchell/NewRelicProfiler for more information on this)

sawilde commented 9 years ago

@david-mitchell - all I see is code, no details on the issues - perhaps you could expand more (even if you put it in the wiki).

As @gregoryyoung says the mono approach does seem much cleaner (once you get past the low documentation footprint) than the windows COM way; all that constant querying via COM interfaces (I know I've done my fair share of it) is awfully tedious.

I'd be interested to know where the gaps were when/if I decide to go ahead on the mono version of OpenCover.

amanda-mitchell commented 9 years ago

The main issue is that the MS version of the API provides for method rewriting (along with provisions for allocating memory to be used for this purpose and other related functionality), which Mono's does not.

In any case, the COM API already exists in CoreCLR, and there are advantages to supporting it—for example, porting existing Windows profilers to OS X/Linux should be much easier if a similar API is preserved.

kangaroo commented 9 years ago

Basically this boils down to rejit, which is a big thing for profiler vendors. No?

sawilde commented 9 years ago

Basically this boils down to rejit, which is a big thing for profiler vendors. No?

Nope, looked into it but in the end never really found a need for it. I can't talk for the commercial guys, but I've yet to see any code in the open-source community that used it outside of exploratory testing. If, however, as @david-mitchell says, there is no way to rewrite a method (I just assumed the profilers in the open-source space had never needed to, hence no obvious code sample), then that is a bit of a downer.

ayende commented 9 years ago

TypeMock does rejit, using the profiler API, IIRC.

mattwarren commented 9 years ago

ReJIT would be used by production profilers; otherwise they can't dynamically add/remove instrumentation on methods, i.e. the exact scenario outlined in this blog post: http://blogs.msdn.com/b/davbr/archive/2011/10/10/rejit-limitations-in-net-4-5.aspx

Also from the demos I've seen, I'm pretty sure that private eye by @gregyoung is using it.

mattwarren commented 9 years ago

Sorry that should've been private eye by @gregoryyoung

gregoryyoung commented 9 years ago

We are not using it at this time. On Mono, as an example, there is a method-entered callback already.

noahfalk commented 9 years ago

@gregoryyoung when you mentioned the Mono model was easier, are you referring specifically to the fact that it has a flat C style convention vs. COM interfaces, or other aspects of the API? Depending on what issues you are trying to address a small adapter might be able to expose a mono-like API on top of the existing COM API? I agree with @david-mitchell that the COM API feels like a good place to start the port.
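
The adapter idea can be sketched roughly as follows. Everything here is a stand-in: `IComStyleCallback` is a drastically simplified placeholder for a COM-style callback interface (the real ICorProfilerCallback has many more methods plus IUnknown plumbing), and `FlatProfilerHooks` and `FlatAdapter` are hypothetical names for a flat, C-style surface layered on top of it.

```cpp
#include <cstdint>

// Stand-in for a COM-style callback interface. The real ICorProfilerCallback
// is far larger and derives from IUnknown; this is only a placeholder shape.
struct IComStyleCallback {
    virtual ~IComStyleCallback() = default;
    virtual long Initialize() = 0;
    virtual long JITCompilationStarted(std::uintptr_t functionId) = 0;
};

// Hypothetical flat, C-style hook table in the spirit of a "mono-like" API.
struct FlatProfilerHooks {
    void* userData;
    void (*onInit)(void* userData);
    void (*onJitStarted)(void* userData, std::uintptr_t functionId);
};

// Adapter: implements the interface-style callbacks and forwards each one
// to the matching plain function pointer, skipping hooks left as nullptr.
class FlatAdapter final : public IComStyleCallback {
public:
    explicit FlatAdapter(const FlatProfilerHooks& hooks) : hooks_(hooks) {}

    long Initialize() override {
        if (hooks_.onInit) hooks_.onInit(hooks_.userData);
        return 0;  // 0 mirrors a success code such as S_OK
    }

    long JITCompilationStarted(std::uintptr_t functionId) override {
        if (hooks_.onJitStarted) hooks_.onJitStarted(hooks_.userData, functionId);
        return 0;
    }

private:
    FlatProfilerHooks hooks_;
};
```

A tool author would fill in only the hooks they care about and hand the adapter to the runtime-facing layer, leaving the interface querying in one place instead of spread through the tool.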

I can say that we've got some existing non-open source folks already using ReJIT / interested in using ReJIT soon. Short of some significant technical obstacle I've always imagined it on our .Net Core roadmap. The jump-stamp implementation that it relies on might not be trivial to port cross platform so I wouldn't be surprised if it was enabled a little later in the overall porting process.

@david-mitchell I took a quick look at your code changes so far. Looks good. My only comment at this point is that Maoni will probably appreciate it if we put the FEATURE_TRACE_EVENT ifdefs inside the implementation of the various ETW methods rather than around the call sites. I'm assuming the various platform linkers are all smart enough to eliminate calls to empty methods. I think @brianrob has been identifying a potential ETW substitute on other platforms, and a lot of these methods might start looking like:

void EmitEtwEventXYZ()
{
    #ifdef FEATURE_WINDOWS_ETW
    // do the right thing for ETW
    #elif defined(FEATURE_LINUX_EVENTS)
    // do the right thing for Linux
    // add more options as needed for other OS tracing technologies
    #endif
}

amanda-mitchell commented 9 years ago

@noahfalk, I definitely agree about keeping FEATURE_TRACE_EVENT in one place. I had just begun down that path when I saw your message.

brianrob commented 9 years ago

Agree with @noahfalk that we should keep the #ifdefs in the EmitEvent method. Going forward, we'll plan to create an abstraction layer that will encapsulate the platform specific pieces, so that call sites don't have to know anything about them.
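
A minimal sketch of that direction, assuming a single emit function that hides the platform switch from callers. `FEATURE_WINDOWS_ETW` and `FEATURE_LINUX_EVENTS` are the illustrative macros from the snippet earlier in the thread; `EmitEvent`, `Transport`, `OnModuleLoaded` and the return value are hypothetical and exist only so the sketch compiles and can be checked.

```cpp
#include <string>

// Hypothetical result type so the sketch has something observable.
enum class Transport { WindowsEtw, LinuxEvents, None };

// The platform #ifdefs live inside the implementation, so call sites stay
// free of conditional compilation. Macro names follow the earlier snippet
// and are illustrative, not confirmed build flags.
Transport EmitEvent(const std::string& eventName) {
    (void)eventName;  // a real implementation would hand this to the backend
#if defined(FEATURE_WINDOWS_ETW)
    return Transport::WindowsEtw;   // do the right thing for ETW
#elif defined(FEATURE_LINUX_EVENTS)
    return Transport::LinuxEvents;  // do the right thing for Linux
#else
    return Transport::None;         // no tracing backend: the call is a no-op
#endif
}

// A call site: no #ifdef needed here, whatever the platform.
Transport OnModuleLoaded() {
    return EmitEvent("ModuleLoaded");
}
```

When neither macro is defined the function body collapses to a trivial return, which is the case where an optimizer can drop the call entirely.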

ghost commented 8 years ago

Update on this: @sperling's PR https://github.com/dotnet/coreclr/pull/2520 is merged. From the PR description:

Some functionality is still missing.

  • MovedReferences, MovedReferences2, SurvivingReferences and SurvivingReferences2 will only be called when FEATURE_EVENT_TRACE is defined.
  • Enter, Leave and Tailcall function hooks are not implemented for Unix.
  • DoStackSnapshot is not implemented when PLATFORM_SUPPORTS_SAFE_THREADSUSPEND is not defined.

@sperling, @noahfalk would you please comment on what should be the next steps and area we should be focusing on? Also can dotnet/coreclr#2519 be closed in favor of this one? (also the milestones of both issues are clashing, although they are same; Future vs. 1.0.0-rtm)

noahfalk commented 8 years ago

@jasonwilliams200OK I'm fine to close dotnet/coreclr#2519 in favor of this one. @sperling do you think there is any reason to keep both?

For milestones I think 'future' is the right milestone (at least for when I foresee Microsoft folks getting time on it). Obviously if you guys want to push it along at a faster pace that's great!

As for next steps, I suppose it all depends what your goal is. For anyone who already has a profiling itch to scratch, for example a particular tool they want to port, by all means target fixing the APIs that matter for your tool.

However, for anyone who doesn't have a particular profiling itch to scratch and just wants to make some progress, here are a few concrete work items:

  a) Write a simple xplat profiler test to confirm the very basic APIs are working on different OSes. Even just loading a profiler that prints HelloWorld and exits would be a good start. Many more test cases are possible for probing various profiler APIs that likely already work.
  b) Investigate what it takes to get the EnterLeaveTailcall family of profiler APIs working. Write up some notes at minimum, or keep going and write some test cases or a working implementation.
  c) Admittedly not very exciting for many, but for those who enjoy spreadsheets, creating a big list of all the ICorProfiler/ICorProfilerInfo APIs with updateable status would be a good way to help track overall progress.

HTH, -Noah

gregoryyoung commented 8 years ago

Is there a reason why MethodEnter/MethodLeave would be left out/what is the plan on supporting these?

noahfalk commented 8 years ago

@gregoryyoung When you refer to MethodEnter/MethodLeave, I assume you mean ICorProfilerInfo::SetEnterLeaveFunctionHooks? If so there is no plan to leave them out, that is the work I am suggesting should be done in (b) above.

Did you mean some other methods instead? Sorry, I'm not sure I quite followed you : )

gregoryyoung commented 8 years ago

Nope that was the question (can't really imagine not having them). Is there any roadmap on this?

noahfalk commented 8 years ago

At the moment I think the roadmap doesn't have too much detail (in particular I don't have dates to offer if that is what you were hoping for), it is roughly...

.Net Core should get the same profiling goodness that desktop has. If anyone from the community has time to work on it that is fantastic, and otherwise folks from Microsoft will work on it after we've got debugging in a good place. The goal is to port all the profiler APIs as-is, in hopes that profiling tools which worked on full .Net will port smoothly to .Net Core. We are also porting our ETW events, though of course they will have to be emitted through other transport mechanisms on the non-Windows platforms.

Fire away if you've got more questions - I might not have answers but I'll do my best.

sperling commented 8 years ago

@jasonwilliams200OK @noahfalk I can't see any reason for keeping dotnet/coreclr#2519 open. Creating a simple xplat HelloWorld test would be the next logical step I think.

gregoryyoung commented 8 years ago

Is this intended to be supported for 1.0? Also method enter/leave hooks etc

mikem8361 commented 8 years ago

@noahfalk, @kspawa can you answer his question?

dotnetjt commented 8 years ago

I'd also love to know about ELT hooks for Unix/OSx

mkborg commented 8 years ago

@Dmitri-Botcharnikov FYI

gregoryyoung commented 8 years ago

ping?

kspawa commented 8 years ago

@gregoryyoung, @dotnetjt

Apologies for the delayed response! Somehow both me and Noah seem to have missed this one.

As part of the 1.0 release we have enabled and tested a small set of APIs on Windows. The list is below. It is very likely that the other APIs may also work and you are welcome to try them out and use them if they are working for you, however we have not got to testing the remaining APIs yet. The plan beyond 1.0 is to ensure the rest of the APIs on Windows work and enable them beyond windows to Linux/OsX in a staged manner. Please continue listing the ones that are of interest to you as that may help us prioritize the work better. Thanks for your inputs!

APIs currently tested on Windows (The plan is to get this list checked into a document on github, so that it exists as a living document and people could refer/modify it based on what they test is working.):

ICorProfilerCallback: Initialize, ModuleLoadFinished, ModuleUnloadStarted, ModuleUnloadFinished, ModuleAttachedToAssembly, JITCompilationStarted

ICorProfilerInfo: GetModuleInfo, GetModuleInfo2, GetModuleMetaData, GetFunctionInfo, SetILFunctionBody, SetILInstrumentedCodeMap, GetILFunctionBodyAllocator, GetRuntimeInformation, SetEventMask

The flags tested for SetEventMask are: COR_PRF_MONITOR_MODULE_LOADS, COR_PRF_MONITOR_JIT_COMPILATION, COR_PRF_DISABLE_INLINING
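
For illustration, the three flags listed above combine into a single mask by bitwise OR before being passed to SetEventMask. The enum below is a local stand-in so the sketch compiles without the SDK headers; the numeric values are assumed to match corprof.h and should be checked against the real header. `testedEventMask` and `hasFlag` are hypothetical helper names.

```cpp
#include <cstdint>

// Local stand-ins for the flags named above. Values are assumed to match
// corprof.h; verify against the real header before relying on them.
enum ProfilerEventFlags : std::uint32_t {
    COR_PRF_MONITOR_MODULE_LOADS    = 0x00000004,
    COR_PRF_MONITOR_JIT_COMPILATION = 0x00000020,
    COR_PRF_DISABLE_INLINING        = 0x00200000,
};

// Builds the combined mask for the three flags reported as tested.
constexpr std::uint32_t testedEventMask() {
    return COR_PRF_MONITOR_MODULE_LOADS |
           COR_PRF_MONITOR_JIT_COMPILATION |
           COR_PRF_DISABLE_INLINING;
}

// Checks whether a given flag is present in a mask.
constexpr bool hasFlag(std::uint32_t mask, ProfilerEventFlags flag) {
    return (mask & flag) != 0;
}
```

Under these assumed values the combined mask is 0x00200024. A real profiler would pass the equivalent value built from the genuine header constants.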

dotnetjt commented 8 years ago

Thanks for the info. SetEnterLeaveFunctionHooks seems to work for us on Windows (and we do get the hooks), but I was specifically curious about non-Windows. Is it the same case of "try it and see" right now?

noahfalk commented 8 years ago

Sorry from me as well for the late response, I also dropped the ball on this one. Yeah we haven't run any profiling tests on Mac/Linux yet within the CLR team at Msft, though I know some of the other community members have. I don't know if any of them tested the ELT APIs specifically though. Once we get a little page set up that says what is being officially tested in the office here perhaps we should add a section where others can comment about what they have tested? Happy to get feedback on that.

dotnetjt commented 8 years ago

I'd be happy to share with you all what we have tested on Windows. (Haven't found anything that doesn't work yet). Heading down the non-Windows path now and trying to just baseline so I don't spend too many cycles on pulling my hair out wondering if it's my code or the framework. 😀

mjsabby commented 8 years ago

I can confirm that ELT will not work because we haven't done the work in the JIT to support it on Linux or MacOSX.

kspawa commented 8 years ago

This document contains the running list of the status of profiling apis. Please add any that you have tested and know work/do not work for the benefit of others. Thanks!

gregoryyoung commented 8 years ago

I would like to express my concerns about whether this will actually work in RTM. This is not what I would expect from an RTM (maybe an RC). I have a hard time making the case that we should invest in supporting the API at this point.

noahfalk commented 8 years ago

@gregoryyoung - Thanks for the feedback! Do you have a thought about what would help address your concerns? I'm not sure if you are concerned that not enough APIs are supported yet, or you worry that even the supported ones may not work, or perhaps something else?

I'll certainly admit that supporting the entire profiling API surface is targeted for the 'future' milestone, not RTM. If we were doing this in our old non-OSS way we probably would have said this release doesn't support profiling at all. The fact that some portion of it happens to work and be tested now is simply us trying to be a bit more transparent about how the sausage is made. We hope that kind of info is useful to some folks in the community, but it's definitely a new way for us to work and I appreciate you taking the time to give us your feedback so that we can find what works best.

gregoryyoung commented 8 years ago

@noahfalk I am not trying to start drama etc. If you prefer, we can move the discussion to private.

My concerns basically are:

  a) things have not been tested with even a hello world example (this hello world example would also be useful for those of us who have to retarget builds etc). Even in the OSS world that I am normally in this would be prioritized.
  b) some rather important things are known to not work.

This feels more like a RC1 than a RTM.

I understand the new complications you are facing, but it confuses people. We as well are posed with questions of when will you support coreclr and why will you not support their RTM. To be fair, saying it's not supported explicitly would actually be better in many ways.

noahfalk commented 8 years ago

@gregoryyoung - No worries at all, those sound like perfectly legitimate concerns, not drama : ) You are welcome to email me at noahfalk@microsoft.com too if there is anything you'd prefer to discuss in private.

a) things have not been tested with even a hello world example

In regards to the 'HelloWorld' sample, we've got something in the works. I hope I'm not jinxing myself, but I'm aiming to alleviate this one very, very soon (Monday?). It's actually a sample that goes a bit beyond HelloWorld.

b) some rather important things are known to not work. This feels more like a RC1 than a RTM.

Agreed that some important profiler features won't work at RTM. There is always some tension between shipping earlier with fewer features, or waiting longer and having more features. I think most of our .NET developers aren't as concerned about profiling features if that means we will RTM sooner, and that is the way we chose to go. I know it's disappointing for your specific work, but if you also think we made a bad choice considering the .Net developer ecosystem at large, I hope you will call us on it.

I understand the new complications you are facing but it confuses people. We as well are posed with questions of when will you support coreclr and why will you not support their RTM.

I hope we can de-confuse this as much as possible. Our shared challenge is what is the least confusing thing we could tell you in the face of large uncertainties? On the topic of 'when can we support coreclr', I assume that means when would coreclr get profiler API support. Largely that was a question of when would we be finished with all the work we thought was higher priority and who would be available to work on it. In my head I made some predictions at various points over the last year. All my predictions turned out to be wrong ; ) Mostly I didn't share them and I think that it was less confusing that way? I see at one point up above I did mention how we'd be getting a new hire soon who could start working on the profiler. It turned out we never got that guy, and then when we later did hire someone else priorities had shifted a bit so he wound up working on something else.

In terms of 'why will you not support their RTM', that's largely a judgement of priorities. We thought most of our developers would prefer a sooner release without profiling than a later release with it. I see back in February I mentioned that I thought 'Future' and not 'RTM' was the right milestone but I didn't talk about why. Perhaps it would have been better if I had talked about this priority choice that is implicitly behind the milestone decision?

dotnetjt commented 8 years ago

I'd love to add a little to the conversation based on your feedback above @noahfalk (thanks for taking the time to contribute that reply).

If you look at the history of .Net all the way back to the 1.0 days, one of the big differences from then to now is the large ecosystem of tools that have been developed around the profiling API, debugger APIs, reflection, the JIT compiler, etc. Those tools make all the difference in a developer's daily life.

All that said - I run a company that has a commercial CLR Profiler. It's important to me because our customers are asking for the support (and have been for nearly 6 months). It's only logical. Developers love new bits; and have quickly deployed .Net Core to Linux - in production - before realizing that they are still capable of writing code that performs sub-par, and they need tools to help them work through those issues.

So, I somewhat agree with @gregoryyoung in that I feel an RTM should have great tooling support in order to keep the standard that has been set.

As I've mentioned earlier, we've been running our profiler (windows) against .Net core for some time and haven't found anything that doesn't work. Now that you have this doc out there, I can add to it, and if there are any specific test cases you'd like to have run, let me know.

I'm also happy to contribute to getting it working on non-windows, just let me know where to begin and provide me a crash course. I'm well versed in consuming the API but wouldn't know where to start with creating the missing pieces.

Feel free to reach out to me at jtaylor@stackify.com if you'd like.

sawilde commented 8 years ago

As the lead on OpenCover (an open source project that uses the API and is used by the .NET team) I am also frequently asked when I will support .NET core on a Linux platform, best I can tell them is to use PCL and run it on a windows VM for now; I assume that is how the .NET team is currently running OpenCover.