dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Profiling API #4033

Closed gregoryyoung closed 4 years ago

gregoryyoung commented 9 years ago

Is there any timeframe on a profiling API? Are there any public design docs available at this point?

Greg

noahfalk commented 8 years ago

Couldn't sleep so time to do a little late night work instead : )

https://github.com/Microsoft/clr-samples now houses our first public xplat profiler sample (huge thanks to @mjsabby for all his help here!). Hopefully this sample hits a few birds with the same stone:

  1. It demonstrates a viable profiler on the current CoreCLR bits that works for Windows, Linux and Mac.
  2. ELT is one of the few APIs we know doesn't currently work, and this sample provides a potential alternative solution for it
  3. It also demonstrates that IL instrumentation works, and gives some additional confidence that tools like OpenCover would be likely to work.

@gregoryyoung - Could you take a look at the sample and let us know if that handles the 'HelloWorld' part of your concerns?

@dotnetjt - Thanks for hopping in! Certainly I agree with you that .Net devs love their tools and releasing the runtime now while saying that a variety of tool support will follow later has been met with mixed opinions. I'll definitely bubble up your feedback to other folks on the team.

In terms of helping out, we'd be glad to have your assistance! @kspawa has started running the profiling dev work day to day and would be the best person to coordinate with, but I think our top priority right now is getting some automated testing in place that works xplat. As you saw, ad-hoc exploration suggests most of the APIs are already working, but the key element that prevents us from saying they are supported and adding them to the 'Microsoft tested' section of that support page is a lack of automated xplat tests.

@sawilde - Yeah, I think the BCL guys are using OpenCover on Windows. If you are willing, I'd say take a look at the sample and if you think OpenCover fits roughly in that mold give porting it a shot. I wouldn't be shocked if the profiler APIs largely just work as long as you aren't using ELT, and you stay away from instrumenting framework binaries when ReadyToRun is enabled. If it still looks too risky let us know what would help you get started.

sawilde commented 8 years ago

APIs largely just work as long as you aren't using ELT

Actually we do, the track by test feature uses ELT but I am not sure how many people use that feature.

and you stay away from instrumenting framework binaries when ReadyToRun is enabled.

I'll have to enquire more about that - OpenCover only instruments binaries that a) match a filter and b) we have a PDB file for. We also add cuckoos (methods in another framework class) that we use to get round some security things (and ahem Silverlight support). We have a switch to turn that off, though doing so may limit the effectiveness of the coverage.

gregoryyoung commented 8 years ago

@noahfalk the samples, and the fact that they have at least been run on OS X and Linux, alleviate most of my concerns. Thanks.

dotnetjt commented 8 years ago

@noahfalk thank you. I did run into a bug with this xplat profiler already, but it seems to serve as a good template. I'll address the bug over there, and thank you! I'll also reach out to @kspawa about contributing.

mjsabby commented 8 years ago

@sawilde you can imagine your track by test feature using the simulated ELT as shown in the sample, or you make the cuckoos a multi-purpose concept ... at least for the time being.

My personal preference is to not use ELT in production scenarios to begin with, but if I absolutely need to, I tend to add enter/leave methods in the rewritten CIL.

The main benefit in my opinion is portability: no assembly stubs, just a C++11 compiler. As CoreCLR gets ported to a new platform you don't have to write or maintain assembly code in your profiler and can be more confident your profiler works. Furthermore, you don't need to wait for porting of the ELT functionality that requires JIT changes, etc.

The drawback is slightly slower perf: memory needs to be allocated for IL of the methods rewritten, and the calling convention of regular ELT doesn't require saving caller-saved registers. Plus it is possible some debugging is impacted if you don't update the IL maps.
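
As a rough illustration of the "enter/leave methods in the rewritten CIL" idea, here is a minimal sketch of what the native side of such hooks could look like. All names (MethodId, EnterHook, LeaveHook, CurrentDepth) are hypothetical and not taken from the clr-samples profiler; the sketch assumes the profiler has already rewritten each method's IL to call EnterHook on entry and LeaveHook before each return, and it does not show the IL rewriting itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical identifier the profiler assigns to each instrumented method.
using MethodId = std::uintptr_t;

// Per-thread shadow stack of instrumented methods currently executing.
static thread_local std::vector<MethodId> g_shadowStack;

// Hypothetical hook that rewritten IL would call at the start of a method.
// It is an ordinary C++ function, so no assembly stub is needed.
extern "C" void EnterHook(MethodId id) {
    g_shadowStack.push_back(id);
}

// Hypothetical hook that rewritten IL would call before each return.
// Pops only when the top of the shadow stack matches, so a stray call
// cannot corrupt the bookkeeping.
extern "C" void LeaveHook(MethodId id) {
    if (!g_shadowStack.empty() && g_shadowStack.back() == id) {
        g_shadowStack.pop_back();
    }
}

// Current call depth on this thread, as observed by the hooks.
extern "C" std::size_t CurrentDepth() {
    return g_shadowStack.size();
}
```

Because the hooks are plain functions with a normal calling convention, the same source should build on any platform with a C++11 compiler, which is the portability argument made above. The cost is the extra call and the rewritten IL for every instrumented method.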

noahfalk commented 8 years ago

@gregoryyoung - Great, I'm glad we could help you out!

@dotnetjt - Thanks for checking it out! Any issues you find or PRs you want to submit are appreciated.

@sawilde - In regards to ReadyToRun, if you do want to instrument inside the framework assemblies you should be able to work around this issue in the short term by setting the environment variable COMPLUS_ReadyToRun=0. This will disable the feature in the runtime. The application is probably going to start noticeably slower because the JIT will need to run for the entire framework, but hopefully code coverage scenarios aren't overly perf sensitive.
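
For a tool that launches the target process itself, the workaround could be applied programmatically. The following is a minimal sketch, assuming a POSIX system (setenv) and a hypothetical launcher function; only the variable name and value, COMPLUS_ReadyToRun=0, come from the comment above.

```cpp
#include <stdlib.h>  // setenv (POSIX)
#include <cstdlib>   // std::system, std::getenv
#include <string>

// Sets COMPLUS_ReadyToRun=0 in the current process environment so that any
// process started from here afterwards inherits it. Returns true on success.
// Assumption: POSIX setenv is available; Windows would need a different call.
bool DisableReadyToRun() {
    return setenv("COMPLUS_ReadyToRun", "0", /*overwrite=*/1) == 0;
}

// Hypothetical launcher: disables ReadyToRun, then runs the given command
// line (for example "dotnet MyApp.dll") through the shell. Returns the raw
// status from std::system, or -1 if the variable could not be set.
int RunWithReadyToRunDisabled(const std::string& commandLine) {
    if (!DisableReadyToRun()) {
        return -1;
    }
    return std::system(commandLine.c_str());
}
```

Setting the variable in the parent before launching keeps the change scoped to the profiled run instead of the whole machine.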

For the cuckoos I peeked at the OpenCover source and if I followed correctly it looks like you are defining new methods on pre-existing framework types that you will call to. If that is all you do it might work correctly even with ReadyToRun on, but I don't have a definitive answer for you at this point. In the short term you could either test it or turn ReadyToRun off to stay conservative. As we flesh out the profiler support we'll investigate more deeply the implications of ReadyToRun and determine specifically what will and won't work. Also .Net Core shouldn't be enforcing the Silverlight security transparency checks so the cuckoos may not be necessary at all, but I understand you've got an existing codebase, you probably still want to support Silverlight, and handling everything the same way helps keep your codebase complexity down.

jkotas commented 8 years ago

the short term by setting the environment variable COMPLUS_ReadyToRun=0

Or return COR_PRF_DISABLE_ALL_NGEN_IMAGES from the profiler (it requires a fix that I have made earlier today). Profilers should treat ReadyToRun images same way as they treat NGen images on full framework.
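
In practice a profiler requests this flag through the event mask it passes to ICorProfilerInfo::SetEventMask, typically from its Initialize callback. The sketch below shows only the mask arithmetic. To stay self-contained it uses local stand-ins instead of corprof.h: the constants are assumed to mirror the header's values, and FakeProfilerInfo is a hypothetical substitute for the real COM interface. A real profiler would include corprof.h and use its definitions directly.

```cpp
#include <cstdint>

// Stand-ins so the sketch compiles without corprof.h. The values are
// assumed to mirror the COR_PRF_MONITOR enum; prefer the real header.
constexpr std::uint32_t kMonitorJitCompilation = 0x00000020u; // COR_PRF_MONITOR_JIT_COMPILATION
constexpr std::uint32_t kDisableAllNGenImages  = 0x80000000u; // COR_PRF_DISABLE_ALL_NGEN_IMAGES

// Hypothetical substitute for ICorProfilerInfo that records the mask.
struct FakeProfilerInfo {
    std::uint32_t eventMask = 0;
    // Returns 0, standing in for S_OK.
    int SetEventMask(std::uint32_t mask) {
        eventMask = mask;
        return 0;
    }
};

// Roughly what a profiler's Initialize callback might do: ask for JIT
// events and tell the runtime not to use precompiled images, so every
// method goes through the JIT and can be instrumented.
template <typename ProfilerInfo>
int RequestJitEventsWithoutPrecompiledImages(ProfilerInfo& info) {
    return info.SetEventMask(kMonitorJitCompilation | kDisableAllNGenImages);
}
```

Compared with the environment variable, this keeps the decision inside the profiler, so users do not have to configure anything extra when the profiler is attached at startup.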

gregoryyoung commented 8 years ago

I just want to ask as a side question how 32 bit vs 64 bit will work with RTM. Will they use the same environment variables etc?


dotnetjt commented 8 years ago

Looks like https://github.com/dotnet/coreclr/issues/601 solved it

gregoryyoung commented 8 years ago

@dotnetjt 404

ghost commented 8 years ago

@gregoryyoung dotnet/coreclr#601

twgraham commented 7 years ago

Is anyone able to give an update on the status of this issue? @kspawa I notice that the status doc hasn't been updated for a while.

lt72 commented 7 years ago

We anticipate that .Net Core profiling API will work xplat by mid-Fall. I cannot offer a more precise deadline at this time though.

discostu105 commented 7 years ago

@lt72 Are there any known shortcomings of the profiling API on Linux so far? Or is it just untested?

noahfalk commented 7 years ago

PR dotnet/coreclr#7719 covers some known ELT work that is Linux-specific. PR dotnet/coreclr#9298 handles some ReJIT-related issues with the ReadyToRun format that appear to be OS-agnostic. Other than those two known areas, I think the rest is a lack of testing.

lt72 commented 7 years ago

Mostly just untested; David is already working on completing the testing.

ViktorHofer commented 7 years ago

We anticipate that .Net Core profiling API will work xplat by mid-Fall. I cannot offer a more precise deadline at this time though.

Sounds like we should move it to a future milestone?

noahfalk commented 7 years ago

Just a PSA since I know this thread has many people interested in profiler APIs, my recent PR dotnet/coreclr#12193 has some profiling changes in it.

Is there interest in creating an informal group where profiler-relevant changes can be announced and discussed in the future? I'm not that wise to the ways of GitHub, but that seems like something we should be able to do if there is interest.

rfrancisco commented 7 years ago

Can you give us a status update of this?

lt72 commented 7 years ago

Profiler API has now been tested on Linux and Windows for x64/x86. ARM32 testing is in progress and going well, with a few known areas to be completed that are currently being worked on. For tracking ARM32 progress, see dotnet/coreclr#14526, dotnet/coreclr#13992, and dotnet/coreclr#13993.

noahfalk commented 7 years ago

FYI I've got a PR out to update the status page: dotnet/coreclr#14644

Separately, I haven't found a good way to create a group that could be used in @ mentions unless all the people belong to the same organization, but it does seem like there was interest in the general idea above. Does anyone have a good suggestion about how to construct something? Currently my best thought would be to create an issue that is dedicated to profiler related announcements and anyone interested could follow it. On the page we could add links to relevant issues and PRs as they appear. The sole purpose of the page would be to generate a notification to followers, all the actual discussion would occur within the issues that were linked. Is that a weird workaround for not being able to group @ mention? Good?

jaredcnance commented 7 years ago

Makes sense. If the goal is an announcements page, I would recommend locking the issue to prevent unofficial conversation. It's similar to the main announcements repo: https://github.com/dotnet/announcements

which may even be a better place for it, I don't know.

lt72 commented 6 years ago

It seems we need to create a dedicated documentation page in CoreCLR repo.

patricksuo commented 6 years ago

@lt72

Profiler API has now been tested on Linux and Windows for x64/x86. ARM32 testing is in progress and going well, with a few known areas to be completed that are currently being worked on. For tracking ARM32 progress, see dotnet/coreclr#14526, dotnet/coreclr#13992, and dotnet/coreclr#13993.

Could you point me to a minimal Linux x64 profiler, or to your test program (for Linux x64)?

Is this profiler still compatible with the latest .NET Core CLR?

patricksuo commented 6 years ago

We ran this profiler with dotnet daily build 2.1.0-preview1-25907-02 and received the following intermittent segmentation fault:

(lldb) register read
General Purpose Registers:
       rax = 0x0000000000000003
       rbx = 0x000000000093db50
       rcx = 0x0000000000000000
       rdx = 0x0000000000940390
       rdi = 0x0000000000000000
       rsi = 0x0000000000940390
       rbp = 0x00007fffef072310
       rsp = 0x00007fffef0722e0
        r8 = 0x0000000000940390
        r9 = 0x00007fff480cc950
       r10 = 0x0000000000000000
       r11 = 0x00007ffff6f5cf10
       r12 = 0x00007fff7d6cf548
       r13 = 0x0000000000674ab0
       r14 = 0x0000000000000001
       r15 = 0x0000000000000000
       rip = 0x00007ffff621f99d  libcoreclr.so`EEToProfInterfaceImpl::JITCompilationFinished(unsigned long, int, int) + 93
    rflags = 0x0000000000010206
        cs = 0x0000000000000033
        fs = 0x0000000000000000
        gs = 0x0000000000000000
        ss = 0x000000000000002b
        ds = 0x0000000000000000
        es = 0x0000000000000000
* thread #8: tid = 6263, 0x00007ffff621f99d libcoreclr.so`EEToProfInterfaceImpl::JITCompilationFinished(unsigned long, int, int) + 93, name = 'dotnet', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
  * frame #0: 0x00007ffff621f99d libcoreclr.so`EEToProfInterfaceImpl::JITCompilationFinished(unsigned long, int, int) + 93
    frame #1: 0x00007ffff63574d6 libcoreclr.so`MethodDesc::JitCompileCodeLockedEventWrapper(PrepareCodeConfig*, ListLockEntryBase<NativeCodeVersion>*) + 710
    frame #2: 0x00007ffff6356c90 libcoreclr.so`MethodDesc::JitCompileCode(PrepareCodeConfig*) + 416
    frame #3: 0x00007ffff6356a22 libcoreclr.so`MethodDesc::PrepareILBasedCode(PrepareCodeConfig*) + 162
    frame #4: 0x00007ffff6358ed1 libcoreclr.so`MethodDesc::DoPrestub(MethodTable*) + 993
    frame #5: 0x00007ffff635888d libcoreclr.so`PreStubWorker + 445
    frame #6: 0x00007ffff62da3c4 libcoreclr.so`ThePreStub + 92
    frame #7: 0x00007fff8061478b
    frame #8: 0x00007fff8061446f
    frame #9: 0x00007fff7f2ee0b1
    frame #10: 0x00007fff80613141
    frame #11: 0x00007fff80612fd8
    frame #12: 0x00007fff80612ca3
    frame #13: 0x00007fff7ddb097d
    frame #14: 0x00007fff806125ea
    frame #15: 0x00007ffff62d96df libcoreclr.so`CallDescrWorkerInternal + 124
    frame #16: 0x00007ffff61fce92 libcoreclr.so`DispatchCallSimple(unsigned long*, unsigned int, unsigned long, unsigned int) + 242
    frame #17: 0x00007ffff637a009 libcoreclr.so`RegisterWaitForSingleObjectCallback_Worker(void*) + 201
    frame #18: 0x00007ffff61cf19d libcoreclr.so`ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) + 413
    frame #19: 0x00007ffff61cf900 libcoreclr.so`ManagedThreadBase::ThreadPool(ADID, void (*)(void*), void*) + 64
    frame #20: 0x00007ffff6379e6a libcoreclr.so`RegisterWaitForSingleObjectCallback(void*, unsigned char) + 266
    frame #21: 0x00007ffff61eceb4 libcoreclr.so`ThreadpoolMgr::AsyncCallbackCompletion(void*) + 356
    frame #22: 0x00007ffff635e756 libcoreclr.so`UnManagedPerAppDomainTPCount::DispatchWorkItem(bool*, bool*) + 470
    frame #23: 0x00007ffff61edcfb libcoreclr.so`ThreadpoolMgr::WorkerThreadStart(void*) + 1211
    frame #24: 0x00007ffff6567652 libcoreclr.so`CorUnix::CPalThread::ThreadEntry(void*) + 306
    frame #25: 0x00007ffff79c4dc5 libpthread.so.0`start_thread + 197
    frame #26: 0x00007ffff6ed273d libc.so.6`__clone + 109
(lldb) sos ClrStack
OS Thread Id: 0x1877 (8)
        Child SP               IP Call Site
00007FFFEF072550 00007ffff621f99d [PrestubMethodFrame: 00007fffef072550] System.Threading.WaitHandle.Dispose(Boolean)
00007FFFEF0726A0 00007FFF8061478B System.Diagnostics.ProcessWaitHandle.Dispose(Boolean)
00007FFFEF072720 00007FFF8061446F System.Threading.WaitHandle.Dispose()
00007FFFEF0727A0 00007FFF7F2EE0B1 System.Diagnostics.Process.StopWatchingForExit()
00007FFFEF072830 00007FFF80613141 System.Diagnostics.Process.CompletionCallback(System.Object, Boolean)
00007FFFEF0728B0 00007FFF80612FD8 System.Threading._ThreadPoolWaitOrTimerCallback.WaitOrTimerCallback_Context(System.Object, Boolean)
00007FFFEF072930 00007FFF80612CA3 System.Threading._ThreadPoolWaitOrTimerCallback.WaitOrTimerCallback_Context_f(System.Object)
00007FFFEF0729B0 00007FFF7DDB097D System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
00007FFFEF072A60 00007FFF806125EA System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(System.Object, Boolean)
00007FFFEF072BA0 00007ffff62d96df [GCFrame: 00007fffef072ba0]
00007FFFEF072C80 00007ffff62d96df [DebuggerU2MCatchHandlerFrame: 00007fffef072c80]

patricksuo commented 6 years ago

cc @noahfalk. Maybe you can give me some suggestions.

patricksuo commented 6 years ago

I get the segfault with dotnet run --no-build, but thousands of runs of dotnet bin/Debug/netcoreapp2.1/Foooo.dll completed without a problem.

noahfalk commented 6 years ago

@sillyousu - I opened a separate issue so we avoid enlarging this already very meandering thread.

noahfalk commented 6 years ago

I think the original issues that opened this thread have been resolved so I'm going to close it. If that is not the case we should probably open a more specific issue on any outstanding questions given how much this thread has meandered.

Above I mentioned the idea of creating a profiler group, I think a number of you liked that idea, and the best I've been able to find is creating an issue similar to the announcement repo. I spent a while chatting with a few of our PMs about whether I should actually use the announcement repo but ultimately came to the conclusion that it wasn't the best choice. Expectations there are a bit more formal and most of the people watching it probably don't care about the low-level stuff that a profiler writer is interested in. Instead I created issue dotnet/coreclr#15136 and anyone who is interested is welcome to follow it. I hope this works out, but if you have a better suggestion I'm always open to hearing it. Thanks!