JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.43k stars 5.46k forks source link

Include sampling profiler in julia Base #2597

Closed ViralBShah closed 11 years ago

ViralBShah commented 11 years ago

Include @timholy 's sampling profiler in julia Base. This was the general consensus in #1865 and I am opening a new issue to track this.

timholy commented 11 years ago

I haven't forgotten this. There's at least one thing holding me back: I think to be really useful, sprofile_tree needs graphic visualization. I haven't implemented that yet, but it's on my TODO list.

Until we have graphics in base, I wonder if it's better left in a package.

ViralBShah commented 11 years ago

Ok. I personally would be happy with a text based profiler, and can certainly use it from the package for now. Changing this to the 1.0 milestone.

JeffBezanson commented 11 years ago

Agree; I am happy picking through obscure text output. And a simple rough summary of where time is spent can be quite valuable, even without call tree information.

ViralBShah commented 11 years ago

The sprofile_flat example shown in the SProfile README is good enough for a lot of use cases. I would feel a lot better if this was available in base. The graphical version can come later when we have visualization.

timholy commented 11 years ago

OK, I can push what's available to core in time for 0.2. Sometime later I'll experiment with graphical visualization in the package.

As far as interpreting the results of text-based output from sprofile_tree, the best trick is to store the results to a file and use a text editor's cursor to help follow the indentation. (I had to switch to single-character indentation just because the call tree is often so deeply nested, but that's made it quite hard to parse visually.)

timholy commented 11 years ago

There's a new sprofile branch. Feel free to make changes, and merge into master when you deem it ready. (Or give feedback and I can do the cleanups.)

timholy commented 11 years ago

I should have added: sprofile.c needs a timer implementation for Windows. All it needs to do is the same thing that the Darwin & Linux #if defined functions do.

ViralBShah commented 11 years ago

I was able to compile on mac. Can we call this profile instead of sprofile? The original profile would then need a new name.

ViralBShah commented 11 years ago

Can we use the libuv timers here?

vtjnash commented 11 years ago

@ViralBShah we need fully async timers, which libuv does not provide

I would be very annoyed to have this merged until windows support can be realized (since it tends to cause a bug report headache -- and windows already has enough open issues). I don't think that is feasible for 0.2.

I can probably help some there. However, brief google searching doesn't reveal the existence of any asynchronous timers, so we probably need a helper thread. But more brief google searching hasn't been very fruitful in finding a version of RtlCaptureStackBackTrace which accepts a thread PHANDLE or CONTEXT argument (although gdb is able to pretend to do it, and fail miserably at succeeding)

ViralBShah commented 11 years ago

That is fair enough. It would be great to have this working on Windows, and we can certainly wait for one more release for that.

timholy commented 11 years ago

Didn't realize that timers on Windows would be so problematic, given that it's a dozen lines on Unix. Sorry about that. I'll do some investigating myself.

If necessary this can sit and stew until 0.3.

timholy commented 11 years ago

Wow, it's shockingly hard to find a "signaling timer" on Windows that has better than several-millisecond accuracy.

I can't quite be sure, but I wonder if this might do the trick: http://msdn.microsoft.com/en-us/library/windows/desktop/ms686289(v=vs.85).aspx

But see: http://stackoverflow.com/questions/12199872/resolution-of-waitforsingleobject-with-timesetevent-setwaitabletimer The last reply about migrating between cores is interesting, but I wonder whether that would, say, reduce BLAS to using just one core?

timholy commented 11 years ago

Can we call this profile instead of sprofile? The original profile would then need a new name.

The original profile has already been renamed iprofile for "instrumenting profiler." So this wouldn't be a problem. But I'm a little hesitant because there are in fact quite a few possible types of profilers: http://en.wikipedia.org/wiki/Profiling_(computer_programming)

If, for example, we ever develop a call-graph or hypervisor profiler and want to include that in base (that would basically have to be in base), then having the sampling profiler that's simply called "profile" might be annoying.

timholy commented 11 years ago

A couple more thoughts (then I'm going to be away for a couple days with uncertain access to WiFi):

vtjnash commented 11 years ago

Those timers are possibly correct, but need to be used from a separate thread, as they are not asynchronous in the linux sense. Window's will only call the callback function when the application has gone into a sleep state. It wouldn't mess with openblas in that case to set processor affinity for that thread. But the timers aren't necessarily the critical issue -- I'm not sure how to record a backtrace from the another thread (other than doing code injection).

Announcing 0.2 has a new feature is bound to result in bug reports when people try to copy paste the example code into their repl and it doesn't work. I think they have a hard time accepting that linux might just be better.

vtjnash commented 11 years ago

edit: I think it can be accomplished with StackWalk64 being called from a timer thread

timholy commented 11 years ago

Announcing 0.2 has a new feature is bound to result in bug reports when people try to copy paste the example code into their repl and it doesn't work.

You're absolutely right, of course. Let's just wait until it can be addressed properly. The package exists for people who want to use it.

ViralBShah commented 11 years ago

@vtjnash It seems that 0.2 should be released in the next month or so. So is 0.3 the right milestone?

vtjnash commented 11 years ago

@ViralBShah it still depends on when the windows support can be added to it. also, someone should pick a real date for 0.2 and update all the future milestones appropriately.

On another tangent, anyone know how to suggest features to github? I really want to be able to assign multiple people to issues.

pao commented 11 years ago

@vtjnash Contact support for that--https://github.com/contact

vtjnash commented 11 years ago

The windows backtrace code now uses StackWalk64, so we are half way to having this. Note that StackWalk64 & friends are not thread-safe, so we just need to be careful that we don't try to profile the backtrace.

timholy commented 11 years ago

Exciting! Thanks, Jameson!

ViralBShah commented 11 years ago

Is it possible to include this in Base for 0.2, given the recent progress?

ViralBShah commented 11 years ago

@timholy Trying to gently bring the profiler into Base, I am wondering if as a start, we can at least move the C code for sampling into Base? In the early phases, it may not be compiled on Windows, but it would make installing the profiler package much easier.

timholy commented 11 years ago

I have no objections, but I will defer to @vtjnash on this. Certainly the status quo seems quite satisfactory on Linux, although I guess compilers are less common on other platforms.

vtjnash commented 11 years ago

It's probably easier for me if it is in base -- fewer places to update code for adding windows support. I just asked that it not be part of the advertised features in base until we can get windows support ready.

timholy commented 11 years ago

We can leave the Julia code in the package, that way the advertised state won't be an issue.

ViralBShah commented 11 years ago

Thanks for adding it. Mentally, it does feel like it is easier to make it work on Windows now that it is in Base. Any chance this can happen in the 0.2 release timeframe?

For the record, relevant commit is 159203b937f496485c16d2f1d14be112d3e47354

StefanKarpinski commented 11 years ago

I strongly feel that the sampling profiler should go in base and always be available. It should also be renamed to just @profile and maybe have a little API work done on it, but having it in base seems essential to me.

timholy commented 11 years ago

I was originally reluctant to name it @profile because there are so many possible types of profilers. However, at this point to me this seems like the One True Profiler, and I have difficulty seeing the need for anything else. (Well, a call-graph analyzer would be nice, essentially @which on steroids.)

So I agree that it should probably be renamed. As for merging the Julia code into base, I'm sensitive to @vtjnash's concern that there's still work to be done on the Windows side (high-resolution timers are clearly a major pain on Windows). And since he's such a genius helping make Julia better in so many ways, his word is like, gospel, to me :-).

StefanKarpinski commented 11 years ago

Well, @vtjnash's only objection was announcing this as a feature that's supported on Windows. In fact, he feels it would be easier for him to get the profiler working on Windows if it's part of Base. So I think we should include it and just be very clear that it doesn't yet work on Windows.

ViralBShah commented 11 years ago

That is a useful point. We can merge the sampling profiler into base, but not document it or advertise it until it is ready. It is basically a new feature under development in master, and can be announced when complete.

Let's just call it @profile.

quinnj commented 11 years ago

I'd be happy to help test/work on the 64-bit Windows side if needed. This has been at the top of my wish list as of late :)

timholy commented 11 years ago

Maybe I'm misinterpreting, but I think he's wanting it not to be an advertised feature of any base until the Windows support is ready. They've already worked hard to close a lot of gaps, and this just moves the finish line. So even though I want it in base too, I'm quite sympathetic. It's pretty easy to insert "using Profile" into your .juliarc.jl. (I'm not advocating this as a long-term solution, of course.)

One thing I don't see a reason to wait on, however: @StefanKarpinski, I'd be happy to start working on API changes in preparation for a merge to base. Care to file issues at Profile.jl, so I know what you have in mind? Then it will be ready whenever we decide to pull the trigger.

@karbarcca, certainly any help getting it implemented on Windows is the cleanest solution.

StefanKarpinski commented 11 years ago

Let's just move it into Base and make it clear that it doesn't yet work on Windows. Not having it in Base doesn't make it work on Windows any more. Let's do API changes after that.

vtjnash commented 11 years ago

I'll leave it to @timholy to fix the implementation in the Profile package:

ViralBShah commented 11 years ago

Wow this landed much faster than expected. We can now move the sampling profiler to base!

ViralBShah commented 11 years ago

I am marking this as milestone 0.2, since it may actually be possible. We can move it further out if necessary.

vtjnash commented 11 years ago

I'm not sure it should be 0.2, since we really should be tagging that immediately, if Pkg2 is ready.

ViralBShah commented 11 years ago

This perhaps needs to be discussed elsewhere, but we should ideally go through a couple of release candidates, so that we do not have to scamper around and release updates soon after.

timholy commented 11 years ago

Many thanks, @vtjnash, for writing the Windows timers.

The move of the Julia code is done. Given the question about whether this is arriving in time for 0.2, I pushed it to the new "profile" branch. (I'll delete the sprofile branch shortly.)

Quite a lot of API cleanup, too. See 39cd8dc0dc02a66dcf3c5941f0c7fbff065333e9. @StefanKarpinski, please do let me know of any other issues.

ViralBShah commented 11 years ago

The thing is that until this is in master, it is unlikely to get tested out widely, especially on Windows. Perhaps that only happens after the 0.2 release.

timholy commented 11 years ago

I'm not really in the loop about the plans for the 0.2 release; the Boston/IRC crowd probably knows more. So somebody else had better make a decision about this, sooner better than later.

At least outside of base, it is pretty well-tested (on Unix). Since there are still one or two more things left in Pkg2, I personally think this should be able to make it in. But I won't merge profile->master until someone tells me to.

timholy commented 11 years ago

Merged. With docs, too!

quinnj commented 11 years ago

Awesome! Many thanks to @timholy and @vtjnash. I just gave it a test run on windows 64-bit and it's running great.

Quick question I didn't see mentioned in the docs: If I get a result like the following, what does the "+1" and "+7" mean in the bottom two backtraces?

julia> Profile.print()
4    ...me.jl\src\Datetime.jl; setoffset; line: 120
2    profile.jl; anonymous; line: 9
3    profile.jl; anonymous; line: 14
 662  ...me.jl\src\Datetime.jl; datetime; line: 132
       541 ....jl\src\Datetime.jl; setoffset; line: 120
  10   ...me.jl\src\Datetime.jl; datetime; line: 132
        9 ....jl\src\Datetime.jl; setoffset; line: 120
   1    stream.jl; process_events; line: 481
                2526 none; test; line: 5
                       1   array.jl; splice!; line: 831
                       1   inference.jl; inlineable; line: 1642
                                     +1 3   ...ime.jl; datetime; line: 132
                                     +7 2 ...time.jl; setoffset; line: 120
vtjnash commented 11 years ago

Cool. I like when stuff actually works on windows, without first needing testing and modification (I only wrote it against wine32).

timholy commented 11 years ago

Great to hear! Nice work, Jameson!

If I get a result like the following, what does the "+1" and "+7" mean in the bottom two backtraces?

It's briefly described in the section on the cols option. Looks like I should expand that a bit.

quinnj commented 11 years ago

Ah, got it. I think a quick code snippet showing what you were talking about is all that's needed. Reading through the first time, it didn't click and scanning the code examples I didn't see any examples of that. Great work though, I love having this functionality in Base.