Closed NHDaly closed 3 years ago
Early pass at @snoopi per-method-instance deep inference profile:
This PR is still in draft stage, but I'm trying to clean it up now.
TODO:

- Aggregate timings across specializations of the same method (foo(::Integer), foo(::Int), etc.).
- Fix pprof itself to (at least have an option to) not modify the function names for Julia functions. Also, the current function regexes are unable to handle parentheses around the function name (e.g. (::MyStruct)(::Int)), so we definitely need a PR to fix that.
- Add the helper functions to_flamegraph() and flatten_times().
I would like to add deep snooping functionality to CompileBot.jl. I assume that all the configurations/inputs are the same as normal @snoopi, right?
> I would like to add deep snooping functionality to CompileBot.jl. I assume that all the configurations/inputs are the same as normal @snoopi, right?
I would say they're similar. The commands aspect should be the same, yeah:
https://github.com/timholy/SnoopCompile.jl/blob/1b6314962e3bb8faced5c15703e1a6fc2e1d2cf6/SnoopCompileCore/src/snoopi.jl#L123
And the tmin param can be passed to SnoopCompile.flatten_times() and SnoopCompile.to_flamegraph() instead of directly to @snoopi_deep.
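A rough sketch of what that usage might look like (the workload script is a placeholder, and the exact keyword spelling should be checked against the SnoopCompile docs; only the function names and the tmin parameter come from the discussion above):

```julia
using SnoopCompileCore   # provides @snoopi_deep
using SnoopCompile       # provides the analysis helpers

# Hypothetical workload to snoop on
tinf = @snoopi_deep include("my_workload.jl")

# Filter out frames below the time cutoff at analysis time,
# rather than passing tmin to @snoopi_deep itself
times = SnoopCompile.flatten_times(tinf; tmin = 0.001)
fg    = SnoopCompile.to_flamegraph(tinf; tmin = 0.001)
```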
I'm supportive of your TODO list. I like the idea of fixing pprof and keeping the nicer printing. The last item (aggregation) sounds like it could also be added in a second PR, if you prefer (I'm fine with either).
> I'm supportive of your TODO list. I like the idea of fixing pprof and keeping the nicer printing.
Okay! Good news, the naming weirdness is settled in pprof-land: I've asked about fixing this in google/pprof, and in the process I realized how we can work around it in PProf.jl: https://github.com/JuliaPerf/PProf.jl/pull/31
So I've updated this PR to use the MethodInstance name, and also filled out the StackFrame with more info. :)
> The last item (aggregation) sounds like it could also be added in a second PR, if you prefer (I'm fine with either).
Yeah, I like the idea of doing this in a follow-up! Good idea.
Okay so I think this is looking good to go! :) I think this is ready to review.
I've updated to "Ready to Review". Thanks again for your help!
(And I've renamed the PR now that it's ready to go!)
You could add fail-fast: false to the CI configuration so that one failing job doesn't cancel the rest of the matrix.
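For GitHub Actions, that suggestion would look roughly like this (a sketch; the job name and matrix entries are placeholders, not this repo's actual workflow):

```yaml
jobs:
  test:
    strategy:
      fail-fast: false   # keep the other matrix jobs running when one fails
      matrix:
        julia-version: ['1.5', 'nightly']
        os: [ubuntu-latest, macos-latest]
```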
Excited to see your progress, as you know I'm already benefiting from your work! I will try to review in the morning.
One thing I know I'm going to want, but not sure yet the best way to implement it, is to discover "who is calling the head of the next inference?" (head = root.children[i]). The reason I'm interested is that a break in the tree structure (starting a new head) indicates either toplevel or runtime dispatch. Sometimes there seem to be backedges but often not, so it's not entirely trivial short of inserting @show stacktrace(backtrace()) into the callee. Currently, the best strategy I've devised is to collect all the leaves of the previous children and then run findcallers on them. Any better ideas? I'm happy to implement this part myself; I don't mean to suggest you need to do it for this PR.
> I'm already benefiting from your work!
🎉
> "who is calling the head of the next inference?"
Yeah, interesting!
Starting a new top-level caller would happen (from what I understand) at runtime, after we've finished compiling and are in the middle of executing the code from (one of) the previous inferences. So yeah, they're not connected at all.
And yeah, I expect there wouldn't be backedges set during dynamic dispatch, since you don't need to recompile the caller if the callee is updated (it will still be a dynamic dispatch).
But hrmm, yeah, interesting. I guess the best way to implement something like this would indeed be to walk the stacktrace at the first call to typeinf (if the parent is ROOT), and walk up until you hit something that's not type inference, I guess? And then record that as the caller? Interesting idea.
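A toy sketch of that walk, operating on a plain list of frame names rather than real StackTraces frames (the set of "inference machinery" names here is an illustrative assumption, not Julia's actual internals):

```julia
# Names assumed to belong to the inference machinery; purely illustrative.
const INFERENCE_NAMES = Set([:typeinf, :typeinf_ext, :typeinf_edge])

# Walk up a stack (innermost frame first) and return the first frame
# that is not part of type inference -- the presumed runtime caller
# that triggered this fresh inference "head".
function first_noninference_caller(stack::Vector{Symbol})
    for name in stack
        name in INFERENCE_NAMES || return name
    end
    return nothing  # the whole stack was inference machinery
end

# Example: a stacktrace captured at the first typeinf call for a ROOT child
stack = [:typeinf, :typeinf_ext, :runtime_dispatch_site, :main]
first_noninference_caller(stack)  # -> :runtime_dispatch_site
```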
> then run findcallers on them
Oooh this is cool! I hadn't seen this before. Neat! :)
Could also be something like @snoopi deep=true (similar to @code_warntype optimize=true), but maybe they are different enough that it doesn't make sense to have them behind the same macro.
> Could also be something like @snoopi deep=true (similar to @code_warntype optimize=true), but maybe they are different enough that it doesn't make sense to have them behind the same macro.
@KristofferC - sorry I never responded to this. I had the same idea at first, but since this returns an entirely different data structure and is used with entirely different helper functions, it doesn't seem to make sense to me to put them in the same macro.
Alright, tests are passing, and I'm finished applying changes! :)
Please let me know what you think about the to_flamegraph / flamegraph naming, and then otherwise I think this is good to go! 🎸
This looks basically ready to go. The one thing I'd still recommend (but not insist on, if you disagree) is making the "public" method flamegraph rather than to_flamegraph. Here are the reasons:

- flamegraph acts almost as a constructor (there is no FlameGraph type in FlameGraphs, because it's just the root node of a tree), and we have constructors that take very diverse inputs and return the same type of object. For example, Array{Int}(undef, 3) and Array{Int}(1:3) are very different beasts indeed, yet we call them the same thing because they produce the same output type.

If you do agree that it can safely be another method of flamegraph, perhaps we should even consider having SnoopCompile re-export that name. Alternatively, we could just expect users to say using FlameGraphs.
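The constructor analogy above can be checked directly in the REPL: the two calls take very different inputs but produce the same concrete type.

```julia
a = Array{Int}(undef, 3)  # uninitialized 3-element vector
b = Array{Int}(1:3)       # vector built from a range: [1, 2, 3]

typeof(a)  # Vector{Int64} (alias for Array{Int64, 1})
typeof(b)  # also Vector{Int64} -- same output type from diverse inputs
```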
Also, PRs like this are a testament to both your own excellent work and Julia's amazing composability: if you deleted all the documentation (no, please don't!) this is a remarkably small PR given the huge gain in functionality. Pretty incredible.
Thanks @timholy for the kind words! ❤️
Agreed regarding how amazing it is that we could add such interesting functionality in such small PRs. 😊
Per your suggestion, I've changed to implement flamegraph(), and re-exported it. Thanks!!
I believe this is ready for final review!
😁 Thanks! :) I don't have merge permissions, so...... Merge at will! 😁
> I don't have merge permissions
You should have gotten an email about this earlier today... let me know if you can't find it and I'll try to resend.
Hehe oh, you're right! 😁 Thanks! :) Amazing. Really appreciate it! ✔️ 🥳
Do you have any preference regarding Merge vs Squash? I'm inclined to Squash because there were a lot of commits here.
Oh, also, I guess I should do a version bump here.
Is this a minor or patch change? (Or major?) I'm inclined to say minor. And do you think it's okay to release both @snoopl (#135) and @snoopi_deep (this PR) in the same new version?
Yes to squash, yes to minor bump, yes to bringing both new pieces of functionality out in a single release. Before registering, we might want to work on the Documenter docs a bit to advertise this functionality.
I was going to let you do the honors, @NHDaly, but I'm eager to use this. Thanks again!
Hehe, eep! I'm sorry -- I was excited to do the honors, too, but I felt like it probably needs more documentation, and then I got carried away doing some analyses now that we have these tools instead of writing the docs! 🙈
Thanks for the merge, @timholy! :) I'll try to follow up with docs soon! <3
I may be able to help too. Lots going on so it will be a bit back-burner but it's already helping, so let's at least work with it in master.
This builds on https://github.com/JuliaLang/julia/pull/37749 to expose the information through SnoopCompile.
This adds a macro to SnoopCompileCore, @snoopi_deep, which collects deep timings from Julia's type inference, and adds functions to display the resulting information as a profile.

TODO:

- InferenceFrameInfo in julia

This is a second attempt at https://github.com/timholy/SnoopCompile.jl/pull/138.