Closed: NHDaly closed this 3 years ago
I'm quite sure that I haven't gone about this in the best way. I'd love any suggestions or help if you are interested!! :) ❤️
Hopefully this will prove useful for others as well!
Awesome that you're doing this. If I don't notice when you turn this into a real PR, just ping me.
Thank you @timholy!!
After some more thinking together with @sacha0, we think that the approach we'd taken here is probably not the best way to go about measuring this.
Our next idea is to try something a bit more like the other approaches, and just find a better place to inject the timings. The reason I didn't think timings would work when I first looked at this is that each function's time includes the time of every function it calls into, so flat timings aren't directly useful.
But then we realized that if we could report the timings in a nested structure, like a profile, they'd make sense! And then I remembered this is exactly what https://github.com/KristofferC/TimerOutputs.jl already does! So I think either duplicating that, or literally just using it, would get us what we want. As a first quick test, I slapped TimerOutputs inside the compiler, and on this small example the output seems reasonable:
julia> module M
i(x) = x+5
i2(x) = x+2
h(a::Array) = i2(a[1]::Integer) + i(a[1]::Integer) + 2
g(y::Integer, x) = h(Any[y]) + Int(x)
end;
julia> using TimerOutputs

julia> _to_ = TimerOutputs.TimerOutput();
julia> @code_typed M.g(2,3);  # measure inference time
julia> _to_
 ─────────────────────────────────────────────────────────────────────────────────────────────────
                                                             Time                   Allocations
                                                     ──────────────────────   ───────────────────────
                  Tot / % measured:                       9.65s / 0.02%           887KiB / 30.7%

 Section                                     ncalls     time   %tot      avg     alloc   %tot      avg
 ─────────────────────────────────────────────────────────────────────────────────────────────────
 MethodInstance for g(::Int64, ::Int64)           1   2.16ms   100%   2.16ms    273KiB   100%   273KiB
 MethodInstance for h(::Vector{Any})              1    766μs  35.5%    766μs    187KiB  68.7%   187KiB
 MethodInstance for i2(::Integer)                 1    236μs  10.9%    236μs   42.6KiB  15.6%  42.6KiB
 MethodInstance for i(::Integer)                  1    178μs  8.23%    178μs   42.6KiB  15.6%  42.6KiB
 ─────────────────────────────────────────────────────────────────────────────────────────────────
So I'll give an approach like that a go next!!
(I've filed an issue in TimerOutputs.jl to see about using it to generate what are essentially profiles, like the one I want above: https://github.com/KristofferC/TimerOutputs.jl/issues/90)
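For anyone unfamiliar with TimerOutputs.jl, here's a minimal standalone sketch (the function names are made up for illustration) of how nested @timeit sections produce exactly this kind of nested, profile-like report:

using TimerOutputs

const to = TimerOutput()

# Hypothetical functions, just to show how nesting works.
function outer(x)
    @timeit to "outer" begin
        return inner(x) + 1   # "inner" is reported nested under "outer"
    end
end

function inner(x)
    @timeit to "inner" begin
        return x * 2
    end
end

outer(3)
print_timer(to)   # prints "outer" with "inner" indented beneath it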
(FWIW, here's the git diff for what I was demoing above. I'm trying it now on our codebase to see if it reports anything useful.)
diff --git a/base/compiler/typeinfer.jl b/base/compiler/typeinfer.jl
index a0abd97276..a8bec821b5 100644
--- a/base/compiler/typeinfer.jl
+++ b/base/compiler/typeinfer.jl
@@ -9,6 +9,11 @@ function typeinf(interp::AbstractInterpreter, result::InferenceResult, cached::B
end
function typeinf(interp::AbstractInterpreter, frame::InferenceState)
+    Main.TimerOutputs.@timeit Main._to_ Main.string(frame.linfo) begin
+        _typeinf(interp, frame)
+    end
+end
+function _typeinf(interp::AbstractInterpreter, frame::InferenceState)
typeinf_nocycle(interp, frame) || return false # frame is now part of a higher cycle
# with no active ip's, frame is done
frames = frame.callers_in_cycle
(The above of course wouldn't compile as-is, but I'm eval'ing it into a running session after the fact. If this approach works, I'll likely copy what you did for @snoopi, and add a ccall to swap out the infer function for one that's timed.)
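For reference, the swap mechanism I have in mind is the jl_set_typeinf_func hook that @snoopi already relies on. A rough sketch of that pattern (the wrapper name typeinf_ext_timed is hypothetical, and the exact entry-point signature varies across Julia versions):

# Global store for (seconds, MethodInstance) pairs collected during inference.
const __inf_timings = Tuple{Float64,Core.MethodInstance}[]

function typeinf_ext_timed(linfo::Core.MethodInstance, world::UInt)
    t0 = time_ns()
    ret = Core.Compiler.typeinf_ext_toplevel(linfo, world)
    push!(__inf_timings, ((time_ns() - t0) / 1e9, linfo))
    return ret
end

# Point the runtime at the timed wrapper instead of the default entry point.
ccall(:jl_set_typeinf_func, Cvoid, (Any,), typeinf_ext_timed)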
Cool. Not sure if you know this, but Revise.track(Core.Compiler) may make this easier to test.
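For anyone following along, that tip amounts to:

using Revise
Revise.track(Core.Compiler)   # edits to base/compiler/*.jl now take effect in the running session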
Thanks @timholy -- I didn't know that, and I appreciate the tip!
Builds on https://github.com/JuliaLang/julia/pull/37535 to expose this information through SnoopCompileCore.
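For a rough sense of the user-facing side, here's a sketch based on SnoopCompileCore's existing @snoopi macro; the nested variant discussed in this thread would return a tree of timings rather than the flat list shown here, and the workload below is arbitrary:

using SnoopCompileCore

inf_timings = @snoopi tmin=0.001 begin
    sum(rand(10))   # any workload that triggers fresh type inference
end
# inf_timings is a Vector of (inclusive inference time in seconds, MethodInstance) pairs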