timholy closed this 3 years ago
@NHDaly, if we were to generate some sort of "latency report and amelioration plan" for users & developers, here are the pieces I'm envisioning in very rough terms:

- `compilation_time ∝ method_complexity * number_of_specializations`, so decreasing either factor can help. This is mostly what #152 was about.
- With `@snoopi_deep`, we can sometimes find something good to precompile even if it's not the entrance point to inference.
- `dump.c` only stashes `MethodInstance`s that either correspond to a method defined in the package or that have a backedge to the package. Identifying "breaks" in the chain of inference, where Julia uses runtime dispatch, will enable precompilation of bigger chunks of code. That's basically what #159 is about.

When you get a chance, I would love a review on this PR, and subsequently #159, so we can get the API moving towards completion.
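
As a rough illustration of the `number_of_specializations` factor in the relationship above, the sketch below times the first and second call of a small function for two argument types. It uses only Base Julia; `sumsq` and the variable names are hypothetical and not part of SnoopCompile, and the timings are machine-dependent, so no particular numbers should be expected.

```julia
# Hypothetical workload for illustration; not part of SnoopCompile.
sumsq(x) = sum(abs2, x)

ints   = [1, 2, 3]
floats = [1.0, 2.0, 3.0]

# The first call with a given argument type triggers inference and compilation
# of a new specialization; later calls with that type reuse the compiled code.
t_int_first  = @elapsed sumsq(ints)     # compiles sumsq for Vector{Int}
t_int_second = @elapsed sumsq(ints)     # already compiled
t_flt_first  = @elapsed sumsq(floats)   # new argument type: compiles again

println("Vector{Int},     first call:  ", t_int_first, " s")
println("Vector{Int},     second call: ", t_int_second, " s")
println("Vector{Float64}, first call:  ", t_flt_first, " s")
```

On a typical run the two "first call" timings should dominate, which is the per-specialization cost that the proportionality describes.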
I'm just going to start merging these, likely after taking another pass at reworking them.
I suspect this was superseded by #168, but I'll leave it here for a while at least.
But also almost certainly unnecessary! #168 has more powerful functionality. We need to add docs, but I am still feeling my way forward.
If you've not seen it, check out the new "visualizations.jl" and `runtime_inferencetime`. The aim is to get PGO, but statically rather than dynamically. There are advantages to either, but we can do the static one today...and it might actually be a better fit for the Julia mindset.
Yeah, this isn't really necessary after #168.
`module_roots` makes it easier to discover precompile-worthy `MethodInstance`s. The promise of `@snoopi_deep` is that it is not restricted to just the entrance calls to inference; consequently, we can find the most-expensive-to-infer calls in a specific module and precompile those calls.

We may want to rewrite the entire `parcel` infrastructure around this, but perhaps one step at a time. We might want to create some kind of aggregate report that combines analysis of excessive specialization, inference breaks, and an improved set of precompile statements.
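
To make the idea of an inference break and a non-entrance precompile target more concrete, here is a minimal sketch in Base Julia. The functions `double` and `double_all` are hypothetical stand-ins for package code, and the `precompile` calls are written by hand here; the list that the tooling discussed in this thread would actually generate is not shown and may look different.

```julia
# Hypothetical package-like code for illustration.
double(x) = 2x   # generic method; specialized per concrete argument type

function double_all(items::Vector{Any})
    out = Any[]
    for item in items
        # The element type is Any, so inference cannot tell which
        # specialization of `double` will run; the call is dispatched at
        # run time, and each new concrete type starts a fresh round of inference.
        push!(out, double(item))
    end
    return out
end

# Precompiling only the entry point does not cover the specializations that
# are reached through runtime dispatch, so they are listed explicitly.
ok = [
    precompile(double_all, (Vector{Any},)),
    precompile(double, (Int,)),
    precompile(double, (Float64,)),
]

result = double_all(Any[1, 2.5])
```

In this sketch `double(::Int)` and `double(::Float64)` play the role of calls that are not the entrance point of the workload but are still worth precompiling, because nothing in the inferred code for `double_all` leads to them.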