timholy closed this 3 years ago
FYI @tlienart, this reduces the number of inference triggers from 73 to 53.
Thanks @timholy! For the sake of posterity (and my education), could you outline what you ran to find these particular ones? (Just standard MethodAnalysis/SnoopCompile stuff I assume?)
Yep, from FranklinParser's test directory it was basically
```julia
using SnoopCompile
tinf = @snoopi_deep include("runtests.jl")
itrigs = inference_triggers(tinf)
mtrigs = accumulate_by_source(Method, itrigs)
```
and then looking at the main "offenders."
Not all inference triggers are bad, but this at least lets you know what you've got. The ones I fixed were pretty trivially fixable without forcing any kind of redesign. There might be others in the same category (I didn't do this analysis on CommonMark's test suite). Others might be a more delicate design decision. Your "types-as-tags" design contributes to a lot of inference failures, but it might be worth it if it keeps your code well organized. I find that https://timholy.github.io/SnoopCompile.jl/stable/pgdsgui/ is probably the most useful tool for thinking about the tradeoffs.
Anyway, the tools are there to help you collect the data and think about the tradeoffs among code style, runtime performance, and compiler latency. You can also explicitly precompile some methods to reduce latency without forcing any kind of redesign, but the big wins usually come from a deeper analysis of the tradeoff involved in specialization. How much analysis you need depends on your goals and how much latency is a problem. Not all cases are like this one.
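As a hedged sketch of the explicit-precompilation route mentioned above (the module and method names here are hypothetical, not FranklinParser's actual API):

```julia
module MyParser

# Hypothetical method standing in for a real parser entry point.
parse_block(s::AbstractString, depth::Int) = (s, depth)

# `precompile(f, argtypes)` asks Julia to compile this specialization
# during package precompilation, so the first real call doesn't pay
# the inference + codegen cost. It reduces latency without requiring
# any redesign of the code itself.
precompile(parse_block, (String, Int))

end # module
```

Precompile directives like this are typically collected at the bottom of the package's top-level file, after all method definitions.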
Thanks for the details Tim.
> Your "types-as-tags" design contributes to a lot of inference failures, but it might be worth it if it keeps your code well organized.
Yup, I always figured it would probably do that 😆 it does make things organised though, but I'll have a play around with PGDS at some point and see if there's some low-hanging fruit anywhere.
Brief extra: I quickly ran `pgdsgui` on your test suite (not really recommended, since test suites aren't typically designed to mimic real-world runtime performance demands). Most of the runtime-expensive methods have a red ring around them, which means they spend a lot of time just looking up which method to call due to type instability. Again, you wouldn't want to use the test suite to make an important design decision about this, but if that result extends to real workloads, then the best guess is that you might gain both latency reduction and runtime performance improvements by despecializing. It might be as easy as adding `@nospecialize` to some of your arguments if you don't actually need to dispatch on them, or it might require thinking about whether you really need the type system. You can also use manual union-splitting or type annotations.
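To make the `@nospecialize` suggestion concrete, here is a minimal sketch of a hypothetical "types-as-tags" setup (the type and function names are illustrative, not from FranklinParser):

```julia
# Many small tag types flowing through one processing function.
# Without @nospecialize, Julia compiles a fresh specialization of
# `describe` for every concrete tag type it encounters.
abstract type AbstractBlock end
struct Heading <: AbstractBlock end
struct Paragraph <: AbstractBlock end

# @nospecialize asks the compiler to reuse a single compiled method
# body for all argument types, trading a little runtime dispatch for
# lower compile-time latency. This only pays off when the body doesn't
# need to dispatch on the argument's concrete type.
function describe(@nospecialize(block::AbstractBlock))
    return string(typeof(block))
end
```

Whether this helps depends on whether the extra runtime dispatch is cheaper than the inference and compilation it avoids, which is exactly the tradeoff `pgdsgui` is meant to visualize.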
This makes `rules` a concrete type, allowing things like `length(rules)` to become inferrable. Of course the elements are still not inferrable, but just knowing the container type gets rid of quite a few inference triggers from running FranklinParser's tests.

This also changes `priorities` to be an `IdDict` rather than a `Dict`. An `IdDict` is a better choice when each key is of a different type, because it uses a different strategy for hashing and heavily uses `@nospecialize`.

These changes shave about 0.8s off the time to run your tests.
CC @tlienart
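A minimal sketch of the two changes described above, with illustrative names rather than FranklinParser's actual data structures:

```julia
# Hypothetical rule types standing in for the package's tag types.
abstract type AbstractRule end
struct BoldRule <: AbstractRule end
struct ItalicRule <: AbstractRule end

# Vector{AbstractRule} is itself a concrete container type, so calls
# like length(rules) and the iteration protocol become inferrable,
# even though each element's concrete type is still unknown.
rules = AbstractRule[BoldRule(), ItalicRule()]

# IdDict hashes keys by object identity rather than by calling
# hash/isequal on each key, which avoids specializing those methods
# for every key type — a better fit when each key is a distinct type.
priorities = IdDict{DataType,Int}(BoldRule => 1, ItalicRule => 2)
```

With an abstractly-typed container such as `Vector{Any}`, by contrast, even `length(rules)` requires a dynamic lookup.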