aviatesk / JET.jl

An experimental code analyzer for Julia. No need for additional type annotations.
https://aviatesk.github.io/JET.jl/dev/
MIT License
731 stars 29 forks source link

Proposal: Automatically test generic methods with concrete types #383

Open jakobnissen opened 2 years ago

jakobnissen commented 2 years ago

I have an idea on how to make JET easier to use, which is sort of nonstandard but might be very useful.

A major drawback of JET at the moment is that it mostly needs concrete call sites in order to pick a method instance with concrete types to analyze among the many possible method instances which a generic method makes possible. This means that when analyzing a code base from definitions, the large majority of methods might not get analyzed.

However, in practice, I find that some concrete types dominate: Int is the prototypical Integer, String the prototypical AbstractString, Vector the prototypical AbstractVector etc.

So my idea is: When analyzing a method like foo(x::AbstractString, y::AbstractVector{<:Integer}), JET could test out foo(x::String, y::Vector{Int}). After all, the signature of foo promises that this set of concrete types should work! Of course it won't catch all potential bugs, but it would be tremendously useful.

Now, how does one get a concrete type given an abstract type? JET could have a hardcoded list of "common examples". When encountering a new abstract type AbstractFoo, it could check subtypes(AbstractFoo) and pick a concrete method.

jakobnissen commented 1 year ago

I tried to implement a PR for this and it and it was a total nightmare to figure out generic types for nested parametric types with complicated type bounds, but probably you can do a better job.

I guess one "solution" is just to not cover the hard cases and only test functions where all arguments are fairly straightforward.

timholy commented 1 year ago

The combination of MethodAnalysis.jl and JET might make this pretty easy:

  1. run the package's test suite (this should generate specializations that actually work). You have to use TestEnv for this because you need your session to hang around after the test suite finishes.
  2. for a package function you want to analyze with JET, do mis = methodinstances(somefunction). This collects all the specializaitons of somefunction that were created while running the test suite.
  3. run report_call(mi.specTypes) for each item mi in mis.

You may know that JET stops its analysis at runtime-dispatch boundaries. The SnoopCompile integration can help bridge those dispatches, but only in a fresh session. The means that you have to run the methodinstances analysis, print out the mi.specTypes for each item in mis, and then start a fresh session for each one. Kinda a pain. It might be nice to give JET the ability to trace through runtime dispatches, but that would likely require that we modify Julia to snoop on jl_apply_generic. An easier alternative might be to provide some JET/MethodAnalysis integration, and have JET look for all generated specializations that could apply, and then analyze each.

jakobnissen commented 1 year ago

That's brilliant!

It's not completely clear to be that the user necessarily want static analysis to cross dynamic dispatch. In my experience, dynamic dispatch is almost always a sign of a problem in my code, meaning that any static problem on the other side of dynamic dispatch is usually a false positive (or irrelevant, once the type instability has been fixed). In that sense, a kind of combined report_opt/report_call reporting which warns against both dynamic dispatch and type errors is actually the most useful thing from my point of view.

Furthermore, when SnoopCompile crosses dynamic dispatch, it really only shows one possible outcome of the dispatch. The "known path" taken during the test is already covered in the test cases anyway (not completely of course, JET could find an uncovered edge case on the other side of a dynamic dispatch with SnoopCompile's help). What I mean is that I doubt that JET could bring very much to the table when running on tests in the presence dynamic dispatch.

Then again, I don't have experience using strategic de-specialization to reduce compile times, which would of course add these type instabilities in sound code.