JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.59k stars 5.48k forks source link

Type stable block #10980

Open bpiwowar opened 9 years ago

bpiwowar commented 9 years ago

Given the importance of type stability, I would like to propose a new language construct that would allow semi-static checking of type stable blocks. In the module I am developing, I am frequently using this type of construct

function f(a)
 # Type stability
@stabilize m::Matrix{Float64} = a.m

 # Compute things with m

end
...

I know there are:

  1. The possibility to use a function - but this can be cumbersome when there are many types, and in makes the code harder to read (at least for me)
  2. Tools like Lint or TypeCheck, and the macro @code_warntype that are useful to see/check whether the written code is type stable

but I would really like to have something that prevents code from being instable, e.g. using a "stable block" (I check and this gives a syntax error with julia 0.[34]):

function f(a)
 # Type stability
@stabilize m::Matrix{Float64} = a.m

 stable begin
  # Compute things - compilation error (or warning) if there is some type instability within the code
 end

end
...

Another syntactic possibility would be to use pragmas #7449 (if they make it in the language).

pao commented 9 years ago

A third--and I suspect preferred--syntactic possibility would be to add it as an annotation macro which would insert a :meta node in the AST, like @inline, @simd, and @fastmath:

@error_unstable begin
    ...
end

The hard part of this is the backend--detecting and acting on the type instability.

bpiwowar commented 9 years ago

It would be a good idea to have annotations, this could be extended to enforce stability at a variable level:

@error_unstable a.m::Matrix{Float64}

would mean "from now on, a.m is a matrix of floats.

I have no idea on what it means at a backend level though, but I suspect that at some point it is possible to see instability - since the @code_warntype has somehow access to it.

timholy commented 9 years ago

@bpiwowar, if I were planning to implement this (to be clear: I'm not), I'd start by playing with the following strategy:

@leaftype a b c function myfunc(x,y)
    ...
end

to call pushmeta!(ex, :leaftype, :a, :b, :c) (see http://docs.julialang.org/en/latest/devdocs/meta/); this enforces signals your intent that a, b, and c should all be leaf types. EDIT: without any variable names, you could require that all variables are leaf types.

If it helps, I added some comments to inference.jl in #10691 (which was not merged, but you can look over the PR). If you're hacking on inference.jl, emitting ccalls to jl_ is the best way I know of to figure out what's going on (see http://docs.julialang.org/en/latest/devdocs/debuggingtips/).

StefanKarpinski commented 9 years ago

@bpiwowar you may have linked the wrong issue when you mentioned pragmas. Did you mean https://github.com/JuliaLang/julia/issues/7449?

bpiwowar commented 9 years ago

@StefanKarpinski yes

StefanKarpinski commented 9 years ago

I suspect that a better way to do this might be to add more features to Julia that allow you to introspect properties of functions like type stability and then add assertions after definitions that the properties you want to hold are true. So you'd do something like this:

function f(x,y)
  ...
end

TYPES = [Int, Float64, Rational{BigInt}]

for S in TYPES, T in TYPES
    @assert istypestable(f,(S,T))
end

Part of the issue is that it's not entirely meaningful to ask if a generic function is type stable. Which method? Given what argument types? Humans are pretty good at looking at code and guessing which argument types you might want to apply the function to, but machines aren't so good. Unless a function has concretely typed parameters, it's usually possible to add new types with new behaviors that would make the function type unstable, so the question of type stability without further elaboration is usually not meaningful.

bpiwowar commented 9 years ago

I agree that type stability should not be checked through some static analysis; but I don't want either to hard-code, as you propose, all the cases that should be checked (although this could be a good option for ensuring that critical parts of libraries are as type stable as possible).

What I was proposing was rather that when the function is compiled a warning, or an error, is generated if some part of the function is not type stable. Maybe this is not the right way to program in Julia, but what I do in many functions is:

1) ensuring that the data structures I will be working with have some concrete types (I can only know those at compilation time since they are computed from parameters and/or arguments). This part is not usually type stable 2) then comes a block that should be type stable

At the moment, I am using @code_warntype and look at the line numbers to track down all the type instability issues within the critical block until I do not see any more type unsafe operations (as highlighted by the macro), but I would much prefer some way to say to the JIT compiler: everything here should be type stable, and have more readable messages (saying which part of the code is not type stable).

timholy commented 9 years ago

Rather than focusing on which parts of the code aren't type-stable, it will be far easier (from an implementation perspective) to focus on which variables are not leaf types. When it throws an error, then a human can look at the code to figure out why.

carnaval commented 9 years ago

I guess it's a lost battle but I'm not sure I like the idea of having those assertions in user code. I mean, it's probably good to have tools to solve this type uncertainties when it causes a performance problem, but let's just remember that return Any is a perfectly valid type inference algorithm and sometimes it comes down to this due to heuristics, which could change on any commit to master.

timholy commented 9 years ago

I agree that it's not obvious we should have such things; at the same time, I understand where he's coming from.

Really, I think the best strategy is this: run your code. Happy with the performance? Great! (And move on.) Unhappy with the performance? Profile it. Functions that run more slowly than you expect are probably type-unstable. And there's no point worrying about functions that are type unstable but don't contribute to your run time.

bpiwowar commented 9 years ago

Here I disagree a bit. When I am writing code, I know there are parts that have to be efficient and type stable - and I want to quickly (when I am running experiments) see that there are some problems that could affect speed. If I make a change to the code, I also want to see those warnings as soon as possible. I guess this is also due to the fact that (most) type instability are easy to fix, so (at least to me) it makes sense to track them down.

I would love though having a profiler that would point directly which type instability are the most critical from a cpu/memory perspective.

StefanKarpinski commented 9 years ago

@bpiwowar wrote:

I would like to propose a new language construct that would allow semi-static checking of type stable blocks.

and

What I was proposing was rather that when the function is compiled a warning, or an error, is generated if some part of the function is not type stable.

My point was that type stability cannot be checked statically – it's not even well-defined. Type stability can only be checked when a method is compiled, which can happen at any point in time. Note that this does not happen when your source code is parsed – it happens when the function is called with a particular set of arguments.

timholy commented 9 years ago

The profiler already does do this: look at the results with ProfileView, and focus your attention on the red bars.

But if you really do want this: I gave you a roadmap for how to implement precisely what you're asking for.

carnaval commented 9 years ago

The compiler does have the info of how many generic call it had to emit and, e.g, how many jl_box_XXX it incurred in the generated code. We could store this info somewhere and have a way to ask for some feedback, or "perf diagnostic", from the codegen.

StefanKarpinski commented 9 years ago

The post-mortem diagnostic approach makes more sense since at that point it's meaningful to ask "what things that I ran were not type stable".

bpiwowar commented 9 years ago

@timholy thanks, I will look at both the profiler (last time I tried it was all plain text, with no red bars) and the roadmap (how much time do you think it would take - let's say for you and I will figure out my multiplicative factor)

bpiwowar commented 9 years ago

@StefanKarpinski @carnaval Yes this would definitely be great - I was impressed by the impact it has on speed and memory, so that's why I am tracking those down in parts of code that are speed critical.

EDIT: If you have a roadmap and a time estimation, I will look at it (or it can be a useful reference for somebody wishing to do this)

EDIT: just to know, will this get better with further version of Julia (I mean - will type instability be less a concern) ?

carnaval commented 9 years ago

@StefanKarpinski @carnaval Yes this would definitely be great - I was impressed by the impact it has on speed and memory, so that's why I am tracking those down in parts of code that are speed critical.

Yeah the generated code is pretty much "all or nothing" for now, and there is certainly much we can do to improve the performance of the various dynamic cases. There isn't too much effort in this direction since usually you get excellent performance by simply rewriting it in a way the inference understands.

toivoh commented 9 years ago

When I was trying to get nice SIMD vectorised code, I also wanted a feature along those lines, that would allow me to state my intentions to the compiler and have it tell me where my expectations broke down. I think there's more things than type stability that are of interest. One criterion that I found pretty crucial when writing an inner loop was whether the function became a leaf function, eg free of call instructions. (At the IR/assembly level at least.)

mauro3 commented 9 years ago

Tim meant ProfileView.jl, check it out, it's much easier than just the line view of profile.

timholy commented 9 years ago

It's always nearly impossible to provide an estimate of a coding project that's anywhere near realistic. Just as a joke, I'll say it would take me less than one day to implement that roadmap. It's really only two focused changes: add the @leaftype macro, and insert what might be just a few lines into inference.jl to loop over all variables and check them.

You want to do as much of the development outside of the julia tree as possible, and then move everything in once you have it working. You can grab the type-inferred AST from code_typed and start playing with the representation. See also http://blog.leahhanson.us/julia-introspects.html. I think you can get 95% of this working outside base before moving it in.

But again, no promises that your changes would be merged! Still, doing the actual implementation is often required to get things considered seriously; otherwise it's hard to guess what it will involve, and vaporware is rarely worth reviewers' time.

sebastiang commented 9 years ago

FWIW, type stability is also a useful tool for modeling and correctness analysis, not just performance.

bpiwowar commented 9 years ago

Hi,

I tried to start the implementation just to see if it was an easy task or not. I don't know if I this is a bug, but with

macro leaftype(ex...)
    esc(_leaftype(ex[end], ex[1:(end-1)]...))
end

_leaftype(ex::Expr, symbols::Symbol...) = pushmeta!(ex, :leaftypemeta, symbols...)

@leaftype function f1(x)
    for i in 1:1000 x += 0.5 end
    begin y = 1 end
end

@leaftype function f2(x)
    for i in 1:1000 x += 0.5 end
end

findleaftype(ex::Any) = false

function findleaftype(ex::Expr)
    leaftypemeta, args = popmeta!(ex, :leaftypemeta)
    if leaftypemeta return true end

    for subex in ex.args
        if findleaftype(subex) return true end
    end

    false
end

println("f1: ", findleaftype(@code_typed(f1(1))[1]))
println("f2: ", findleaftype(@code_typed(f2(1))[1]))

I get

f1: false
f2: true

This seems to be because the generated code adds a NewvarNode(:y) expression before the meta node. Is this specific to code_typed or will I get the same kind of problem with inference.jl? In that case, is this a bug?

timholy commented 9 years ago

Yep, it's a bug. See discussion in #11595. You may just want to wait for it to be fixed before tackling this further, or you can locally back out the troublesome commit and then rebase when the fix arrives.

Thanks for looking into this!

GiggleLiu commented 5 years ago

I think recent moves of Julialang towards GPU and TPU programming makes it urgent for type stability assertion. (#30007 )

@StefanKarpinski , you have mentioned several interesting proposals, but I suspect they can be hardly helpful in detecting kernel type instabilities since argument types of kernel functions on devices can not be tested easily (e.g. CuDeviceArray in CUDAnative). But I agree with you that type stability check can only be done at run time, and it is probably hard. I am not a PL theorem specialist, could you please explain a bit why Julia is able to put a "Box" around a variable, but is unable to detect box allocation inside a function?

PetrKryslUCSD commented 5 years ago

I have just proposed something related. Not a static type checking facility, but rather the first time some method gets compiled, the compiler could issue a warning about type instabilities (if that is enabled by the user). It would make type checking much more immediate than it currently is with @code_warntype. https://github.com/JuliaLang/julia/issues/30155

tveldhui commented 5 years ago

Profiling is fine if you have a one-off performance issue you're trying to resolve. It's not practical for detecting regressions. In our system we generate code with inner loops that have to be tight - e.g. no type unstable code or allocations. It's growing into a big system with lots of people hacking on it, and we need a way to automatically monitor for changes that introduce type instability or allocations in performance-critical regions. Something like @assert_type_stable and @assert_no_alloc that would trigger an assertion failure at runtime would be ideal. Then we could enable those assertions for our automated tests, and disable them for production. Automated monitoring of benchmark times is hard because the timings are noisy, and step detection is unreliable and delayed.

timholy commented 5 years ago

Makes sense. If this is super-important, one might consider sponsoring someone at JuliaComputing to make it happen faster. (Note: I do not work for JC, so I am not saying this out of self-interest.)