tshort / StaticCompiler.jl

Compiles Julia code to a standalone library (experimental)

New `compile` interface #53

Closed MasonProtter closed 2 years ago

MasonProtter commented 2 years ago

Okay, so the backend is staying the same, but here is my proposal for making this package both safer and easier to use. I've added a compile function that automatically creates a callable object which knows exactly which argument types it should receive. This object gets serialized, so it can be loaded and called later.

julia> using StaticCompiler

julia> fib(n) = n <= 1 ? n : fib(n - 1) + fib(n - 2)
fib (generic function with 1 method)

julia> fib_compiled, path = compile(fib, Tuple{Int}, "fib")
(f = fib(::Int64) :: Int64, path = "fib")

julia> fib_compiled(10)
55

Now we can quit this session and load a new one where fib is not defined:

julia> using StaticCompiler

julia> fib
ERROR: UndefVarError: fib not defined

julia> fib_compiled = load_function("fib.cjl")
fib(::Int64) :: Int64

julia> fib_compiled(10)
55

compile also does a few things to save naive Julia programmers from some scary pitfalls. Importantly, if you were to try to send a Tuple back and forth from a function compiled with generate_shlib, you would get some wacky behaviour, because the compiled function expects a pointer to the Tuple, not the Tuple's bits themselves. Likewise, returning a Tuple would return a pointer to it.

So instead, compile will automatically wrap the input function like so:

f_wrap!(out::Ref, args::Ref{<:Tuple}) = (out[] = f(args[]...); nothing)

and compile that instead. That is, we pass the compiled function a reference to our input arguments and provide a place for the output to be stored, rather than returning it explicitly. The callable object then does the following:

function (f::StaticCompiledFunction{rt, tt})(args...) where {rt, tt}
    Tuple{typeof.(args)...} == tt || error("Input types don't match compiled target $((tt.parameters...,)). Got arguments of type $(typeof.(args))")
    out = RefValue{rt}()
    refargs = Ref(args)
    GC.@preserve out refargs begin
        ccall(f.ptr, Nothing, (Ref{rt}, Ref{tt}), out, refargs)
    end
    out[]
end
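
The wrapper and the call path above can be tried out without compiling anything. The following is only a sketch in plain Julia: `f` and `call_wrapped` are hypothetical names chosen for illustration, and the `ccall` through `f.ptr` is replaced by a direct call to the wrapper.

```julia
# Hypothetical stand-in for the user function that would be compiled.
f(x, y) = x + 2y

# Same shape as the wrapper above: the arguments arrive through a Ref to a
# tuple, and the result is written into `out` instead of being returned.
f_wrap!(out::Ref, args::Ref{<:Tuple}) = (out[] = f(args[]...); nothing)

# Hypothetical helper mirroring the call path, minus the ccall: check the
# argument types, allocate the output slot, call the wrapper, read the result.
function call_wrapped(::Type{rt}, ::Type{tt}, args...) where {rt, tt}
    Tuple{typeof.(args)...} == tt || error("Input types don't match compiled target $tt")
    out = Ref{rt}()
    f_wrap!(out, Ref(args))
    return out[]
end

result = call_wrapped(Int, Tuple{Int, Int}, 1, 2)  # goes through the wrapper
# call_wrapped(Int, Tuple{Int, Int}, 1.0, 2)       # would throw: types don't match
```

Because only references cross the boundary, the wrapper has the same signature shape regardless of how many arguments `f` takes or what it returns.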

These refs can be stack-allocated here, so creating them shouldn't cause any performance loss:

julia> using StaticCompiler

julia> fib(n) = n <= 1 ? n : fib(n - 1) + fib(n - 2)
fib (generic function with 1 method)

julia> fib_compiled, _ = compile(fib, (Int,));

julia> @btime $fib_compiled(10)
  248.544 ns (0 allocations: 0 bytes)
55

julia> @btime fib(10)
  245.481 ns (0 allocations: 0 bytes)
55

I think a good next step from here would be a lightweight package that supplies load_function, so that people can load a compiled function and use it without having to load all of GPUCompiler.jl and our other deps.

Any thoughts?

MasonProtter commented 2 years ago

Also compile will try to save you from some common errors. E.g. abstract arguments:

julia> foo(u::Tuple) = 2 .* reverse(u) .- 1

julia> compile(foo, (Tuple{Integer, Integer},))
ERROR: input type signature (Tuple{Integer, Integer},) is not concrete

type instability:

julia> bar(x) = x < 0 ? 1 : 1.0
bar (generic function with 1 method)

julia> compile(bar, (Int,))
ERROR: bar on (Int64,) did not infer to a concrete type. Got Union{Float64, Int64}

and passing arguments of a different type from the one you compiled:

julia> compile(foo, (NTuple{3, Int},))[1]((1.0, 2.0, 3.0))
ERROR: Input types don't match compiled target (Tuple{Int64, Int64, Int64},). Got arguments of type (Tuple{Float64, Float64, Float64},)
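
For readers curious how the first two checks could be expressed, here is a rough sketch using `isconcretetype` and `Base.return_types`. The helper name `check_compilable` is hypothetical, and this is not necessarily how compile implements the checks internally.

```julia
# Hypothetical helper, not part of StaticCompiler: reject signatures that are
# not concrete and functions whose inferred return type is not concrete.
function check_compilable(f, types::Tuple)
    tt = Tuple{types...}
    isconcretetype(tt) || error("input type signature $types is not concrete")
    # For a concrete signature we expect exactly one matching method.
    rt = only(Base.return_types(f, tt))
    isconcretetype(rt) || error("$f on $types did not infer to a concrete type. Got $rt")
    return rt
end

foo(u::Tuple) = 2 .* reverse(u) .- 1
bar(x) = x < 0 ? 1 : 1.0

rt_foo = check_compilable(foo, (NTuple{3, Int},))    # passes, returns the inferred type
# check_compilable(foo, (Tuple{Integer, Integer},))  # errors: signature is not concrete
# check_compilable(bar, (Int,))                      # errors: inferred type is a Union
```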

MasonProtter commented 2 years ago

Also, I just learned how to use executable products. Very cool. Now we use Clang_jll to do the compilation rather than hoping the user has gcc in their PATH.

tshort commented 2 years ago

You're on a roll, @MasonProtter!

For the new API, is this a good time to consider an API for compiling multiple methods into a shared library? It looks like that has to be done at link time. I don't see any support in GPUCompiler.compile for compiling multiple methods.

MasonProtter commented 2 years ago

> For the new API, is this a good time to consider an API for compiling multiple methods into a shared library? It looks like that has to be done at link time. I don't see any support in GPUCompiler.compile for compiling multiple methods.

Great point, I didn't really think of that. What do you think the API for this should look like?

jpsamaroo commented 2 years ago

You might need some changes to GPUCompiler for this, but see: https://github.com/JuliaGPU/GPUCompiler.jl/blob/master/src/jlgen.jl#L359-L365. Notice that we pass a Vector{MethodInstance}; the C API already allows us to pass multiple MethodInstances and have them compiled together (and they'll then share globals).

tshort commented 2 years ago

Thanks, Julian.

As for an API, passing an array of tuples of the form (function, tupletypes, name) would work. Using a struct for that might make it easier to default the name.
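
To make the struct idea concrete, here is one possible shape. The type name `CompileTarget` and its fields are hypothetical, and nothing here touches the compiler itself; the sketch only shows how the name could default to the function's own name.

```julia
# Hypothetical description of one method to compile into a shared library.
struct CompileTarget{F}
    f::F
    types::Tuple
    name::String
end

# Default the exported name to the function's own name.
CompileTarget(f, types::Tuple) = CompileTarget(f, types, string(nameof(f)))

fib(n) = n <= 1 ? n : fib(n - 1) + fib(n - 2)
foo(u::Tuple) = 2 .* reverse(u) .- 1

# A list of targets that a multi-method compile entry point could accept.
targets = [
    CompileTarget(fib, (Int,)),                    # name defaults to "fib"
    CompileTarget(foo, (NTuple{3, Int},), "foo3"), # explicit name
]
```

A plain tuple (function, tupletypes, name) would carry the same information; the struct mainly buys the defaulting constructor and a place to add fields later.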

MasonProtter commented 2 years ago

@tshort is it okay if I merge this for now? I think support for adding a list of methods to be compiled can be added later without needing to break this API.

tshort commented 2 years ago

Sure. Merge away, @MasonProtter!