brenhinkeller / StaticTools.jl

Enabling StaticCompiler.jl-based compilation of (some) Julia code to standalone native binaries by avoiding GC allocations and llvmcall-ing all the things!
MIT License

allocated inline assertion error for a MallocVector of struct with SVector or NTuple field. #45

Closed Alexander-Barth closed 1 year ago

Alexander-Barth commented 1 year ago

I would like to use StaticTools.jl to compile some Julia code to WASM. StaticTools.jl has been really useful for me when working with arrays of Float32 or Int32. Thank you for your great work and for making it available as a package! But I have a problem with arrays containing SVectors or tuples.

Here is an example:

using StaticArrays
using StaticTools

mutable struct Particle{N,T}
    x::SVector{N,T} # position
    v::SVector{N,T} # velocity
    f::SVector{N,T} # force
    rho::T          # density
    p::T            # pressure
end

p1 = Particle(
    (@SVector [1.f0,2.f0]),
    (@SVector [3.f0,4.f0]),
    (@SVector [5.f0,6.f0]),
    7.f0,
    8.f0)

unsafe_load.(convert(Ptr{Float32},pointer_from_objref(p1)),1:8)
# output
#=
8-element Vector{Float32}:
 1.0
 2.0
 3.0
 4.0
 5.0
 6.0
 7.0
 8.0
=#

MallocVector{Particle}(undef,(2,))

This produces the following error:

ERROR: LoadError: AssertionError: Base.allocatedinline(T)
Stacktrace:
 [1] MallocArray
   @ ~/.julia/packages/StaticTools/ADZxF/src/mallocarray.jl:112 [inlined]
 [2] MallocVector{Particle}(x::UndefInitializer, dims::Tuple{Int64})
   @ StaticTools ~/.julia/packages/StaticTools/ADZxF/src/mallocarray.jl:129
 [3] top-level scope

I get the same error if I replace the SVectors with NTuples. In both cases, it seems that all the floats are allocated next to each other in memory.

Maybe there is a reason that this should not be allowed that I am not aware of.

I am using StaticTools v0.8.7 on julia 1.8.5 (Linux).

brenhinkeller commented 1 year ago

Ah interesting -- so

julia> mutable struct Particle{N,T}
           x::SVector{N,T} # position
           v::SVector{N,T} # velocity
           f::SVector{N,T} # force
           rho::T          # density
           p::T            # pressure
       end

julia> Base.allocatedinline(Particle{2,Float64})
false

but your test seems to suggest that perhaps it could be. Whether it can be made to work with malloc/stack arrays probably comes down to whether you can Base.unsafe_load and Base.unsafe_store! a Particle{N,T} at a memory location, e.g. in an array. It looks like perhaps you can:

julia> p1
Particle{2, Float32}(Float32[1.0, 2.0], Float32[3.0, 4.0], Float32[5.0, 6.0], 7.0f0, 8.0f0)

julia> ptr = Ptr{Particle{2, Float32}}(pointer_from_objref(p1))
Ptr{Particle{2, Float32}} @0x000000010dc92830

julia> Base.unsafe_load(ptr)
Particle{2, Float32}(Float32[1.0, 2.0], Float32[3.0, 4.0], Float32[5.0, 6.0], 7.0f0, 8.0f0)

If that works, then we could replace all the asserts here https://github.com/brenhinkeller/StaticTools.jl/search?q=allocatedinline with just warnings.
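To make the check behind that assertion concrete, here is a minimal Base-only sketch. The type names `MutableParticle` and `ImmutableParticle` are hypothetical stand-ins, and NTuple fields are used in place of SVector (the issue reports the same error for both) so that no packages are needed. `Base.allocatedinline` is an internal, non-exported function, so its behaviour could change between Julia versions.

```julia
# Hypothetical stand-ins for the Particle type in this thread; NTuple fields
# replace SVector so that the sketch needs no packages.
mutable struct MutableParticle
    x::NTuple{2,Float32}   # position
    rho::Float32           # density
end

struct ImmutableParticle
    x::NTuple{2,Float32}   # position
    rho::Float32           # density
end

# Base.allocatedinline is internal to Base; it is the predicate named in the
# AssertionError above.
mutable_inline   = Base.allocatedinline(MutableParticle)
immutable_inline = Base.allocatedinline(ImmutableParticle)

println("mutable:   allocatedinline = ", mutable_inline,
        ", isbitstype = ", isbitstype(MutableParticle))
println("immutable: allocatedinline = ", immutable_inline,
        ", isbitstype = ", isbitstype(ImmutableParticle))

# Size in bytes of one immutable element (two Float32 plus one Float32).
println("sizeof(ImmutableParticle) = ", sizeof(ImmutableParticle))
```

Under these assumptions the mutable variant reports `false` for both predicates and the immutable one reports `true`, which matches the `false` shown above for the mutable `Particle{2,Float64}`.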

brenhinkeller commented 1 year ago

Oh, the other option might be to just define

Base.allocatedinline(::Type{Particle{N,T}}) where {N,T} = Base.allocatedinline(T)

Which AFAICT should be allowed (i.e. not type piracy against Base) as long as you own the Particle type.

The warning might actually be tricky because we'd only want it to print at compile time and not at run time, so we would possibly just have to comment out that assert entirely (which we could try if you want to PR it).

Alexander-Barth commented 1 year ago

Yes, I think that would be a good solution! But for some reason I cannot update the fields. Maybe I am missing something?

Base.allocatedinline(::Type{Particle{N,T}}) where {N,T} = Base.allocatedinline(T)

ma = MallocVector{Particle{2,Float32}}(undef,(1,))
ma[1] = p1
ma[1].x = @SVector [10.f0,20.f0]

ma[1].x
# still 1, 2

va = Vector{Particle{2,Float32}}(undef,(1,))
va[1] = p1
va[1].x = @SVector [10.f0,20.f0]

va[1].x
# 10, 20

As far as I can tell ma[1] seems to be a new object:

ma = MallocVector{Particle{2,Float32}}(undef,(1,))
ma[1] = p1
pointer_from_objref(p1) == pointer_from_objref(ma[1])
# returns false

va = Vector{Particle{2,Float32}}(undef,(1,))
va[1] = p1
pointer_from_objref(p1) == pointer_from_objref(va[1])
# returns true

brenhinkeller commented 1 year ago

My first guess would be that the Base.unsafe_load underlying ma[1] may load a copy of the Particle onto the stack, so you then just have to write it back to the MallocVector after modifying it, i.e.

julia> ma[1] = p1
Particle{2, Float32}(Float32[1.0, 2.0], Float32[3.0, 4.0], Float32[5.0, 6.0], 7.0f0, 8.0f0)

julia> p = ma[1]
Particle{2, Float32}(Float32[1.0, 2.0], Float32[3.0, 4.0], Float32[5.0, 6.0], 7.0f0, 8.0f0)

julia> p.x = @SVector [10.f0,20.f0]
2-element SVector{2, Float32} with indices SOneTo(2):
 10.0
 20.0

julia> ma[1] = p
Particle{2, Float32}(Float32[10.0, 20.0], Float32[3.0, 4.0], Float32[5.0, 6.0], 7.0f0, 8.0f0)
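The copy behaviour can be illustrated without StaticTools. The sketch below is Base-only and uses a hypothetical `Cell` type with an NTuple field; it assumes, as guessed above, that indexing ends up in a `Base.unsafe_load`, and it does not inspect the actual MallocVector implementation.

```julia
# Hypothetical mutable type standing in for Particle.
mutable struct Cell
    x::NTuple{2,Float32}
end

original = Cell((1f0, 2f0))

# Load a Cell from the memory behind `original`. GC.@preserve keeps
# `original` alive while its raw address is in use.
loaded = GC.@preserve original unsafe_load(Ptr{Cell}(pointer_from_objref(original)))

# Mutating the loaded object does not touch the memory it was read from.
loaded.x = (10f0, 20f0)

println("loaded is the same object as original: ", loaded === original)
println("original.x = ", original.x)
println("loaded.x   = ", loaded.x)
```

Here `loaded` is a separate object, so assigning to `loaded.x` leaves `original.x` unchanged, mirroring the `ma[1].x = ...` result reported earlier in the thread.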

Another option if performance is key might be to just access that memory directly, e.g.

julia> madata = reinterpret(Float32, ma)
8-element ArrayView{Float32, 1}:
 1.0
 2.0
 3.0
 4.0
 5.0
 6.0
 7.0
 8.0

julia> madata[1:2] = (10, 20)
(10, 20)

julia> ma
1-element MallocVector{Particle{2, Float32}}:
 Particle{2, Float32}(Float32[10.0, 20.0], Float32[3.0, 4.0], Float32[5.0, 6.0], 7.0f0, 8.0f0)

Alexander-Barth commented 1 year ago

My problem is actually quite similar to the n-body problem as far as I can tell. An approach similar to the one you are suggesting is used there, for example:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/nbody-julia-8.html

In this case (p = ma[1]; dostuff!(p); ma[1] = p), it is not necessary to mark the struct as mutable, and StaticTools works right away :-)

Avoiding a mutable struct is also likely to be faster anyway, as indicated by the comment in the benchmark code. Also, the memory layout for an immutable struct is much simpler (ma[2] is guaranteed to be next to ma[1], which helps with interoperability with WASM).
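For reference, a minimal Base-only sketch of this load, modify, store pattern with an immutable struct. It uses `Libc.malloc` and raw pointers instead of a MallocVector, and the `PlainParticle` type with NTuple fields is a hypothetical stand-in for the real Particle, so it only illustrates the memory layout and is not the StaticTools API.

```julia
# Hypothetical immutable stand-in for Particle (NTuple instead of SVector).
struct PlainParticle
    x::NTuple{2,Float32}   # position
    rho::Float32           # density
end

# Manually allocate room for two particles, as a malloc-backed array would.
ptr = Ptr{PlainParticle}(Libc.malloc(2 * sizeof(PlainParticle)))
unsafe_store!(ptr, PlainParticle((1f0, 2f0), 7f0), 1)
unsafe_store!(ptr, PlainParticle((3f0, 4f0), 8f0), 2)

# Load a copy, build the updated value, and store it back.
p = unsafe_load(ptr, 1)
p = PlainParticle((10f0, 20f0), p.rho)
before_store = unsafe_load(ptr, 1).x   # memory is unchanged until stored back
unsafe_store!(ptr, p, 1)
after_store = unsafe_load(ptr, 1).x

# View the same memory as plain Float32 values: the two elements are adjacent.
fptr = Ptr{Float32}(ptr)
floats = [unsafe_load(fptr, i) for i in 1:6]
println(floats)

Libc.free(ptr)
```

Because the struct is immutable, an "update" is a new value written back to the same slot, and the six floats of the two particles sit next to each other in memory.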

So for me we can close this issue. Thank you for your insights!

brenhinkeller commented 1 year ago

Cool! For sure! Awesome work on https://github.com/Alexander-Barth/FluidSimDemo-WebAssembly BTW!