JuliaSIMD / StrideArraysCore.jl

The core AbstractStrideArray type, separated from StrideArrays.jl to avoid circular dependencies.
MIT License
17 stars 8 forks source link

Removing allocations for repeated use of object_and_preserve #4

Open mmiller-max opened 3 years ago

mmiller-max commented 3 years ago

When using CheapThreads.jl, I get allocations when new References are created in object_and_preserve. This makes sense, but I'd really like to do away with this is possible as my aim is to use CheapThreads.jl within DifferentialEquations.jl and I don't want the allocs to build up. In my case the arg that is being referenced is a struct, but you also get allocations for other !isbits types, e.g.:

julia> f1(arg) = @time StrideArraysCore.object_and_preserve(arg)
julia> f1(1)
  0.000000 seconds
(1, nothing)

julia> f1("string")
  0.000000 seconds (1 allocation: 16 bytes)
(StrideArraysCore.Reference{String}("string"), StrideArraysCore.Reference{String}("string"))

julia> struct TestStruct
       a::Vector{Int64}
       end

julia> ts = TestStruct([1])
TestStruct([1])

julia> f1(ts)
  0.000000 seconds (1 allocation: 16 bytes)
(StrideArraysCore.Reference{TestStruct}(TestStruct([1])), StrideArraysCore.Reference{TestStruct}(TestStruct([1])))

I can see two possible solutions:

1. Allow users to create their own References

This just needs object_and_preserve to be extended like

@inline object_and_preserve(r::Reference) = (r, r)

When added, it works like so:

julia> ref = StrideArraysCore.Reference("string")
StrideArraysCore.Reference{String}("string")

julia> f1(ref)
  0.000000 seconds
(StrideArraysCore.Reference{String}("string"), StrideArraysCore.Reference{String}("string"))

julia> ref = StrideArraysCore.Reference(ts)
StrideArraysCore.Reference{TestStruct}(TestStruct([1]))

julia> f1(ref)
  0.000000 seconds
(StrideArraysCore.Reference{TestStruct}(TestStruct([1])), StrideArraysCore.Reference{TestStruct}(TestStruct([1])))

This seems to work fine with batch from CheapThreads.jl.

2. Encourage users to extend object_and_preserve, store! and load for their structs

For example:

@inline function ThreadingUtilities.load(p::Ptr{UInt}, ::Type{TestMutableStruct}, i)
    i, ref = ThreadingUtilities.load(p, Ptr{TestMutableStruct}, i)
    i, Base.unsafe_pointer_to_objref(ref)::TestMutableStruct
end
@inline function ThreadingUtilities.store!(p::Ptr{UInt}, r::TestMutableStruct, i)
    ThreadingUtilities.store!(p + i, reinterpret(UInt, pointer_from_objref(r)))
    i + sizeof(UInt)
end
@inline StrideArraysCore.object_and_preserve(r::TestMutableStruct) = (r, r)

This seems to be fine with CheapThreads, but I don't know if there are implications of doing this.

julia> arg_test(args, start, stop) = nothing

julia> mutable struct TestMutableStruct
       a::String
       end

julia> tms = TestMutableStruct("a")
TestMutableStruct("a")

julia>  u = zeros(100)
...
julia> @time batch(arg_test, (10, 8), u, tms)
  0.000003 seconds

The above extensions only work for mutable structs because of pointer_from_objref.

I'm interested to know what people's thoughts are on the best way forward for this!

chriselrod commented 3 years ago

Why not 1 & 2? I.e., let users decide what works best for their use case.

chriselrod commented 3 years ago

Base.unsafe_pointer_to_objref(ref)::TestMutableStruct

This or something like it seems like a good solution.

julia> x = Ref(("hi", "world", 9))
Base.RefValue{Tuple{String, String, Int64}}(("hi", "world", 9))

julia> p = Base.unsafe_convert(Ptr{typeof(x[])}, x)
Ptr{Tuple{String, String, Int64}} @0x00007f2a74688db0

julia> tsptr(x::Ptr{T}) where {T} = (Base.unsafe_pointer_to_objref(x)::Base.RefValue{T})[]
tsptr (generic function with 1 method)

julia> tsptr(p)
("hi", "world", 9)

julia> @benchmark tsptr($p)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.765 ns (0.00% GC)
  median time:      1.772 ns (0.00% GC)
  mean time:        1.798 ns (0.00% GC)
  maximum time:     12.976 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

I don't know if there is a way to avoid the type instability all together by using something other than Any:

unsafe_pointer_to_objref(x::Ptr) = ccall(:jl_value_ptr, Any, (Ptr{Cvoid},), x)

e.g.

tsptr2(x::Ptr{T}) where {T} = ccall(:jl_value_ptr, Ref{Base.RefValue{T}}, (Ptr{Cvoid},), x)[]
tsptr2(p)

yields

julia> tsptr2(p)
("hi", "world", 9)

julia> @code_warntype tsptr2(p)
MethodInstance for tsptr2(::Ptr{Tuple{String, String, Int64}})
  from tsptr2(x::Ptr{T}) where T in Main at REPL[13]:1
Static Parameters
  T = Tuple{String, String, Int64}
Arguments
  #self#::Core.Const(tsptr2)
  x::Ptr{Tuple{String, String, Int64}}
Body::Tuple{String, String, Int64}
1 ─ %1 = Core.apply_type(Main.Ptr, Main.Cvoid)::Core.Const(Ptr{Nothing})
│   %2 = Base.cconvert(%1, x)::Ptr{Tuple{String, String, Int64}}
│   %3 = Core.apply_type(Main.Ptr, Main.Cvoid)::Core.Const(Ptr{Nothing})
│   %4 = Base.unsafe_convert(%3, %2)::Ptr{Nothing}
│   %5 = $(Expr(:foreigncall, :(:jl_value_ptr), Ref{Base.RefValue{T}}, svec(Ptr{Nothing}), 0, :(:ccall), :(%4), :(%2)))::Base.RefValue{Tuple{String, String, Int64}}
│   %6 = Base.getindex(%5)::Tuple{String, String, Int64}
└──      return %6

It loads a Base.RefValue{T}, which is why we pass Ref{Base.RefValue{T}} as the return type. In your example, it'd be Ref{TestMutableStruct}.

It's basically the same fast.

mmiller-max commented 3 years ago

Sorry for the delay in replying. Yeah enabling both sounds good.

I'm intrigued to see how much of a difference removing the type instability makes to performance - will give it a go.

chriselrod commented 3 years ago

I'm intrigued to see how much of a difference removing the type instability makes to performance - will give it a go.

I saw basically no difference (you can't get much faster than 1.8 ns), but would do it on principle anyway. Would be easier on the branch predictor.

mmiller-max commented 3 years ago

Yep also saw no difference for structs as you expected.

For mutable structs, tsptr doesn't work:

julia> mutable struct TestMutableStruct
       a::String
       end

julia> tms = TestMutableStruct("a")
TestMutableStruct("a")

julia> ref = Ref(tms)
Base.RefValue{TestMutableStruct}(TestMutableStruct("a"))

julia> p = Base.unsafe_convert(Ptr{typeof(ref[])}, ref)
Ptr{TestMutableStruct} @0x000000012a49d140

julia> Base.unsafe_pointer_to_objref(p)
TestMutableStruct("a")

julia> tsptr(p)
ERROR: TypeError: in typeassert, expected Base.RefValue{TestMutableStruct}, got a value of type TestMutableStruct
Stacktrace:
 [1] tsptr(::Ptr{TestMutableStruct}) at ./REPL[21]:1
 [2] top-level scope at REPL[22]:1

Is this expected? Seems that :jl_value_ptr returns the mutable struct in this case.

chriselrod commented 3 years ago
julia> tsptr2(x::Ptr{T}) where {T} = ccall(:jl_value_ptr, Ref{TestMutableStruct}, (Ptr{Cvoid},), x)
tsptr2 (generic function with 1 method)

julia> tsptr2(p)
TestMutableStruct("a")

You need to specify Ref{M}, where M is the mutable struct wrapper type. In my example, that was Base.RefValue{T}, because I wrapped the type in a Base.RefValue:

julia> x = Ref(("hi", "world", 9))
Base.RefValue{Tuple{String, String, Int64}}(("hi", "world", 9))

But now in your example, M == TestMutableStruct, because you're wrapping with TestMutableStruct instead of by calling Ref (which creases a Base.RefValue`).