JuliaLang / julia

ScopedValue is allocating when accessed #53584

Open sgaure opened 8 months ago

sgaure commented 8 months ago

I have looked into using scoped values for some temporary arrays to avoid allocations in parallel tasks. However, scoped values appear to allocate on every access, whereas with task-local storage (TLS) this can be avoided. This is unfortunate, since GC in parallel tasks can be a performance problem.

using Base.Threads
using Base.ScopedValues  # provides ScopedValue and @with on Julia 1.11+
using BenchmarkTools

@noinline function tlsfun()
    tlsvec = get!(() -> [0], task_local_storage(), :myvec)::Vector{Int}
    tlsvec[1] += 1
    return nothing
end

const dynvec = ScopedValue([0])

@noinline function dynfun()
    dvec = dynvec[]
    dvec[1] += 1
    return nothing
end

function tlsrun()
    @sync for _ in 1:nthreads()
        @spawn for _ in 1:100000; tlsfun(); end
    end
end

function dynrun()
    @sync for _ in 1:nthreads()
        @with dynvec=>[0] @spawn for _ in 1:100000; dynfun(); end
    end
end

@btime tlsrun()
@btime dynrun()
versioninfo()

output:

  2.326 ms (202 allocations: 21.03 KiB)
  8.238 ms (2400274 allocations: 36.64 MiB)

Julia Version 1.12.0-DEV.121
Commit bc2212cc0e* (2024-03-04 01:20 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × AMD Ryzen Threadripper PRO 5945WX 12-Cores
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 24 default, 0 interactive, 12 GC (on 24 virtual cores)
Environment:
  JULIA_EDITOR = emacs -nw

topolarity commented 8 months ago

This is probably caused by 5b2fcb68800 and is not multi-threading related:

julia> const dynvec = ScopedValue([0])
julia> @noinline function dynfun()
           dvec = dynvec[]
           dvec[1] += 1
           return nothing
       end
julia> foo() = @with dynvec=>[0] for _ in 1:1000_000; dynfun(); end

julia> @allocated foo()
16000208

The problem is this @noinline, which forces a wrapping Tuple{Vector{Int}} to be allocated as a temporary even though it is immediately unwrapped at every call site: https://github.com/JuliaLang/julia/blob/58291db09d18f59223edbdc15592ffcf0eb3dcfa/base/dict.jl#L1004
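
The pattern described above can be sketched in isolation, without scoped values. This is only an illustration under assumptions: `lookup_wrapped` and `bump_all!` are hypothetical names, not the code in base/dict.jl, and whether the wrapper tuple is actually heap-allocated depends on the Julia version and compiler, so no byte count is claimed here.

```julia
# Hypothetical stand-in for a lookup helper: returns `nothing` on a miss,
# or the value wrapped in a 1-tuple on a hit. The @noinline keeps the
# Union{Nothing, Tuple{Vector{Int}}} return value from being optimized
# away at the call site.
@noinline function lookup_wrapped(store::Vector{Vector{Int}}, i::Int)
    return i <= length(store) ? (store[i],) : nothing
end

# Caller that unwraps the tuple immediately, as the call sites in the
# issue are described as doing.
function bump_all!(store::Vector{Vector{Int}}, n::Int)
    for _ in 1:n
        r = lookup_wrapped(store, 1)
        r === nothing && continue
        r[1][1] += 1
    end
    return store
end

store = [[0]]
bump_all!(store, 10)                           # warm up so compilation is not measured
bytes = @allocated bump_all!(store, 1_000_000)
println("bytes allocated: ", bytes)
```

If the printed count grows with the iteration count, the temporary wrapper is being allocated on each call; removing `@noinline` from the helper is one way to compare.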

vtjnash commented 8 months ago

So is the problem there that the API wrongly returns a tuple (leaf.val,) instead of Some{V}(leaf.val)?

topolarity commented 8 months ago

Would Some{V}(leaf.val) bypass the need to allocate the temporary here?

vtjnash commented 8 months ago

I guess not. Apparently we do not have calling convention support for Union{Struct, Ghost}, even though we very easily could (we have many variations on it already) and probably should (it is the iteration protocol).
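
For readers unfamiliar with the reference to the iteration protocol: `iterate` returns either `nothing` or an `(item, state)` tuple, which has the same shape as the Union{Struct, Ghost} mentioned above, assuming "Ghost" is read as a zero-size singleton such as `nothing`. A minimal sketch with a hypothetical `Countdown` type:

```julia
# Hypothetical iterator used only for illustration.
struct Countdown
    from::Int
end

# The iteration protocol: return `nothing` when exhausted,
# otherwise a tuple of (current item, next state).
Base.iterate(c::Countdown, state::Int = c.from) =
    state < 1 ? nothing : (state, state - 1)
Base.length(c::Countdown) = max(c.from, 0)
Base.eltype(::Type{Countdown}) = Int

first_step = iterate(Countdown(3))     # a Tuple{Int, Int}
done       = iterate(Countdown(3), 0)  # nothing
collected  = collect(Countdown(3))
```

Here both members of the union are plain bits, which is the case the comment says already has several supported variations. The scoped-value lookup differs in that its tuple wraps a heap reference.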