JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.62k stars 5.48k forks source link

Can `map!(identity, dest, src)` match performance of `copyto!(dest, src)` ? #36237

Open goretkin opened 4 years ago

goretkin commented 4 years ago
julia> using BenchmarkTools

julia> @benchmark copyto!(dest, src) evals=1000 setup=(dest=Array{Int64}(undef, 1000); src=fill(3, 1000))
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     77.721 ns (0.00% GC)
  median time:      79.532 ns (0.00% GC)
  mean time:        84.178 ns (0.00% GC)
  maximum time:     234.622 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> @benchmark map!(identity, dest, src) evals=1000 setup=(dest=Array{Int64}(undef, 1000); src=fill(3, 1000))
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     449.736 ns (0.00% GC)
  median time:      462.425 ns (0.00% GC)
  mean time:        492.667 ns (0.00% GC)
  maximum time:     1.491 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

Perhaps map!(::typeof(identity), dest, src) = copyto!(dest, src) should be defined.

Is it reasonable to desire that this performance fall out of some optimization?

jakobnissen commented 4 years ago

As far as I can tell, we could easily optimize map!, but maybe I'm missing something.

Here's the existing method called:

function map!(f::F, dest::AbstractArray, A::AbstractArray) where F
    for (i,j) in zip(eachindex(dest),eachindex(A))
        val = f(@inbounds A[j])
        @inbounds dest[i] = val
    end
    return dest
end

I don't understand why zip is used. It's slow, and furthermore, it's not clear to be if its behaviour is correct. If the source and destination has different indices, the copying still happens? We could use this one:

function map_new!(f, dst::AbstractArray, src::AbstractArray)
    @boundscheck checkbounds(dst, eachindex(src))
    @inbounds for i in eachindex(src)
        dst[i] = f(src[i])
    end
    return dst
end

Do note that that implementation errors if the dst array cannot fit the source elements, whereas current behaviour does not. However, the documentation states:

destination must be at least as large as the first collection.

So erroring here is probably OK (and indeed, not erroring may be a footgun!) On my machine for a 1,000 element integer array:

julia> @btime map!(identity, b, a);
  338.613 ns (0 allocations: 0 bytes)

julia> @btime map_new!(identity, b, a);
  70.396 ns (0 allocations: 0 bytes)

julia> @btime copyto!(b, a);
  78.095 ns (0 allocations: 0 bytes)
pablosanjose commented 4 years ago

According to the docstring, map! should work between arrays with different axes. Your map_new! function would beed to take into account this possibility, for example between a = rand(1:10, 1000) and b = OffsetArray(copy(a), 1001:2000). The zip in map! takes care of this possibility. That said, one could generalize the copyto! code to a map! and have copyto! call map!(identity,...) instead (the opposite of @goretkin's proposal in a way), like

copyto_new!(dest::AbstractArray{T1,N}, src::AbstractArray{T2,N}) where {T1,T2,N} = map_new!(identity, dest, src)

function map_new!(f, dest::AbstractArray{T1,N}, src::AbstractArray{T2,N}) where {T1,T2,N}
    src′ = unalias(dest, src)
    if axes(dest) == axes(src)
        for I in eachindex(IndexStyle(src′,dest), src′)
            @inbounds dest[I] = f(src′[I])
        end
    else
        isrc = eachindex(IndexLinear(), src)
        idest = eachindex(IndexLinear(), dest)
        ΔI = first(idest) - first(isrc)
        checkbounds(dest, last(isrc) + ΔI)
        for I in isrc
            @inbounds dest[I + ΔI] = f(src′[I])
        end
    end
    return dest
end

Then

julia> @btime map_new!($identity, $b, $a);
  64.406 ns (0 allocations: 0 bytes)

julia> @btime copyto_new!($b, $a);
  64.493 ns (0 allocations: 0 bytes)
pablosanjose commented 4 years ago

It would be very important to clarify that linear indexing is used for map! in the docstring. It is currently not clearly specified, so a user cannot know what map!(identity, array3x3, array2x2) does without trying. In copyto!'s docstring it is clearer, as it specifies that "The first n elements of dest are overwritten, the other elements are left untouched." (emphasis mine)

pablosanjose commented 4 years ago

(I just noticed that even with the code above there is still a funny performance asymmetry when a is an Array and b is an OffsetArray, but that is probably a different issue)

julia> @btime copyto!($b, $a);
  68.142 ns (0 allocations: 0 bytes)
julia> @btime copyto!($a, $b);
  405.325 ns (0 allocations: 0 bytes)