jishnub opened this issue 6 months ago (status: Open)
This also appears to be a regression from v1.10.1, where both operations are equally performant:
julia> a = rand(100,100); b = similar(a); av = view(a, axes(a)...); bv = view(b, axes(b)...); bv2 = view(b, UnitRange.(axes(b))...);
julia> @btime copyto!($bv, $av);
9.646 μs (0 allocations: 0 bytes)
julia> @btime copyto!($bv2, $av);
9.560 μs (0 allocations: 0 bytes)
julia> versioninfo()
Julia Version 1.10.1
Commit 7790d6f0641 (2024-02-13 20:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 8 × 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
LD_LIBRARY_PATH = :/usr/lib/x86_64-linux-gnu/gtk-3.0/modules
JULIA_EDITOR = subl
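For readers reproducing this, the only difference between the two destination views is the type of the indices they store. The following is a minimal sketch, reusing the setup from the report, that makes that difference visible with parentindices. The variable names idx_fast, idx_slow, same_after_fast and same_after_slow are illustrative and not part of the original report, and the sketch does not measure timings.

```julia
# Same setup as in the report above.
a = rand(100, 100)
b = similar(a)

av  = view(a, axes(a)...)               # indices are Base.OneTo ranges
bv  = view(b, axes(b)...)               # indices are Base.OneTo ranges
bv2 = view(b, UnitRange.(axes(b))...)   # indices are plain UnitRanges

# parentindices returns the tuple of indices that a view stores.
idx_fast = parentindices(bv)    # illustrative name
idx_slow = parentindices(bv2)   # illustrative name

println(typeof(idx_fast))
println(typeof(idx_slow))

# Both views span all of b, so either copy should produce the same data;
# only the index types (and hence the code path taken) differ.
copyto!(bv, av)
same_after_fast = (b == a)

fill!(b, 0)
copyto!(bv2, av)
same_after_slow = (b == a)
```

Since both copies write the same elements, any timing gap between them would come from how the indexing is compiled rather than from the amount of work done.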
Bisected the regression to #51760
commit f0a28e9a45a34f1c524b3cf02cbbac1351f1da81 (HEAD)
Author: Jameson Nash <jameson@juliacomputing.com>
Date: Tue Oct 24 16:07:08 2023 -0500
add unsetindex support to more copyto methods (#51760)
On df39cee3d8723d04b8ba73b13e59d45d195e8f0b,
julia> a = rand(100,100); b = similar(a); av = view(a, axes(a)...); bv2 = view(b, UnitRange.(axes(b))...);
julia> @btime copyto!($bv2, $av);
2.428 μs (0 allocations: 0 bytes)
whereas on f0a28e9a45:
julia> @btime copyto!($bv2, $av);
20.690 μs (0 allocations: 0 bytes)
Even with the revert of https://github.com/JuliaLang/julia/pull/51760 there still seems to be some slowdown:
1.10:
julia> @btime copyto!($bv, $av); # fast, indices are Base.OneTos
3.695 μs (0 allocations: 0 bytes)
julia> @btime copyto!($bv2, $av); # slow, indices are UnitRanges
3.522 μs (0 allocations: 0 bytes)
1.11:
julia> @btime copyto!($bv, $av); # fast, indices are Base.OneTos
1.158 μs (0 allocations: 0 bytes)
julia> @btime copyto!($bv2, $av); # slow, indices are UnitRanges
7.991 μs (0 allocations: 0 bytes)
But I am not sure it is bad enough to require a milestone...
Oddly, for me, the performance regression persists on a recent nightly:
julia> @btime copyto!($bv, $av); # fast, indices are Base.OneTos
2.375 μs (0 allocations: 0 bytes)
julia> @btime copyto!($bv2, $av); # slow, indices are UnitRanges
23.745 μs (0 allocations: 0 bytes)
julia> VERSION
v"1.12.0-DEV.560"
I also see a similar issue on the backports-release-1.11 branch. I'm unsure why the issue still persists, but perhaps we should add the milestone back.
julia> @btime copyto!($bv, $av);
2.597 μs (0 allocations: 0 bytes)
julia> @btime copyto!($bv2, $av);
24.499 μs (0 allocations: 0 bytes)
julia> versioninfo()
Julia Version 1.11.0-beta2.2
Commit 862f863e0f* (2024-05-29 10:49 UTC)
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 8 × 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, tigerlake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
LD_LIBRARY_PATH = :/usr/lib/x86_64-linux-gnu/gtk-3.0/modules
JULIA_EDITOR = subl
@KristofferC Did the performance improve for you after reverting the PR?
This performance difference appears to arise from a lack of vectorization in indexing. In the first case, the output of
contains
whereas
contains
In particular, if I add a @simd declaration, this appears to improve performance considerably:
Versioninfo: