Closed by albert-de-montserrat 3 months ago
The regression was introduced by #152
Here is an MWE that triggers the regression. It looks like #152 changed how indices are handled, and they are no longer captured efficiently by closures:
using ParallelStencil
using ParallelStencil.FiniteDifferences2D
using Chairmarks # provides the @b benchmarking macro used below
@init_parallel_stencil(Threads, Float64, 2)

# Reference kernel: the finite difference is written inline.
@parallel_indices (i,j) function foo1!(A::AbstractArray{T,2}, B::AbstractArray{T,2}) where T
    A[i, j] = B[i+1,j] - B[i,j]
    nothing
end

# Same computation, but through an inner function that captures i and j.
@parallel_indices (i,j) function foo2!(A::AbstractArray{T,2}, B::AbstractArray{T,2}) where T
    dx(B) = B[i+1,j] - B[i,j]
    A[i, j] = dx(B)
    nothing
end

n = 32
A = zeros(n, n)
B = zeros(n, n)
r = 1:n-1, 1:n-1 # index ranges, so that B[i+1,j] stays in bounds

@b @parallel $r foo1!($(A, B)...) # 2.489 μs (31 allocs: 4.031 KiB)
@b @parallel $r foo2!($(A, B)...) # 19.000 μs (3906 allocs: 64.578 KiB)
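For readers unfamiliar with why a captured index can be expensive, here is a minimal plain-Julia sketch. It is a generic illustration only and is not taken from ParallelStencil; whether #152 hits this exact mechanism is an assumption, not something confirmed in this thread. The function names `diff_captured!` and `diff_explicit!` are hypothetical. When a captured variable is reassigned, Julia typically stores it in a heap-allocated `Core.Box`, which makes the closure body type-unstable; passing the index as an argument avoids the capture.

```julia
# Generic illustration of closure capture cost; NOT ParallelStencil code.

# Variant 1: the inner function captures `i`, which is reassigned in the loop.
# A captured variable that is reassigned is typically stored in a Core.Box.
function diff_captured!(A, B)
    i = 0
    dx(B) = B[i+1] - B[i]      # `i` is captured from the enclosing scope
    while i < length(A)
        i += 1
        A[i] = dx(B)
    end
    return nothing
end

# Variant 2: the index is passed explicitly, so nothing is captured.
function diff_explicit!(A, B)
    dx(B, i) = B[i+1] - B[i]
    for i in eachindex(A)
        A[i] = dx(B, i)
    end
    return nothing
end

n = 32
B = rand(n)
A1 = zeros(n - 1)
A2 = zeros(n - 1)

# Warm-up calls so that compilation is not included in the measurement.
diff_captured!(A1, B)
diff_explicit!(A2, B)

println("captured: ", @allocated(diff_captured!(A1, B)), " bytes")
println("explicit: ", @allocated(diff_explicit!(A2, B)), " bytes")
println("same result: ", A1 == A2)
```

Both variants compute the same differences; only the allocation counts printed by `@allocated` are expected to differ, and the exact numbers depend on the Julia version.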
@albert-de-montserrat: sorry for the delay, I have been at JuliaCon and on vacation. I will try to fix this ASAP.
The pull request #155 fixes the issue:
julia> @belapsed @parallel $r foo1!($(A_ref, B)...) # before the fix: 2.489 μs (31 allocs: 4.031 KiB)
2.6308888888888887e-6
julia> @belapsed @parallel $r foo2!($(A, B)...) # before the fix: 19.000 μs (3906 allocs: 64.578 KiB)
2.6406666666666666e-6
julia> A_ref == A
true
And with Polyester we get:
julia> @belapsed @parallel $r foo1!($(A_ref, B)...) # before the fix (Threads): 2.489 μs (31 allocs: 4.031 KiB)
5.748633879781421e-7
julia> @belapsed @parallel $r foo2!($(A, B)...) # before the fix (Threads): 19.000 μs (3906 allocs: 64.578 KiB)
5.563548387096774e-7
julia> A_ref == A
true
Solved in v0.13.2
We are having a lot of performance regressions in JustRelax.jl after switching from v0.12.1 to v0.13.0. For starters, our CI went from 12 min to nearly 4 h. As a more concrete example, here is the first time step of the heat diffusion solve (3D, with a 32^2 resolution), using 4 threads in both cases:
1.595059 seconds (589.13 k allocations: 45.257 MiB, 13.13% gc time, 111.39% compilation time)
84.653106 seconds (4.27 G allocations: 147.336 GiB, 13.47% gc time, 3.19% compilation time)
What changed in 0.13? Did Polyester actually become the default?