Drvi opened this issue 7 months ago
This seems to have gotten even worse effect-wise on a two-day-old master, now tainting everything:
julia> foo(tf, args...) = sum(x->tf(args...), 1:100000000)
foo (generic function with 1 method)
julia> Base.infer_effects(foo, (typeof(^),Int,Int))
(!c,!e,!n,!t,!s,!m,!u)′
julia> versioninfo()
Julia Version 1.11.0-DEV.1456
Commit d54a4550cbe (2024-02-02 07:09 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: 24 × AMD Ryzen 9 7900X 12-Core Processor
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, znver4)
Threads: 23 default, 1 interactive, 11 GC (on 24 virtual cores)
Environment:
JULIA_PKG_USE_CLI_GIT = true
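To narrow down where the taint comes from, it can help to query the effects of the pieces separately. This is only a sketch: Base.infer_effects is an internal, unexported API and its output format varies between versions.

```julia
# Sketch: check whether the taint originates in the power call itself
# or in sum's closure/varargs plumbing. Internal API, subject to change.
Base.infer_effects(^, (Int, Int))                        # effects of the literal x^y call
Base.infer_effects(sum, (typeof(abs2), UnitRange{Int}))  # effects of a plain sum over a range
```

If the direct queries come back clean, the taint is being introduced by the `x->tf(args...)` closure path rather than by `^` itself.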
Surprisingly, though, the performance is better:
# master
julia> @btime foo($^, 10, 3)
161.148 ms (0 allocations: 0 bytes)
100000000000
# 1.10.0-beta3
julia> @btime foo($^, 10, 3)
174.397 ms (0 allocations: 0 bytes)
100000000000
I have tested the MWE above, and I can only reproduce the regression on an AMD machine, not on an Intel one.
My AMD machine:
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 64 × AMD EPYC 9374F 32-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 64 virtual cores)
My Intel machine:
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 32 × 13th Gen Intel(R) Core(TM) i9-13950HX
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, goldmont)
Threads: 1 default, 0 interactive, 1 GC (on 32 virtual cores)
I also get a regression on this case (but only on AMD):
using BenchmarkTools
const v=rand(10_000);
gt(x)=x>0.0
@btime maximum(Iterators.filter(gt, v))
Julia 1.9.4:
10.709 μs (1 allocation: 16 bytes)
Julia 1.10.2:
37.789 μs (1 allocation: 16 bytes)
In my case it is even worse on Julia 1.11.0-alpha1:
51.320 μs (1 allocation: 16 bytes)
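For what it's worth, a manual fold sidesteps the Iterators.filter + maximum path entirely. This is only a workaround sketch; the helper name is mine, and it assumes the regression lives in the generic reduction machinery rather than in the predicate itself.

```julia
# Hypothetical workaround: fold over the vector by hand instead of going
# through maximum(Iterators.filter(...)), avoiding the generic reduction
# machinery that appears to have regressed.
function max_where(pred, v)
    m = -Inf
    found = false
    for x in v
        if pred(x)
            found = true
            x > m && (m = x)
        end
    end
    found || throw(ArgumentError("no elements satisfy the predicate"))
    return m
end

max_where(gt, v)  # same result as maximum(Iterators.filter(gt, v))
```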
On a MacBook M2, the difference is even larger, about a 2x regression in 1.10: from 187.465 ms to 326.619 ms.
The inference difference:
#1.9
pairs(::NamedTuple{(), Tuple{}})::Core.Const(Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}()) (+c,+e,+n,+t,+s,+m,+i)
#1.10
pairs(::@NamedTuple{})::Core.Const(Base.Pairs{Symbol, Union{}, Tuple{}, @NamedTuple{}}()) (+c,+e,+n,+t,+s,!m,+i)
This is inside _sum(f, a, ::Colon; kw...).
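The difference can be reproduced directly by querying the effects of pairs on an empty NamedTuple, which is the call the kwargs path performs. A sketch, again using the internal Base.infer_effects:

```julia
# Sketch: (;) constructs an empty NamedTuple, so this is exactly the
# pairs call that the kwargs lowering in _sum hits. Per the listings
# above, 1.9 reports +m here while 1.10 reports !m.
Base.infer_effects(pairs, (typeof((;)),))
```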
We've just noticed a regression with this MRE:
foo(tf, args...) = sum(x->tf(args...), 1:100000000)
1.9.2:
1.10.0:
But the regression is also present at a very recent tip of the backports-release-1.10 branch. Note that
foo($+, 10, 3)
has the same performance on both 1.9 and 1.10.
Not sure if relevant, but the inferred effects seem different between 1.9 and 1.10:
1.9
1.10