Open BioTurboNick opened 1 year ago
Probably gonna need some code as well.
Found a minimal example. It appears to be when the length of an array is passed into a type parameter.
```julia
using BenchmarkTools
using StaticArrays

const xx = [1; 2]

@noinline function e(x)
    k = length(x)
    SVector{k}(x)
end

@btime e(xx)
# 1.9-rc1: 27.898 μs (31 allocations: 2.66 KiB)
# 1.8.5:   762.239 ns (10 allocations: 544 bytes)
```
Looks like #45062 ?
Linking my comment here: https://github.com/JuliaLang/julia/issues/48612#issuecomment-1484974814
TL;DR: if a performance regression is determined to be unavoidable for a particular release, could it be well documented, could users be warned when the affected pattern is detected, could a workaround be provided, or is some targeted patch possible to improve specific situations?
The aim is to reduce surprise and troubleshooting effort when users hit the regression, and to support the decision of whether to upgrade from 1.8 to 1.9.
An easy fix on StaticArrays.jl's side is replacing

```julia
@propagate_inbounds (::Type{SA})(a::AbstractArray) where {SA <: StaticArray} = convert(SA, a)
```

with

```julia
@propagate_inbounds (T::Type{<:StaticArray})(a::AbstractArray) = convert(T, a)
```
This makes sure we skip the unneeded runtime subtyping.
The workaround here seems to be `@noinline SVector{k}(x)`.
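A minimal, dependency-free sketch of that pattern, using `NTuple{k, Int}` as a hypothetical stand-in for `SVector{k}` so it runs without StaticArrays (call-site `@noinline` is available since Julia 1.8):

```julia
# `k` is only known at runtime, so the constructor's return type
# cannot be inferred and the call is dynamic.
function tuple_slow(x)
    k = length(x)
    NTuple{k, Int}(x)           # uninferrable: type parameter is a runtime value
end

# Workaround from this thread: annotate the call site with @noinline so
# the uninferrable call stays behind a single dynamic dispatch instead of
# inlining a body that is more expensive than the dispatch itself.
function tuple_workaround(x)
    k = length(x)
    @noinline NTuple{k, Int}(x)
end
```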
I guess the tradeoff here is whether to take a dynamic dispatch when calling the uninferrable `SVector{k}` constructor, or to inline it and risk the inlined body being more expensive than the dynamic dispatch would have been.
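For comparison, a common way to confine that dynamic dispatch is a function barrier: perform the uninferrable step once, then dispatch to a callee that is fully specialized on the type parameter. A generic sketch (not the StaticArrays code, and `specialized`/`barrier` are hypothetical names):

```julia
# `Val(length(x))` is the single dynamic step; `specialized` then
# compiles a fast, fully inferred method for each N it sees.
specialized(::Val{N}) where {N} = ntuple(i -> i, N)
barrier(x) = specialized(Val(length(x)))   # one dynamic dispatch here
```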
It seems the choice made here is not the best.
Yeah, we may need a heuristic fix for https://github.com/JuliaLang/julia/pull/45062, since right now this case is likely pretty far off in the inlining cost model.
With StaticArrays v1.5.19 on each:

| Julia version | Time | Allocations |
| --- | --- | --- |
| 1.9-rc1 | 27.898 μs | 31 (2.66 KiB) |
| 1.8.5 | 762.239 ns | 10 (544 bytes) |