maleadt closed this 5 months ago
A few people expressed disappointment at deprecating this library. If one of them can fix these issues, they're welcome to take over maintenance.
Otherwise, we should proceed on the deprecation plan.
Note that both RecursiveFactorization.jl and TriangularSolve.jl depend heavily on LV; with plain `@inbounds @fastmath` they perform many times worse, going from faster than BLAS for matrices below roughly 200x200 to many times slower.
Thus they could either be deprecated along with it, or rewritten to use explicit kernels.
Unfortunately, RFLU is still much faster than MKL below that ~200x200 point on many people's computers: https://github.com/SciML/LinearSolve.jl/issues/357
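To illustrate what "explicit kernels" means here, a hand-written loop can rely only on Base annotations rather than `@turbo`. This is a minimal sketch of my own (the function name and the axpy-style operation are illustrative, not taken from either library):

```julia
# Hypothetical sketch of an explicit kernel: an axpy-style update
# y .= a .* x .+ y written as a plain loop. @inbounds skips bounds
# checks and @simd permits vectorization of this dependence-free loop,
# without pulling in LoopVectorization.
function axpy_kernel!(y::AbstractVector{T}, a::T, x::AbstractVector{T}) where {T}
    length(y) == length(x) || throw(DimensionMismatch())
    @inbounds @simd for i in eachindex(y, x)
        y[i] = muladd(a, x[i], y[i])
    end
    return y
end
```

Base's `@simd` gives the compiler far less information than `@turbo` (no tiling, no unroll heuristics), which is why such kernels need to be written and blocked by hand to stay competitive.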
But we could replace the `@turbo` uses with manual kernels. There are only three things these libraries do with it:
This stack is a major contributor to compile times, and does not precompile well: it is a frequent source of bugs where precompiled code allocates, but recompiling it in a running REPL with Revise (by making meaningless changes) makes those allocations go away. At JuliaCon, I was told that the Clima people actually cited LoopVectorization.jl as the reason they're not using the SciML ecosystem, and I was asked if we could get rid of it. I'm not affiliated with them in any way that I know of, so I don't see any reason to appease them; I'm just pointing out that there are different perspectives on these tradeoffs.
I'm slightly tempted to try to take over maintenance, but I fear that if the maintenance burden has become unsustainable for you it'll just be that much more unsustainable for me.
What was the Clima folks' apparent objection, out of curiosity?
It really shouldn't take much effort.
My suggestion, if the problem is limited to the `Array` case because of the `Memory` change, is to add special casing for `Array` where they get turned into `PtrArray`s. You'd then also need to make sure they get `GC.@preserve`d.
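A minimal sketch of that special case, assuming StrideArraysCore's `PtrArray` constructor accepts an `Array` (the wrapper function here is hypothetical, not existing API):

```julia
using StrideArraysCore: PtrArray

# Hypothetical sketch: hand the kernel a PtrArray view of an Array so
# it never sees the Memory-backed Array internals introduced in 1.11.
# GC.@preserve keeps A rooted while the raw pointer is in use.
function with_ptrarray(f, A::Array)
    GC.@preserve A begin
        f(PtrArray(A))  # assumed to wrap A's pointer, size, and strides
    end
end
```

The `GC.@preserve` is essential: `PtrArray` holds a raw pointer, so without it the garbage collector is free to move or collect `A` mid-kernel.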
I believe the segfaults in the tests from VectorizedReductions.jl and NaNStatistics.jl are caused by reducing over empty collections, which leads to code along the lines of:

```julia
A = Float64[]
out = zero(eltype(A))
@turbo for i in eachindex(A)
    out += A[i]
end
```
LoopVectorization warns users not to do this, so the problem lies with VectorizedReductions and NaNStatistics (here it's the weighted version of each statistic that has problems; the unweighted statistics correctly add the `check_empty=true` flag).
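For reference, the documented workaround looks like this (`turbo_sum` is an illustrative name, not from either package):

```julia
using LoopVectorization

# With check_empty=true, @turbo emits an emptiness check before the
# loop body runs, so reducing over an empty collection returns the
# initial value instead of reading out of bounds.
function turbo_sum(A::AbstractVector{Float64})
    out = zero(eltype(A))
    @turbo check_empty=true for i in eachindex(A)
        out += A[i]
    end
    return out
end

turbo_sum(Float64[])  # safe: returns the initial value 0.0
```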
I have not verified it, but I suspect PlmDCA has a similar problem. A quick search of the code reveals that it checks for empty collections in some places, but not others.
The behavior of `check_empty=true/false` didn't change between early Julia versions and 1.10.
We could make `check_empty=true` the default.
Closing this because tests pass on 1.11, while `check_empty=false` should've also caused segfaults on older Julia versions: https://github.com/JuliaSIMD/LoopVectorization.jl/actions/runs/8946228352
LoopVectorization.jl's generated IR seems to cause segfaults on 1.11, as observed on PkgEval with at least 6 packages (MCPhylo.jl, LocalPoly.jl, VectorizedReduction.jl, NaNStatistics.jl, TimeSeriesClassification.jl, PlmDCA.jl). See this report for details: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/2cbecf4_vs_18b4f3f/report.html
@chriselrod I'm opening a new issue because https://github.com/JuliaSIMD/LoopVectorization.jl/issues/518 was closed, and to list all issues in case somebody wants to tackle this.
Some of the errors that I've encountered:
- An LLVM assertion, as seen with MCPhylo.jl (requires an assertions build of Julia):
- A segfault during `vload`, as seen with NaNStatistics.jl and PlmDCA.jl:
- A segfault during `vadd_fast`, as seen with VectorizedReductions.jl:

The source of the bad IR hasn't been fully determined yet, but it seems to be the `Expr(:new)` that's generated to pass structs by value instead of by reference: https://github.com/JuliaLang/julia/issues/52702#issuecomment-1874492883.

Deprecating LoopVectorization.jl isn't possible, because `@turbo` changes semantics: https://github.com/JuliaSIMD/LoopVectorization.jl/pull/523#issuecomment-1884883071

So the only way forward seems to be fixing LoopVectorization.jl. I've taken a first attempt at it in https://github.com/JuliaSIMD/LoopVectorization.jl/pull/523, but just removing the `Expr(:new)` optimization isn't sufficient, and there are other issues (see above).