acroy / Expokit.jl

Julia implementation of EXPOKIT routines
Other
26 stars 11 forks source link

Avoid \ for non-allocating #20

Open ChrisRackauckas opened 6 years ago

ChrisRackauckas commented 6 years ago

When trying to avoid allocations, note that \ actually allocates. You will instead want to use A_ldiv_B! for the non-allocating form.

acroy commented 6 years ago

This refers to chbv! (e.g. here) and padm (e.g. here).

mforets commented 6 years ago

A_ldiv_B! requires that the matrix A is factorized; these are special types depending on the properties of A. Take for example (chbv.jl L92):

for i = 1:2*p
    w .+= (A - t[i]*I) \ (a[i] * vec)
end

Here A - t[i]*I changes inside the loop. I wonder if this is still a case where A_ldiv_B! gives more performance? Since A_ldiv_B!(y, factorize(A - t[i]*I), a[i] * vec) requires calling factorize for each i.

Note that A = factorize(A) and Di = Diagonal(fill(t[i], n)) are valid factorizations, although A - Di is not.

ChrisRackauckas commented 6 years ago

Here A - t[i]I changes inside the loop. I wonder if this is still a case where A_ldiv_B! gives more performance? Since A_ldiv_B!(y, factorize(A - t[i]I), a[i] * vec) requires calling factorize for each i.

Well two things here. One \ is internally calling factorize and doing this, so you'd just be going lower here. A - t[i]*I can be done inplace into a cache matrix. Then you can reduce an allocation here by doing factorize!(A - t[i]*I). Then make a[i] * vec in place into some cache vector. So that will get rid of two matrix allocations and one vector allocation (matrix because Julia cannot inplace factorize by default without your consent!). Whether that's enough to make a performance difference depends on the type of matrix and the size, among other things. Profiling would have to be done to measure how much it matters, but matrix allocations in an inner loop could be hurting and causing GC calls.

Next I think factorize should be a default arg, with allowing the user to give a choice for the linear solver method. Note that factorize is actually not type-stable, but the instability is contained because the result of A_ldiv_B! is ignored. But allowing the user to set it could speed up calculations on small matrices, and would let people do crazy things.

acroy commented 6 years ago

I tried this yesterday:

function chbv2!{T<:Real}(w::Vector{T}, A, vec::Vector{T})
    p = min(length(θ), length(α))
    scale!(copy!(w, vec), α0)
    tmp_v = similar(vec, Complex128)
    B = A - θ[1]*I  # make copy of A with diagonal elements

    @inbounds for i = 1:p
        scale!(copy!(tmp_v, vec), α[i])
        for n=1:size(B,1)
            B[n,n] = A[n,n] - θ[i]
        end
        A_ldiv_B!( factorize(B), tmp_v)
        w .+= real( tmp_v )
    end
    return w
end

and the number of allocations is basically reduced by a factor 2.

I thought factorize! was deprecated (julia#5526) though?

ChrisRackauckas commented 6 years ago

I thought factorize! was deprecated (julia#5526) though?

Then I would just get it to lufact! and have it be an option for the user to change. The matrix has to be square, right? So it should usually be the one that makes sense.

w .+= real( tmp_v )

you'll want that as

w .+= real.( tmp_v )

or

@. w == real(tmp_v)
acroy commented 6 years ago

lufact! only works for dense arrays, right? cholfact! requires symmetric/hermitian matrices. If we use an optional argument to provide an appropriate factorize! version, we probably have to default to factorize.