JuliaSmoothOptimizers / LimitedLDLFactorizations.jl

Limited-Memory Factorization of Symmetric Matrices
Other
21 stars 10 forks source link

Solve systems with multiple rhs #81

Closed geoffroyleconte closed 1 year ago

geoffroyleconte commented 1 year ago

Hello @mzy2240, does this fix your issue?

@dpo I copy-pasted some code that was in LDLFactorizations.jl for multiple rhs.

codecov[bot] commented 1 year ago

Codecov Report

All modified lines are covered by tests :white_check_mark:

Files Coverage Δ
src/LimitedLDLFactorizations.jl 89.81% <100.00%> (+0.89%) :arrow_up:

:loudspeaker: Thoughts on this report? Let us know!.

mzy2240 commented 1 year ago

That's awesome! Thank you!

I tried lldl(A; memory=size(A,1)) to give an exact factorization, and then \ B does not give the same result as A \ B. Did I mis-understand anything here?

geoffroyleconte commented 1 year ago

Is the norm of the difference between the solution with lldl and \ high? If so can you send me a basic working example so that I can reproduce the issue?

mzy2240 commented 1 year ago

Ok the original PR gives accurate result after re-testing. However, I did a few tweaks to make it multi-threading, which surprisingly gives totally different result. See below:

function lldl_lsolve!(n, X::AbstractMatrix, Lp, Li, Lx)
  Threads.@threads for j = 1:n
    @inbounds for p = Lp[j]:(Lp[j + 1] - 1)
      @simd for k ∈ axes(X, 2)
        X[Li[p], k] -= Lx[p] * X[j, k]
      end
    end
  end
  return X
end

function lldl_dsolve!(n, X::AbstractMatrix, D)
  Threads.@threads for j = 1:n
    @simd for k ∈ axes(X, 2)
      X[j, k] /= D[j]
    end
  end
  return X
end

function lldl_ltsolve!(n, X::AbstractMatrix, Lp, Li, Lx)
  Threads.@threads for j = n:-1:1
    @inbounds for p = Lp[j]:(Lp[j + 1] - 1)
      @simd for k ∈ axes(X, 2)
        X[j, k] -= conj(Lx[p]) * X[Li[p], k]
      end
    end
  end
  return X
end

function lldl_solve!(n, B::AbstractMatrix, Lp, Li, Lx, D, P)
  @views Y = B[P, :]
  lldl_lsolve!(n, Y, Lp, Li, Lx)
  lldl_dsolve!(n, Y, D)
  lldl_ltsolve!(n, Y, Lp, Li, Lx)
  return B
end

Any idea why it gives wrong result?

mzy2240 commented 1 year ago

I think I found the reason. lldl_lsolve! and lldl_ltsolve! are not embarrassingly-parallelable. Any idea how to make it more thread-friendly?

dpo commented 1 year ago

Thanks @geoffroyleconte. There could also be methods for right-hand sides that are contiguous in memory, so that you can use x[j, :] instead of a for loop over the second axis.

geoffroyleconte commented 1 year ago

@mzy2240 maybe you are modifying some elements in X simultaneously, in which case you would need to use Threads.Atomic?

I'm not an expert in this subject, @amontoison do you know what could be wrong here?

amontoison commented 1 year ago

You should permute the for loops if you want to use Threads.@thread. It's normal that you have a wrong result, you can't perform backward and forward sweeps in any order.

But you do a complete backward and forward sweep of the columns of B on different threads.

amontoison commented 1 year ago

@geoffroyleconte Can we merge this PR?

amontoison commented 1 year ago

@geoffroyleconte Can you rebase your branch?

amontoison commented 1 year ago

Thanks @geoffroyleconte!