JuliaLinearAlgebra / IterativeSolvers.jl

Iterative algorithms for solving linear systems, eigensystems, and singular value problems
MIT License

Preconditioner: P^{-1} and not P #71

Closed zimoun closed 8 years ago

zimoun commented 8 years ago

In some applications (e.g., the Boundary Element Method, BEM), it is possible to directly build an approximation of the inverse, and then Pl\(A*(Pr\x)) (e.g., l.81 in gmres.jl) does not make sense.

For example, the classical operator preconditioner of the integral single layer operator V is the integral double layer operator W, i.e., the preconditioned system reads W*V*x = W*b. Both are dense operators. Therefore, it is not possible to use this version of gmres, since one would have to provide inv(W), which is clearly not accessible.

It seems more coherent to pass a linear operator as the preconditioner, such that Ml*(A*(Mr*x)) approximates the identity, and not Pl\(A*(Pr\x)). Otherwise it is confusing.

The user should be free to evaluate the result of P\ however they want, and should provide only that evaluation, not P itself. The use of \ is restrictive.
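
To make the mismatch concrete, here is a minimal sketch with random dense matrices standing in for the BEM operators (these are stand-ins, not actual integral operators):

V = rand(100, 100)   # single layer operator (the system matrix)
W = rand(100, 100)   # double layer operator, playing the role of the inverse
b = rand(100)

# what the user can evaluate cheaply: the preconditioned product
y = W * (V * b)

# what the Pl\ convention requires instead: a solve with Pl, which here
# would mean having inv(W), exactly what is not accessible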

andreasnoack commented 8 years ago

You are free to define \ as you'd like for your preconditioner, so you are not restricted. E.g., something like:

immutable InvIntegralDouble
    invMat::Matrix{Float64}   # explicitly stored approximate inverse
end
import Base: *, \
# guard against accidental use of * on the wrapper
(*)(A::InvIntegralDouble, x::Vector) = error("you don't wanna do this")
# the solver's P \ x then applies the stored inverse as a product
(\)(A::InvIntegralDouble, x::Vector) = A.invMat*x
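
For concreteness, a hypothetical usage sketch (the explicit inverse here is just a random matrix standing in for a real approximation):

Winv = rand(4, 4)             # stand-in for the explicitly built approximate inverse
P = InvIntegralDouble(Winv)
x = rand(4)
P \ x                         # evaluates Winv*x, which is what the solver calls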
zimoun commented 8 years ago

Yes, I agree; we can work around it with the nice type system.

However, from my point of view, it is confusing.

Moreover, to my knowledge, the classical preconditioners are defined by a matrix-vector product * that is already the inverse, not by something that then needs to be inverted with \. I have in mind all the stationary methods, incomplete LU factorizations, operator preconditioning, etc. They all provide the preconditioned matrix-vector product, not a linear operator to which a full solve is then applied.

Therefore, from my point of view, Ml*(A*(Mr*x)) should be the default, because it is more common and clearer than Pl\(A*(Pr\x)).
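
As an illustration of the two conventions, here is a sketch with a Jacobi (diagonal) preconditioner, chosen only because its inverse is trivial to apply (written in current Julia syntax):

using LinearAlgebra   # in 0.4-era Julia these names lived in Base

A = rand(10, 10) + 10I   # diagonally dominant, so Jacobi is reasonable
x = rand(10)
d = diag(A)

# current convention: pass M ≈ A, the solver applies M \ v internally
M  = Diagonal(d)
y1 = M \ (A * x)

# proposed convention: pass the inverse action directly, applied with *
Minv = Diagonal(1 ./ d)
y2   = Minv * (A * x)    # y1 ≈ y2 up to round-off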

cortner commented 8 years ago

For what it's worth, I prefer to think of the preconditioner as an approximation to the operator and not to the inverse of the operator.

If there is significant pull in either direction, then one can just avoid * and \ and use descriptive names, as Tim Holy's implementation of preconditioning in Optim.jl does.

cortner commented 8 years ago

Related issue: Optim.jl #188. It would be nice to implement preconditioning in a consistent way across IterativeSolvers and Optim, and possibly other use cases. I would be keen to contribute to this.

zimoun commented 8 years ago

I agree: "Creating a layer of abstraction and fixing the terminology a bit could fix some of this."

My point here was initially to fix the terminology. Since the terminology seems to depend on the background one comes from, I am on the same page: let us propose a common abstraction and use descriptive names.

cortner commented 8 years ago

The longer I think about it: to get this right the first time round, I'd like to take a poll among some colleagues who are experts on preconditioning and ask their opinion. Aside from the P ≈ A vs. P ≈ A^{-1} issue, is there anything else one should ask?

zimoun commented 8 years ago

I do not know if I am considered an expert on preconditioning, but I have done some work in the area.

My view: let P be an operator such that P*A or A*P converges faster than A alone. The key point is that the matrix-vector product with P must be as cheap as possible, and P*A is the preconditioned system. One may write it $M^{-1} A$, but the meaning of ^{-1}, i.e., how it is evaluated, is part of the preconditioning scheme. Evaluating ^{-1} via \ fixes one method by default (roughly speaking, a full LU factorization), which is confusing.

To be more concrete:

In both cases (left or right preconditioning), the provided object should implement the matrix-vector product, because that is what the solver requires. If an evaluation of ^{-1} is needed, it should be done outside the solver, by whatever method the user chooses, because it is part of the preconditioner.

An abstract type would be useful to fix the terminology and to provide easy and transparent mechanisms, such as those briefly presented by @andreasnoack.

It would be nice if you shared your colleagues' opinions. :-)

cortner commented 8 years ago

More later, but just briefly: as mentioned before, * and \ can be given any meaning you want, so it is not the restriction you assume it is.

zimoun commented 8 years ago

I know, as I said before: the type system is nice. That is not my point.

I just find it totally confusing to provide P and overload \ (because \ is hard-coded in the solver) when what I have actually built is the matrix-vector product *. I have no preconditioner in mind that uses a full \; all the preconditioners I know provide algorithms to compute a matrix-vector product.

To be concrete, I find the behavior of SciPy much clearer than the behavior of Matlab.

I am sorry if my points are not clear enough, but I do not know how to explain them better. From my point of view, mimicking the behavior of Matlab is a bad decision, and I have tried to explain why (I have not yet read arguments for the current behavior). I do understand the type system, and by the same logic we could overload + instead (I'm kidding ;-), but that seems even more confusing. Maybe I am wrong.

In the end, I am not wedded to my views. I only opened this issue to clarify, document, and, as you said, add coherence with other parts. Right now, the behavior of gmres is not what I would expect, so the question is: how to fix or improve it? Regardless of whether I disagree with the choice. :-)

cortner commented 8 years ago

So, all the opinions I sampled agree, and unfortunately they agree with the current implementation, and with my own intuition as well. Andy Wathen (author of IFISS) put it very nicely:

Question: Suppose A is a matrix which you precondition with P = M^{-1}, so that P*A = M^{-1}*A is "better behaved" than A was. Which do you call the preconditioner: P or M?

Andy's response: definitely M. The point is that you can conceive of M being an approximation for A, but it is harder to think of P as some kind of approximation for A^{-1} which you don't generally know so much about.

See also his Acta Numerica review: Wathen, A. J., 2015, 'Preconditioning', Acta Numerica 24, pp. 329-376.

So my proposal now is not to change anything in the current notation (at least for now). That shouldn't stop us from discussing a more descriptive terminology, but that should probably be a separate issue.

cortner commented 8 years ago

P.S.: so I suggest to close this.

zimoun commented 8 years ago

I really do not understand. I know this paper, and there is also the book by Saad (the creator of GMRES); see there for the book, in particular ch. 9, 10, and 12.

With all due respect, I think this point of view is incomplete: what one wants as a "preconditioning scheme" is an efficient way to apply M^{-1}, not merely an approximation M of A. Call M the preconditioner, I agree. But my point is: P should be provided to the solver, not M.

I will not dive into a trolling discussion: I think it is a bad choice, and the SciPy one is better.

i) If you read the paper you cited and use an incomplete factorization (sec. 3.2), that means you have A = L*L' + E and you pass L and L' as left/right preconditioners. Everything works with the current behavior.

ii) However, if you look at section 3.5 of the paper: "To be consistent with the nomenclature in this article, P = R^{-1} is the preconditioner here; solution of systems with P is, of course, simply multiplication by R in this situation."

This is the issue I am trying to talk about. From my point of view, it leads to confusion about what the solver requires. [note: to my knowledge, more preconditioning schemes work as in ii) than as in i)]

Current situation: i) L = incomplete_cholesky(A), then solver(A, b, L, L'). [note: incomplete_cholesky is a variation of algo. 10.2, p. 305, of Saad's book]

ii) P = spai(A), then work around the type system to overload \, and then solver(A, b, Poverloaded). [note: spai is given by algo. 10.12, p. 340, of Saad's book]

I find ii) confusing, because people will call solver(A, b, P), which will not work. Then they will look at the solver, see M\(A*x), and think: oh, there is a full solve here, so let's just replace this line by M*(A*x). They will never play with the type system, especially if they come from Matlab. Maybe "people" is only me, and I am wrong.

My proposed solution: provide a matrix-vector product as the preconditioning scheme.

The examples become:

function solver(A, b, P)
  # stuff
  y = P(A*x)   # P is a user-supplied function applying the preconditioner
  # stuff
end

i)

L = incomplete_cholesky(A)   # variation of algo. 10.2, p. 305, of Saad's book
function iLLmv(x)
  # apply (L*L')^{-1} via two triangular solves
  return L'\(L\x)
end
solver(A, b, iLLmv)

ii)

P = spai(A)   # sparse approximate inverse, algo. 10.12, p. 340, of Saad's book
function Pmv(x)
  # P already approximates the inverse: just multiply
  return P*x
end
solver(A, b, Pmv)
cortner commented 8 years ago

Julia aims to achieve a notation that is descriptive. If we agree to call M ≈ A the preconditioner, then it doesn't matter how it is stored: you want to use the notation M \ A and M * A, as is done at the moment. Everything else is a discussion about what goes on behind the scenes. It just needs to be ensured that it is well documented what is meant by a preconditioner and how it is applied in which context.

zimoun commented 8 years ago

Yes, and my point is that the notation is not very descriptive. If you are not even convinced by the quotation from the paper you cited, that is sad for me. :-)

I have now been working with preconditioners for 10 years, in FEM and BEM contexts. You can find a question I asked on the Matlab help forum when I was a young master's student (see there). The Matlab behavior is confusing, and Julia is reproducing it.

The confusion comes from the notation and the name.

If you look at the paper you cited, or at Saad's book, all the preconditioning techniques somehow provide an algorithm to evaluate the matrix-vector product with M^{-1}.

If we examine the iterative solvers, they never need the matrix-vector product with M, but always the one with M^{-1}.

Providing M, which in general is never explicitly built, means that the evaluation of the matrix-vector product with M^{-1} is done by \ inside the solver (somewhat hidden). In most cases, this means playing with the type system so that \ calls the matrix-vector function we have implemented (as the preconditioning scheme).

I maintain my points: the current behavior is not descriptive, natural, or explicit; it is confusing.

Well, I am repeating myself, so I am closing the issue.

I will patch gmres (changing just 3 lines) and do my own thing. I hope enough manpower will be found to write the documentation and explain how to work with the type system efficiently.

mikehui commented 7 years ago

I came to this page when I encountered a problem with preconditioning. Now I understand what to do. Matlab has it both ways: if a matrix is given as a preconditioner, it is used with backslash (\); if an operator is given for preconditioning, it is used as *. This is very convenient. Why not add this to Julia?
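
For what it's worth, a sketch of how multiple dispatch could mimic that dual behavior (apply_prec is a hypothetical helper, not part of IterativeSolvers.jl):

# hypothetical helper, not the package API
apply_prec(P::AbstractMatrix, x::AbstractVector) = P \ x   # matrix given: solve with it
apply_prec(P::Function, x::AbstractVector) = P(x)          # operator given: apply it

# a solver would then call, e.g.,
# y = apply_prec(Pl, A*x)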