ddemidov / amgcl

C++ library for solving large sparse linear systems with algebraic multigrid method
http://amgcl.readthedocs.org/
MIT License
733 stars 112 forks

Block preconditioners #37

Closed Andlon closed 7 years ago

Andlon commented 7 years ago

Hi,

I'm looking to solve a linear system which is algebraically similar to the Stokes problem, in the sense that I have a matrix in the following block form:

[ A     B ]
[ B^T   0 ]

with A symmetric positive definite, and B has full row rank.

I am trying to find an efficient way to solve this system, and I hope to adapt techniques used for the Stokes problem, as seen towards the end of this lecture by Wolfgang Bangerth. In the video, Bangerth advocates using the following (right) block preconditioner in conjunction with GMRES:

[ A^-1    -A^-1 B^T S^-1 ]
[ 0            S^-1      ]

where S = B^T A^-1 B. Of course, one does not know the exact inverses of A and S, so one would like to approximate these. For the Stokes problem, one can approximate S^-1 by the inverse of the mass matrix (but for my problem I need to use a different matrix that I hope is somewhat similar), and Bangerth suggests using "one step of multigrid" to approximate the action of applying A^-1.
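Abstracting away the libraries involved, the action of this right preconditioner on a block residual (r1, r2) can be sketched in plain C++. The `solveA` and `solveS` callables below are hypothetical stand-ins for the approximate applications of A^-1 (e.g. one AMG cycle) and S^-1:

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Solve = std::function<Vec(const Vec&)>;

// y = M x for a small dense matrix M stored as rows.
Vec matvec(const std::vector<Vec>& M, const Vec& x) {
    Vec y(M.size(), 0.0);
    for (std::size_t i = 0; i < M.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            y[i] += M[i][j] * x[j];
    return y;
}

// Apply the block preconditioner
//   [ A^-1   -A^-1 B^T S^-1 ]
//   [ 0            S^-1     ]
// to the residual (r1, r2). solveA/solveS stand in for
// (approximate) applications of A^-1 and S^-1.
std::pair<Vec, Vec> apply_precond(
        const std::vector<Vec>& Bt, Solve solveA, Solve solveS,
        const Vec& r1, const Vec& r2) {
    Vec y2 = solveS(r2);          // y2 = S^-1 r2
    Vec t  = matvec(Bt, y2);      // t  = B^T S^-1 r2
    Vec d(r1.size());
    for (std::size_t i = 0; i < r1.size(); ++i)
        d[i] = r1[i] - t[i];      // d = r1 - B^T S^-1 r2
    Vec y1 = solveA(d);           // y1 = A^-1 (r1 - B^T S^-1 r2)
    return {y1, y2};
}
```

Note that only the *actions* of A^-1 and S^-1 are needed, never the matrices themselves, which is what makes the "one multigrid cycle" approximation fit naturally here.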

I've been poking around the docs and the source code, and I think it should be possible to accomplish this kind of block-preconditioning with amgcl. Is this correct? I see that there's a block_preconditioner.hpp related to MPI, and also a schur_pressure_correction.hpp, both of which might be useful for me to understand how to assemble this, but I find it a little hard to decipher what's going on, especially since I'm not familiar with Schur pressure correction.

In particular, can I perform "one step" of AMG for the approximation of A^-1? I'm struggling to see how to assemble the pieces, and any suggestions (or confirmation on whether or not I can do what I want) would be most welcome!

P.S.: I realize this is not truly an "issue", but I could not find any other place to ask questions of the amgcl user community. I hope this is OK.

ddemidov commented 7 years ago

I think what you are looking for is the schur_pressure_correction preconditioner. It forms the block submatrices based on your classification of equations. You need to set the pmask vector in the parameters, which for each equation contains either zero or one (one for 'pressure' equations, zero for 'flow' equations). You also need to define solvers for the 'pressure' and 'flow' parts of the system as template parameters. See the example in examples/schur_pressure_correction.cpp.

Feel free to ask further questions, if this explanation is not clear enough (I am afraid it is not).
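To illustrate the pmask idea only (the helper below is hypothetical, not part of amgcl): the mask is just a per-equation flag vector. For an interleaved ordering where every block of equations ends with one 'pressure' equation, it could be built like this:

```cpp
#include <cassert>
#include <vector>

// Build a pressure mask for a system of n equations in which every
// block of `block_size` equations ends with one pressure equation
// (an assumed interleaved ordering; adapt to your own numbering).
std::vector<char> make_pmask(std::size_t n, std::size_t block_size) {
    std::vector<char> pmask(n, 0);
    for (std::size_t i = block_size - 1; i < n; i += block_size)
        pmask[i] = 1; // 1 = 'pressure' equation, 0 = 'flow' equation
    return pmask;
}
```

The preconditioner then uses the mask to split the monolithic matrix into the A, B, and B^T blocks internally.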

Andlon commented 7 years ago

Thanks for the very swift reply!

If I understand you correctly, the schur_pressure_correction preconditioner assembles each block for me. In my case, the blocks A and B are already constructed submatrices of much larger matrices, arising from a constrained elliptic formulation rather than fluid dynamics. In fact, my problem has no relation to fluid dynamics or the Stokes problem I mentioned, other than having an algebraically similar structure.

So, given that I already have obtained matrices A and B, I suppose I cannot use schur_pressure_correction directly, but must rather build my own block preconditioner similarly to the code used in schur_pressure_correction?

Andlon commented 7 years ago

Oh, and A is exactly equivalent to a P1 finite element stiffness matrix for the Laplace problem in my case, which is why I want to use AMG to approximate A^-1.

ddemidov commented 7 years ago

but must rather build my own block preconditioner similarly to the code used in schur_pressure_correction?

I think that is the case. If you look at the schur_pressure_correction::init method, it extracts the submatrices explicitly, and then creates solvers for the diagonal blocks. I guess you could reuse most of the class, or even introduce an alternative constructor that takes the preassembled blocks.

Note that you will still need the whole matrix for the Krylov solver (though it could easily be replaced with a matrix-free spmv operator).

Andlon commented 7 years ago

Thanks for the pointers! I will dig into it and see if I can come up with something.

Andlon commented 7 years ago

Just a quick followup: I've been trying out some things, and I noticed that LGMRES sometimes breaks down with the following exception: "NaNs encountered in LGMRES". I took a look at the code, but couldn't see anything that should provoke this behavior. GMRES does not seem to break down for the same examples. I'm not really familiar with LGMRES; is there any reason LGMRES should fail when GMRES doesn't?

Also, I need to use a right preconditioner, so I suppose I would have to use LGMRES?

It's a little difficult to construct a minimal self-contained test case for the LGMRES breakdowns, but I can give it a try if you want!

ddemidov commented 7 years ago

There was an unrelated issue recently, which led to #38. Can you check whether lgmres from that branch works better for you?

To expand a bit: lgmres and fgmres used Householder reflections for orthogonalization, while gmres uses Givens rotations. #38 rewrites lgmres in terms of Givens rotations.

Also, I need to use a right preconditioner, so I suppose I would have to use LGMRES?

You can use either lgmres or fgmres; they both use right preconditioning (and lgmres may switch to left preconditioning, see its pside parameter).
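For context, a Givens rotation eliminates one subdiagonal entry of the Hessenberg matrix at a time during the GMRES least-squares update. A minimal standalone sketch of computing and applying one rotation (not amgcl's actual implementation):

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Compute the Givens rotation (c, s) such that
//   [  c  s ] [ a ]   [ r ]
//   [ -s  c ] [ b ] = [ 0 ]
std::pair<double, double> givens(double a, double b) {
    double r = std::hypot(a, b);
    if (r == 0.0) return {1.0, 0.0}; // identity rotation for the zero vector
    return {a / r, b / r};
}

// Apply the rotation to the pair (a, b); the second component vanishes.
std::pair<double, double> apply_givens(double c, double s, double a, double b) {
    return {c * a + s * b, -s * a + c * b};
}
```

Because each rotation touches only two rows, GMRES can update the QR factorization of the Hessenberg matrix incrementally, one new column per iteration, which tends to be more numerically transparent than batched Householder orthogonalization.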

Andlon commented 7 years ago

Yes, thank you, #38 seems to work a lot better! I haven't finished my preconditioner yet (specializations gave me a headache for a while, but I think I've got it now), so the end result remains to be seen, but I'm not getting any NaNs at least.

Andlon commented 7 years ago

Just a heads-up:

I recently checked out master after having used the (now gone) #38 branch, and I had problems with LGMRES not converging again. The following restored the convergence:

params.solver.always_reset = true;
params.solver.store_Av = false;

(I really needed both, simply changing one was not sufficient)

Unfortunately I have no simple way to reduce this to a small example... It's really rather involved.

ddemidov commented 7 years ago

I can understand why always_reset=false could break things, but the value of store_Av should not really matter. Can you confirm the problem persists with always_reset=true/store_Av=true?

Does 4db78b65281db89f49d2bec3d3c6481fdb0319e8 work for you? If yes, could you please try to git bisect between it and master?

Alternatively, you could save the matrices and rhs vectors with amgcl::io::mm_write. I assume you are solving several systems in a row, otherwise always_reset should not matter either? So to reproduce this I think I'll need two problems: the one that is solved successfully, and the next one, which fails with store_Av=true.
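For reference, the MatrixMarket coordinate format used by amgcl::io::mm_write is simple enough to emit by hand for a CRS matrix; a hypothetical standalone sketch (not amgcl's implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Serialize a CRS matrix (ptr/col/val) in MatrixMarket coordinate
// format. MatrixMarket indices are 1-based.
std::string mm_coordinate(std::size_t nrows, std::size_t ncols,
        const std::vector<std::size_t>& ptr,
        const std::vector<std::size_t>& col,
        const std::vector<double>& val) {
    std::ostringstream out;
    out << "%%MatrixMarket matrix coordinate real general\n";
    out << nrows << " " << ncols << " " << val.size() << "\n";
    for (std::size_t i = 0; i < nrows; ++i)
        for (std::size_t j = ptr[i]; j < ptr[i + 1]; ++j)
            out << i + 1 << " " << col[j] + 1 << " " << val[j] << "\n";
    return out.str();
}
```

This only helps once the operator exists as an assembled sparse matrix, of course; a purely matrix-free block operator has nothing to serialize.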

Andlon commented 7 years ago

Results on master (4c258b213ae0455e6c5dddafd174f4dcc32b3c71):

always_reset = true; store_Av = false; => Success

always_reset = true; store_Av = true; => Failure

I also tried on 4db78b6 as you suggested, but with the same behavior.

I also realized that I haven't been very precise in my earlier description. Strictly speaking, I haven't checked that LGMRES doesn't converge; I've merely observed that certain properties that I expect to hold for my systems (I check these with randomized property-based tests) no longer hold with store_Av = true, which indicates that LGMRES doesn't converge to what I would expect.

Also, my system is a block-matrix and implemented as such (that is, I implement custom matrix-vector products), and the preconditioner is also similarly implemented, so I cannot simply store them in matrix market format...

In any case, it is actually possible that what I'm seeing is not due to LGMRES, but my own implementations. I.e. if I've implemented matrix-vector products, residuals, or my preconditioner wrong.

I'll investigate further!

Andlon commented 7 years ago

And yes, I use the same system to solve multiple right-hand sides. 3, in fact.

Just to provide some context to what I do, here's an approximate high-level description:

for every triangle in triangulation:
    - create a "submesh" by forming a "patch" consisting of the current
      triangle and nearby triangles which are reachable by at most m edges
    - with each patch, there's the following block system associated with it:
              [ A    B^T ]
              [ B     0  ]
      with A an n x n matrix, and B an m x n matrix
    - for each local vertex in the triangle, solve the above system with
      LGMRES for a different RHS, using the *right* preconditioner
              [ A^{-1}       A^{-1} B^T S^{-1} ]
              [    0               - S^{-1}    ]
      where A^{-1} is approximated by AMG, and S^{-1} by the application
      of some other matrix

In my automated test suite, I generate random triangulations and verify that the solutions x_i for i = 1, ..., num_dof all satisfy certain properties, such as B x_i = 0, and a more important property which verifies that the solutions are a-orthogonal to the null space of B. That is, if N is a basis of the null space of B and W is defined such that W_ij = (x_i)_j (i.e., stacking the x_i as row vectors in a matrix), then I expect the following property to hold:

(V - W) A N = 0

Here V is a different matrix that corresponds to the projected weights of a standard FEM lagrange basis from a quasi-uniform mesh into a refined mesh.

The fact that the aforementioned properties don't hold with store_Av = true is my indication that something is wrong. With some effort, I can write a test case that looks at the solution of a single system (i.e. for a single triangle), but since the system matrix and preconditioner are defined in terms of custom block matrices, I cannot make a self-contained example very easily...
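The first of these properties, B x_i = 0, amounts to a scaled residual check; a small standalone sketch of such a property test (dense B for illustration only, not the custom block types mentioned above):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;

// Check that ||B x|| <= tol * ||x|| for a dense matrix B (one Vec
// per row), i.e. that x lies numerically in the null space of B.
bool in_nullspace(const std::vector<Vec>& B, const Vec& x, double tol) {
    double bx2 = 0.0, x2 = 0.0;
    for (double xi : x) x2 += xi * xi;
    for (const Vec& row : B) {
        double s = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) s += row[j] * x[j];
        bx2 += s * s;
    }
    return std::sqrt(bx2) <= tol * std::sqrt(x2);
}
```

Scaling the tolerance by ||x|| keeps the check meaningful across randomly generated systems of very different magnitudes.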

Andlon commented 7 years ago

Also, I tried using the equivalent of an identity preconditioner, and it does not seem to suffer from the same issue: the results with and without store_Av = true are not much different, unlike with my own preconditioner, where the results differ greatly.

I can't think of a simple way to come up with an easily reproducible example. My code is on github though, if it helps: https://github.com/Andlon/crest/blob/master/include/crest/basis/amg_corrector_solver.hpp

Andlon commented 7 years ago

OK, here's another update:

it seems that the reason the properties fail is that LGMRES reaches its maximum number of iterations. I bumped the limit up to 100 000 and it still hits the maximum, with the same errors, so it seems to be getting stuck rather than making progress.

Andlon commented 7 years ago

Sorry for the spam. I discovered that it relates to my setting of param.solver.tol, which I'd set to machine epsilon. Setting it to 1e-15 seems to work. I don't quite understand why this would make the whole thing fail, however. Shouldn't it just run to the maximum number of iterations, but still manage an accurate solution? I couldn't see tol or eps being used in anything other than comparisons...

ddemidov commented 7 years ago

It's possible that what you are seeing is just numerical effects. From my experience I would say that even 1e-15 is too strict. Iterative methods usually do not compute the residual explicitly, but keep track of it using recurrence formulas which are only guaranteed to hold in exact arithmetic. With the following patch we can see this effect even for a simple Poisson problem:

diff --git a/examples/solver.cpp b/examples/solver.cpp
index 945bdb8..1e7e539 100644
--- a/examples/solver.cpp
+++ b/examples/solver.cpp
@@ -385,7 +385,12 @@ int main(int argc, char *argv[]) {
         amgcl::io::mm_write(vm["output"].as<string>(), &x[0], x.size());
     }

+    double norm_rhs = sqrt(amgcl::backend::inner_product(rhs, rhs));
+    amgcl::backend::spmv(-1.0, boost::tie(rows, ptr, col, val), x, 1.0, rhs);
+    double resid = sqrt(amgcl::backend::inner_product(rhs, rhs)) / norm_rhs;
+
     std::cout << "Iterations: " << iters << std::endl
               << "Error:      " << error << std::endl
+              << "Real error: " << resid << std::endl
               << prof << std::endl;
 }
$ ./solver -n 64 -p solver.tol=1e-10 | grep Error -C 1
Iterations: 13
Error:      4.80866e-11
Real error: 4.80869e-11

$ ./solver -n 64 -p solver.tol=1e-12 | grep Error -C 1
Iterations: 14
Error:      8.91134e-13
Real error: 9.01515e-13

$ ./solver -n 64 -p solver.tol=1e-14 | grep Error -C 1
Iterations: 16
Error:      7.66414e-15
Real error: 1.42898e-13

$ ./solver -n 64 -p solver.tol=1e-16 | grep Error -C 1
Iterations: 18
Error:      6.77496e-17
Real error: 1.44484e-13

You can see that after some point the real error does not get any better (and in fact deteriorates a bit).

Andlon commented 7 years ago

Yes, that was my conclusion too. At first, I didn't realize that the tolerance given is essentially a relative error with respect to the RHS, and in that case 1e-15 * ||rhs|| can easily be too small.

Still, the strange artifact that I observed was that using store_Av = true would make the solution diverge completely, instead of stagnating at some fairly accurate solution as in the case of your Poisson example.

By the way, giving a tolerance which is relative to the right-hand side is... somewhat inconvenient when you know little-to-nothing about your RHS. I see that the implementation of LGMRES in SciPy uses both relative and absolute tolerance for its stopping criterion. I had barely heard of LGMRES before seeing it in your library, so I'm an absolute novice, but perhaps there's room for a more flexible stopping condition?

ddemidov commented 7 years ago

It should be possible to alter the exit condition to include checks for both relative and absolute error. I am not sure how to expose this to the user though: SciPy uses a single parameter for both checks, while PETSc allows setting the relative and absolute tolerances separately. Do you have any preferences here?
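A PETSc-style combination of the two tolerances could look like this sketch (names are illustrative, not a proposed amgcl interface):

```cpp
#include <algorithm>
#include <cassert>

// Stop when ||r|| <= max(atol, rtol * ||b||): the relative check
// dominates for large right-hand sides, the absolute one for small.
bool converged(double resid_norm, double rhs_norm,
               double rtol, double atol) {
    return resid_norm <= std::max(atol, rtol * rhs_norm);
}
```

With atol = 0 this degenerates to the current purely relative criterion, so existing behavior would be preserved by default.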

Andlon commented 7 years ago

To tell you the truth, I don't feel particularly qualified to answer that. However, being able to set relative and absolute tolerances separately is a generalization of having a single tolerance, so unless it's very inconvenient to implement, it would give the user the most flexibility with little loss of usability when a single tolerance value suffices.

Andlon commented 7 years ago

I'm closing this issue now since there is nothing more to discuss. My solver ended up working very well. Thank you so much for your assistance!