High condition number in Mesh::gaussj

loeiten commented 8 years ago

The int Mesh::gaussj(BoutReal **a, int n) in mesh.cxx may have a high condition number, which scales badly with increasing number of grid points. It is used to find the contravariant components of the metric tensor (if no covariant metric tensor is found) given the covariant components, and vice versa,

The problem becomes big if one wants to use the Christoffel symbols, which are using the derivatives of the metric tensor components.

The enhance request is therefore to change the direct solver into something else which is not so sensitive to the condition number.

d7919 commented 8 years ago

This is a Gauss-Jordan elimination implementation which inverts the metric tensor. Presumably it would be easy enough to write something that uses lapack to do this inversion (dgetrf and dgetri I think) for each mesh point. The GJ implementation uses pivoting, which should help avoid some issues; if there's a large variation in the output for small changes to the input then there may be a region where some of the metric elements are very small/large which is causing this. It's not clear that an alternative approach would improve on this in all circumstances but it's probably worth a quick study.

Have you got a test case that demonstrates the issue?

For reference there's a nice simple example of inverting with lapack at http://stackoverflow.com/questions/3519959/computing-the-inverse-of-a-matrix-using-lapack-in-c

loeiten commented 8 years ago

Hmm...indeed. I don't have too much experience with solutions to these kind of problems, I just remembered vaguely something about changing to an iterative solver could reduce the problem of the condition number (but I might be wrong).

I have something which reproduces the problem, but not a dedicated test case. I stumbled upon the problem when trying to MMS the Christoffel symbols in the vector_operators branch.

With the following parameters in the BOUT.inp file

[mesh]
(...)
# ------------------------------
# Contravariant metric components
# ------------------------------
g11 = 2 
g12 = 4*geom:xl
g13 = 4*geom:xl - 2 
g22 = 8*geom:xl^2 + 1 
g23 = 8*geom:xl^2 - 4*geom:xl + 1 
g33 = 8*geom:xl^2 - 8*geom:xl + 5 
# ------------------------------
# Covariant metric components
# ------------------------------
# g_11 = 4*geom:xl^2 + 1 
# g_12 = -2*geom:xl - 1/2 
# g_13 = 1/2 
# g_22 = 3/2 
# g_23 = -1/2
# g_33 = 1/2 
# ------------------------------

[geom]
Lx = 2*pi                # The length of x from boundary to boundary
Ly = 2*pi                # The length of y from boundary to boundary
xl = x * geom:Lx         # x in range [0,Lx]
yl = y * geom:Ly/(2*pi)  # y in range [0,Ly]

The problem is illustrated below, and for this particular example, it seems like it is only a problem when taking derivatives of the metrics.

In the plot below, BOUT++ is calculating the covariant metric tensor. g1_11

In the plot below, BOUT++ reads in the covariant metric tensor from the input file. g1_11bothgiven

1028 points covariant_components 1028 points zoom in covariant_components-1 32 points zoom in covariant_components-2

d7919 commented 8 years ago

Ah yes, it looks like you're getting problems with cancellations not happening precisely enough. I think you're correct that an iterative solver might be useful in such a case.

As the matrix to be inverted is only 3x3, I though it might be interesting to try writing a routine which implements the explicit inverse (e.g. finding det(M) and the co-factors matrix). I would guess this could help although there will probably still be some potential cancellation errors.

I've added a branch to test this out (https://github.com/boutproject/BOUT-dev/tree/explicitInverse). This currently has a very rough and untested implementation of a direct inversion. Let me know if this behaves/has the same problems/is even worse.

bendudson commented 8 years ago

If your scheme is this sensitive to errors in the metric through the Christoffel symbols then it might be worth looking at rewriting your expressions in a way which doesn't use them. At the moment you're differencing the vector components, then differencing the basis vectors, then combining them. This relies on lots of cancellations being very precise. Is it possible to put metric tensor components and vector components under the same derivatives? An example is the divergence operator, which can either be written with the J*g^ij factor inside the derivative with the vector (the correct way in my opinion), or split into the G1.. G3 factors which then replies on accurate derivatives of metrics.

-------- Original Message -------- From:David Dickinson notifications@github.com Sent:Tue, 15 Dec 2015 06:42:24 -0800 To:boutproject/BOUT-dev BOUT-dev@noreply.github.com Subject:Re: [BOUT-dev] High condition number in Mesh::gaussj (#132)

Ah yes, it looks like you're getting problems with cancellations not happening precisely enough. I think you're correct that an iterative solver might be useful in such a case.

As the matrix to be inverted is only 3x3, I though it might be interesting to try writing a routine which implements the explicit inverse (e.g. finding det(M) and the co-factors matrix). I would guess this could help although there will probably still be some potential cancellation errors.

I've added a branch to test this out (https://github.com/boutproject/BOUT-dev/tree/explicitInverse). This currently has a very rough and untested implementation of a direct inversion. Let me know if this behaves/has the same problems/is even worse.

— Reply to this email directly or view it on GitHub.

loeiten commented 8 years ago

Thank you both for your fast replies.

@d7919 That was an impressively quick fix. Unfortunately I do not have too much time to look at it now, but if I eventually get time I can see if it has the same problems.

@bendudson Good suggestion. I haven't found any good way to reformulate the problem though, but that doesn't mean they don't exist.

This relies on lots of cancellations being very precise

Are you referring to cancellations in calculations of the Christoffel symbols (the last minus in equation 2.6.25 in D'Haeseleer)? Other than the loss of precision with that particular minus (which I guess would not lead to any serious problems), I cannot really see the hazards of using the Christoffel symbols (given that they are properly implemented) unless they for some reason should give some numerical instabilities unkown to me.

Christoffel symbols aside: As mentioned above, there could be special cases where noise in the inverted metric tensor elements could give troubles as derivatives of these elements are used in several operators.

d7919 commented 6 years ago

This has probably been updated by #883 but may not have been fixed.

boutproject / BOUT-dev

High condition number in Mesh::gaussj #132