ExtremeFLOW / neko

https://neko.cfd/

Norms and residuals #1273

Open MartinKarp opened 1 month ago

MartinKarp commented 1 month ago

The intent of this issue is to have somewhere to discuss what an appropriate norm and stopping criterion for our Krylov solvers would be. Upon request, I compile here a short summary of what systems we are solving and what is currently in the code.

Regarding the residuals for the stopping criterion in our linear solvers: we never solve the full system directly, but rather for the change between time steps, $\Delta x = x^i-x^{i-1}$, where $x$ is either the pressure or the velocity and $i$ corresponds to the value at time step $i$.

For some linear system $Ax=b$ (Poisson equation for the pressure and Helmholtz for the velocity), we transform the system to the form $$Ax^i = b^i \;\Rightarrow\; A\Delta x = b^i - A x^{i-1}$$ before it enters our Krylov solvers (usually GMRES for the pressure and CG for the velocity).

This means the magnitude of the starting rhs $b^i -A x^{i-1}$ scales with the time step $\Delta t$ and the magnitude of $x$.
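As a concrete illustration (a minimal NumPy sketch, not Neko code; the matrix and fields are made up), solving the increment system with a zero initial guess starts from exactly the residual $b^i - A x^{i-1}$, and its norm shrinks together with the change per time step:

```python
import numpy as np

# Minimal sketch, not Neko code: a 1D Laplacian stands in for A, random fields for x.
n = 64
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

x_prev = np.random.rand(n)                    # x^{i-1}
for dt in (1e-1, 1e-2, 1e-3):                 # smaller time step -> smaller change in x
    x_new = x_prev + dt * np.random.rand(n)   # stand-in for x^i
    b_new = A @ x_new                         # rhs of the full system A x^i = b^i

    # Increment system: A (dx) = b^i - A x^{i-1}.  With dx = 0 as the initial guess,
    # its residual is the full-system residual evaluated at x^{i-1}.
    rhs_inc = b_new - A @ x_prev
    print(dt, np.linalg.norm(rhs_inc))        # starting residual shrinks with dt
```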

This discussion is created to assess what norm we would like to minimize when solving our system, and what criterion we should have for $$||A\Delta x - b^i + A x^{i-1}|| = ||Ax^i - b^i|| < \mathrm{tol}.$$

The choice of the norm $||x||$ for this stopping criterion is important, as we would preferably like it to be independent of the number of grid points, the length of the time step, the volume of the domain, and the normalization of the flow.

Currently, the following norms might be considered:

  1. The L2 norm normalized with the volume of the domain $||x||_2/\sqrt{vol} = \sqrt{(x \cdot x)/vol}$ (this is what is used currently in the code).
  2. The standard L2 norm $||x||_2$.
  3. The integral (maybe the wrong word) over the domain divided by the volume $\sqrt{(x^T B x)/vol}$ where $B$ is the mass matrix.

What has been noted is that, except for the mass-weighted integral, the norm scales with the number of grid points, and that for all of them the time step length plays an important role in choosing an appropriate tolerance. @timofeymukha had some thoughts on this and asked me to start a discussion about it here.
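As a rough numerical illustration of that observation (plain NumPy, not Neko; a made-up smooth "residual" field and a lumped uniform mass matrix standing in for $B$), the three candidates behave as follows on successively finer grids:

```python
import numpy as np

# Illustration only: the same field sampled on finer and finer uniform 1D grids.
def candidate_norms(n, L=1.0):
    xs = np.linspace(0.0, L, n, endpoint=False)
    r = np.sin(2.0 * np.pi * xs)              # fixed field, independent of the grid
    B = np.full(n, L / n)                     # lumped mass matrix of a uniform grid
    vol = L
    opt1 = np.sqrt(np.dot(r, r) / vol)        # ||r||_2 / sqrt(vol)  (current choice)
    opt2 = np.linalg.norm(r)                  # ||r||_2
    opt3 = np.sqrt(np.dot(r, B * r) / vol)    # sqrt(r^T B r / vol)  (mass-weighted rms)
    return opt1, opt2, opt3

for n in (128, 1024, 8192):
    print(n, candidate_norms(n))              # opt1, opt2 grow like sqrt(n); opt3 stays ~0.71
```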

adperezm commented 1 month ago

Good to bring it up.

I would say that the 3rd option sounds right, at least for controlling the dependence on the number of grid points. My reasoning for liking that one is given below, but you do not need to read it :)

I did not realize that $\Delta t$ also influenced it. But of course it does :). For that one I do not have opinions or suggestions.

=============================== Why I like the 3rd ==============================================

If one calls the residual: $r = A\Delta x - b^i +A x^{i-1} $

One metric could be simply: the average of the residual over the full domain $\Omega$ (or volume V). To get an average, you need:

$\langle r \rangle_{V} = \frac{1}{V} \int_{V} r \, dV = \frac{1}{V} \mathbf{r}^{T} \mathbf{B}$ (abusing notation here)

Doing this then gives a clear intuition on the value reported, i.e., an average.

A stricter metric, which avoids caring about signs, is the root-mean-square "residual", which is the 3rd one you mention. So we take the average of the square and then the square root:

$\mathrm{rms}(r) = \sqrt{\frac{1}{V} \int_{V} r^{2} \, dV} = \sqrt{\frac{1}{V} \mathbf{r}^{T} \mathbf{B}\mathbf{r}}$

So once again, this is a quantity that gives a clear indication of what I should expect when I set the tol in the case file. Note that option number 1 looks like this one, but you do not get an average inside the square root.
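To spell that out under a simplifying assumption (a uniform mesh with $n$ degrees of freedom, so that the lumped mass matrix is roughly $B \approx \frac{V}{n} I$):

$$\frac{||r||_2}{\sqrt{V}} = \sqrt{\frac{\mathbf{r}^{T}\mathbf{r}}{V}} \approx \sqrt{\frac{n}{V}} \sqrt{\frac{\mathbf{r}^{T}\mathbf{B}\mathbf{r}}{V}} = \sqrt{\frac{n}{V}}\, \mathrm{rms}(r),$$

so option 1 carries an explicit $\sqrt{n}$ factor relative to the rms, which is exactly the grid dependence Martin noted.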

One that could perhaps also be good to have is $\vert\vert r \vert\vert_{\infty} = \max_j \vert r_j \vert$. For this one we do not need weights, but only because no summation is performed when computing it.

I would say that any measure that is unbounded is perhaps not so nice and would advocate for using the weights.

timofeymukha commented 1 month ago

This video can perhaps serve as some inspiration: https://youtu.be/v9OnNeYH4Ok?si=9Lx0Ye8Tdv8rqclV

adperezm commented 1 month ago

I guess the end of that video follows up nicely on what I wrote.

One should get either an average, an rms, or an infinity norm.

In my previous comment I suggested we calculate averages over the volume, given that we can nicely compute proper integrals in SEM.

The other option is of course not to use weights, but then to scale with the number of points instead of the volume. This way each point in the domain has the same weight (which might be desirable), but one is still performing a proper averaging operation.

The problem we have currently (option 1) is that we are mixing the methods in a way, so we are not really getting any proper form of averaging.
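To make the "mixing" explicit (same notation as above, with $n$ the number of points and $V$ the volume), the three normalizations under discussion are:

$$\underbrace{\sqrt{\tfrac{1}{V}\,\mathbf{r}^{T}\mathbf{B}\mathbf{r}}}_{\text{volume-weighted rms}}, \qquad \underbrace{\sqrt{\tfrac{1}{n}\,\mathbf{r}^{T}\mathbf{r}}}_{\text{point-count rms}}, \qquad \underbrace{\sqrt{\tfrac{1}{V}\,\mathbf{r}^{T}\mathbf{r}}}_{\text{current (option 1)}}.$$

The first two are proper averages (over the volume and over the points, respectively); the current one divides an unweighted point sum by a volume, so it is neither.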

I would still lean towards the 3rd option that Martin proposed; however, I see the value of abandoning the mass matrix and volume altogether and just using the number of points.

timofeymukha commented 1 month ago

A few comments:

MartinKarp commented 1 month ago

I guess the fact that we solve for $\Delta x$ does not make a difference, but the rhs $b^i$ will depend on the length of the time step, so it should matter, I think.

Regarding the scaling, I think it is OK for the non-dimensionalized case, I guess?

As for scaling by the number of points I think the integral over the volume somehow makes more sense.

timofeymukha commented 1 month ago

I think the timestep will affect the number of iterations needed to achieve the given tolerance, but the procedure to evaluate the residual does not depend on the timestep per se.

As you say, for a non-dimensional solver the scaling is not needed, but in our case we should have it. There could also be an option to select the scale manually, e.g. to set it to 1.

timofeymukha commented 1 month ago

FYI https://www.openfoam.com/documentation/guides/latest/doc/guide-solvers-residuals.html
Somewhat more detailed here: https://doc.cfd.direct/notes/cfd-general-principles/residual
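If I read those pages correctly (please double-check against the docs), the residual OpenFOAM reports is a scaled $\ell_1$ quantity of roughly the form

$$r = \frac{\sum_j |b_j - (Ax)_j|}{\sum_j |(Ax)_j - (A\bar{x})_j| + \sum_j |b_j - (A\bar{x})_j|},$$

where $\bar{x}$ is a uniform field holding the average value of $x$.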

pschlatt1 commented 1 month ago

do any of you understand the meaning of the "average" solution? To me that sounds a bit strange.

timofeymukha commented 1 month ago

@pschlatt1 I took it at face value, i.e., as the average over the dofs. But we could try to find out what happens in the code.

adperezm commented 1 month ago

> @pschlatt1 I took it at face value, i.e., as the average over the dofs. But we could try to find out what happens in the code.

Meanwhile I was sure that it was an average in time, since they apply the linear operator to it, i.e., $A\bar{x}$.

And I thought their initial guess was that time average, since in their plot they start from $r=1$.

So I guess, as Tim says, one should check the code. But what I would like someone to explain is: what is the benefit of doing it this way? It is not obvious to me from the links.

timofeymukha commented 1 month ago

I think it cannot be in time, since the time averages are not necessarily even computed in OpenFOAM unless you tell it you want them. They refer to $x = \bar x$ as a uniform system; I figured that means the same value everywhere. But we should look in the code to make sure. I don't fully understand how this expression works either, but I guess it is worth figuring out.

adperezm commented 1 month ago

> I think it cannot be in time, since the time averages are not necessarily even computed in OpenFOAM unless you tell it you want them. They refer to $x = \bar x$ as a uniform system; I figured that means the same value everywhere. But we should look in the code to make sure. I don't fully understand how this expression works either, but I guess it is worth figuring out.

Ahaa, okay, makes sense. Thanks, silly me.

pschlatt1 commented 1 month ago

indeed, it really seems to just be the mean value, which will give you some sort of base value for the normalisation. But I miss, for instance, the mass matrix. I think we could try something like that; at least a simple scaling of the velocity it would successfully remove. Is there perhaps a reference to some paper for that method?