Although I would rate it unlikely, I could imagine that there is some difference between 2D and 3D. The snapshots I have sent to you are from a 3D run. We are running the problem in 2D. One thing I could do is to dig up my old code and see whether the Coriolis acceleration was visible in my 2D run. I am gathering some snapshots of the old code right now to check this.
Other than that, you could add a simple print when the Coriolis term is assembled, just to check that it is active...
> Other than that, you could add a simple print when the Coriolis term is assembled, just to check that it is active...
Yes, I had already checked this by adding

```cpp
if (angular_velocity_vector_ptr != nullptr)
{
  static unsigned int test = 0;
  if (test == 0 && time_stepping.get_step_number() == 1)
  {
    *pcout << std::endl
           << "beta[0] = " << beta[0] << std::endl
           << "beta[1] = " << beta[1] << std::endl
           << "parameters.C1 = " << parameters.C1 << std::endl
           << "scratch.phi[i] = " << scratch.phi[i] << std::endl
           << "scratch.old_angular_velocity_value[0] = "
           << scratch.old_angular_velocity_value[0] << std::endl
           << "scratch.old_old_angular_velocity_value[0] = "
           << scratch.old_old_angular_velocity_value[0] << std::endl
           << "scratch.old_velocity_values[q] = "
           << scratch.old_velocity_values[q] << std::endl
           << "scratch.old_old_velocity_values[q] = "
           << scratch.old_old_velocity_values[q] << std::endl << std::endl;
    test++;
  }

  if constexpr (dim == 2)
    // The minus sign in the argument of the cross_product_2d
    // method is due to how the method is defined.
    data.local_rhs(i) -=
      ....
```
which prints

```
beta[0] = 2.000000e+00
beta[1] = -1.000000e+00
parameters.C1 = 2.000000e+03
scratch.phi[i] = 4.723790e-01 0.000000e+00
scratch.old_angular_velocity_value[0] = 1.000000e+00
scratch.old_old_angular_velocity_value[0] = 1.000000e+00
scratch.old_velocity_values[q] = 2.935302e-03 9.304189e-06
scratch.old_old_velocity_values[q] = 0.000000e+00 0.000000e+00
```
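For what it's worth, the printed `beta` values (2 and -1) are consistent with the standard second-order extrapolation u* = 2u^n - u^(n-1) of a semi-implicit BDF2 scheme. Regarding the sign remark in the snippet: deal.II's `cross_product_2d(v)` returns the rotation (v[1], -v[0]), i.e. v × e_z, so e_z × v has to be written as `cross_product_2d(-v)`. A minimal self-contained sketch of the 2D Coriolis contribution under that convention (the names `omega` and `u`, and the scaling by 2, are illustrative, not the actual assembly code):

```cpp
#include <deal.II/base/tensor.h>

#include <iostream>

using namespace dealii;

// Sketch: the Coriolis acceleration 2 * omega x u in 2D. The angular
// velocity reduces to a scalar (its e_z component), and
// e_z x u = cross_product_2d(-u), since deal.II defines
// cross_product_2d(v) = (v[1], -v[0]) = v x e_z.
Tensor<1, 2> coriolis_term_2d(const double omega, const Tensor<1, 2> &u)
{
  return 2.0 * omega * cross_product_2d(-u);
}

int main()
{
  Tensor<1, 2> u;
  u[0] = 1.0; // unit velocity along e_x
  u[1] = 0.0;

  // For omega = 1 this prints "0 2", i.e. 2 * (e_z x e_x) = 2 e_y.
  std::cout << coriolis_term_2d(1.0, u) << std::endl;
}
```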
This tubcloud folder contains all 2D simulations which I did previously. `/Ekman_1e-3_recovered/explicit_dt_1e-5_longrun/` contains some results which are suitable for a comparison.
The fields in that folder, `/Ekman_1e-3_recovered/explicit_dt_1e-5_longrun/`, coincide in form and magnitude with my results. When you ran your code in 3D, did your results behave like those from the Rayleigh solver?
I will upload the results of my 3D runs as well, but this takes a while. In one of the runs the solution is rotating. Do you also see this rotation in your solutions?
After seeing your earlier 2D results, I ran a test which strongly indicates that there is indeed a difference between the 2D and the 3D case. The first row shows the results of my latest 3D run, while the second row shows the results in `VeryShortRun`. The first, second and third columns are the velocity magnitude, the pressure and the temperature, respectively.
Relevant parameters:

```
+------------------------------------------+----------------------+
| Problem parameters |
+------------------------------------------+----------------------+
| Problem type | rotating_boussinesq |
| Spatial dimension | 3 |
| Mapping | MappingQ<3>(2) |
| Mapping - Apply to interior cells | true |
| Finite Element - Velocity | FE_Q<3>(2)^3 |
| Finite Element - Pressure | FE_Q<3>(1) |
| Finite Element - Temperature | FE_Q<3>(2) |
| Verbose | false |
+------------------------------------------+----------------------+
| Dimensionless numbers |
+------------------------------------------+----------------------+
| Prandtl number | 1 |
| Rayleigh number | 100000 |
| Ekman number | 0.001 |
+------------------------------------------+----------------------+
| Timestepping parameters |
+------------------------------------------+----------------------+
| IMEX scheme | BDF2 |
| Maximum number of time steps | 10 |
| Adaptive timestepping | false |
| Initial time step | 0.0001 |
| Start time | 0 |
| Final time | 0.001 |
| Verbose | true |
+------------------------------------------+----------------------+
| Navier-Stokes discretization parameters |
+------------------------------------------+----------------------+
| Incremental pressure-correction scheme | rotational |
| Convective term weak form | skew-symmetric |
| Convective temporal form | semi-implicit |
| Preconditioner update frequency | 10 |
+------------------------------------------+----------------------+
| Linear solver parameters - Diffusion step |
+------------------------------------------+----------------------+
| Maximum number of iterations | 1000 |
| Relative tolerance | 1e-06 |
| Absolute tolerance | 1e-09 |
| Preconditioner | ILU |
| Fill-in level | 1 |
| Overlap | 1 |
| Relative tolerance | 1 |
| Absolute tolerance | 0 |
+------------------------------------------+----------------------+
| Linear solver parameters - Projection step |
+------------------------------------------+----------------------+
| Maximum number of iterations | 1000 |
| Relative tolerance | 1e-06 |
| Absolute tolerance | 1e-09 |
| Preconditioner | ILU |
| Fill-in level | 1 |
| Overlap | 1 |
| Relative tolerance | 1 |
| Absolute tolerance | 0 |
+------------------------------------------+----------------------+
| Linear solver parameters - Correction step |
+------------------------------------------+----------------------+
| Maximum number of iterations | 1000 |
| Relative tolerance | 1e-06 |
| Absolute tolerance | 1e-09 |
| Preconditioner | ILU |
| Fill-in level | 1 |
| Overlap | 1 |
| Relative tolerance | 1 |
| Absolute tolerance | 0 |
+------------------------------------------+----------------------+
| Linear solver parameters - Poisson pre-step |
+------------------------------------------+----------------------+
| Maximum number of iterations | 1000 |
| Relative tolerance | 1e-06 |
| Absolute tolerance | 1e-09 |
| Preconditioner | ILU |
| Fill-in level | 1 |
| Overlap | 1 |
| Relative tolerance | 1 |
| Absolute tolerance | 0 |
+------------------------------------------+----------------------+
+------------------------------------------+----------------------+
| Heat equation solver parameters |
+------------------------------------------+----------------------+
| Convective term weak form | skew-symmetric |
| Convective temporal form | semi-implicit |
| Preconditioner update frequency | 10 |
+------------------------------------------+----------------------+
| Linear solver parameters - Heat equation |
+------------------------------------------+----------------------+
| Maximum number of iterations | 1000 |
| Relative tolerance | 1e-06 |
| Absolute tolerance | 1e-09 |
| Preconditioner | ILU |
| Fill-in level | 1 |
| Overlap | 1 |
| Relative tolerance | 1 |
| Absolute tolerance | 0 |
+------------------------------------------+----------------------+
+----------+----------+----------+----------+----------+----------+
| C1 | C2 | C3 | C4 | C5 | C6 |
+----------+----------+----------+----------+----------+----------+
| 2.0e+03 | 1.0e+00 | 1.0e+05 | 1.0e+00 | 0.0e+00 | 1.0e+03 |
+----------+----------+----------+----------+----------+----------+
Triangulation:
Number of initial active cells = 132096
Spatial discretization:
Number of velocity degrees of freedom = 3880182
Number of pressure degrees of freedom = 179754
Number of temperature degrees of freedom = 1293394
Number of total degrees of freedom = 5353330
```
The verbose output of the last step:

```
Heat Equation: Assembling advection matrix... done!
Heat Equation: Assembling right hand side... done!
Right-hand side's L2-norm = 1.233147e+02
Heat Equation: Solving linear system... done!
Number of GMRES iterations: 735, Final residual: 1.226188e-04.
Navier Stokes: Assembling advection matrix... done!
Navier Stokes: Assembling diffusion step's right hand side... done!
Right-hand side's L2-norm = 6.555920e+02
Navier Stokes: Solving the diffusion step... done!
Number of GMRES iterations: 553, Final residual: 6.479068e-04.
Navier Stokes: Assembling the projection step's right hand side... done!
Right-hand side's L2-norm = 9.725863e-02
Navier Stokes: Solving the projection step... done!
Number of CG iterations: 113, Final residual: 8.129867e-08.
Navier Stokes: Pressure correction step... done!
Number of CG iterations: 18, Final residual: 4.764454e-09.
```
There are three fronts to tackle:

1. `VeryShortRun`: I know that the meshes are not really comparable, as they use different methods, but it is a reference still. Remember that the graphical output globally refines the mesh once; the actual mesh is one level coarser.
2. Runtime: with the current mesh each step takes 10 min, and with a time step of `1e-4` and an end time of `1.2`, i.e. 12,000 steps, we are talking about 2,000 hours. Hopefully a better tuning of the preconditioners reduces this considerably, but I think we need more firepower if test cases are to be done in 3D.
3. The `WorkStream` approach: the last output file was created at 14:30 and the benchmark computations are still running (it is 17:11 as I am writing this).

The results look great! To be honest I didn't expect this. This is promising.

If you need 10 minutes for one time step, 10,000 time steps would take approx. 70 days, which is too much. There are several options to reduce the runtime:

1. Compile in release mode (`make release`).
2. Switch to AMG preconditioning. With AMG preconditioning the number of iterations stays constant for an increasing number of degrees of freedom, whereas it increases with ILU preconditioning.
3. Possibly switch to a purely explicit scheme, because the advection matrix takes most of the time, right?

I don't understand your comment regarding the `WorkStream` approach? We are using this approach already.

Could you make a short 3D run with 10 time steps such that we can see the runtime of the different sections?
> The results look great! To be honest I didn't expect this. This is promising.
>
> If you need 10 minutes for one time step, 10,000 time steps would take approx. 70 days, which is too much. There are several options to reduce the runtime:
>
> 1. Compile in release mode (`make release`).

I keep forgetting that release mode exists...

> 2. Switch to AMG preconditioning. With AMG preconditioning the number of iterations stays constant for an increasing number of degrees of freedom, whereas it increases with ILU preconditioning.

Will do.

> 3. Possibly switch to a purely explicit scheme, because the advection matrix takes most of the time, right?

Most likely, yeah. The benchmark data is still computing. I will update once I have the terminal output with the times spent in each section.

> I don't understand your comment regarding the `WorkStream` approach? We are using this approach already.

The benchmark's class. The methods computing the mean velocity and such are not coded with the `WorkStream` approach. It had not been needed until now.

> Could you make a short 3D run with 10 time steps such that we can see the runtime of the different sections?

The `prm` file of this run was set for 10 steps. I'm currently uploading the results to my tubcloud.

Edit: Link updated.
Could you also provide the parameter file? Are you running on `coriolis_plus_master` or `coriolis_acceleration`?
I started a production run on `coriolis_acceleration` with the following input file:
```
# ---------------------
set FE's polynomial degree - Pressure (Taylor-Hood) = 1
set FE's polynomial degree - Temperature = 2
set Mapping - Apply to interior cells = true
set Mapping - Polynomial degree = 2
set Problem type = rotating_boussinesq
set Spatial dimension = 3
set Verbose = false
subsection Dimensionless numbers
set Ekman number = 1e-3
set Prandtl number = 1.0
set Rayleigh number = 1e5
end
subsection Heat equation solver parameters
set Convective term time discretization = semi-implicit
set Convective term weak form = skew-symmetric
set Preconditioner update frequency = 10
set Verbose = false
subsection Linear solver parameters
set Absolute tolerance = 1e-9
set Maximum number of iterations = 1000
set Relative tolerance = 1e-6
subsection Preconditioner parameters
set Preconditioner type = AMG
set Aggregation threshold = 1.0e-4
set Elliptic = true
set Number of cycles = 1
end
end
end
subsection Navier-Stokes solver parameters
set Convective term time discretization = semi-implicit
set Convective term weak form = skew-symmetric
set Incremental pressure-correction scheme = rotational
set Preconditioner update frequency = 10
set Verbose = false
subsection Linear solver parameters - Correction step
set Absolute tolerance = 1e-9
set Maximum number of iterations = 1000
set Relative tolerance = 1e-6
subsection Preconditioner parameters
set Preconditioner type = Jacobi
set Number of sweeps = 10
set Overlap = 1
set Relaxation parameter = 0.5
end
end
subsection Linear solver parameters - Diffusion step
set Absolute tolerance = 1e-9
set Maximum number of iterations = 1000
set Relative tolerance = 1e-6
subsection Preconditioner parameters
set Preconditioner type = AMG
set Aggregation threshold = 1.0e-4
set Elliptic = true
set Number of cycles = 1
end
end
subsection Linear solver parameters - Poisson pre-step
set Absolute tolerance = 1e-9
set Maximum number of iterations = 1000
set Relative tolerance = 1e-6
subsection Preconditioner parameters
set Preconditioner type = AMG
set Aggregation threshold = 1.0e-4
set Elliptic = true
set Number of cycles = 1
end
end
subsection Linear solver parameters - Projection step
set Absolute tolerance = 1e-9
set Maximum number of iterations = 1000
set Relative tolerance = 1e-6
subsection Preconditioner parameters
set Preconditioner type = AMG
set Aggregation threshold = 1.0e-4
set Elliptic = true
set Number of cycles = 1
end
end
end
subsection Output control parameters
set Graphical output directory = ChristensenResults/
set Graphical output frequency = 100
set Terminal output frequency = 10
end
subsection Refinement control parameters
set Adaptive mesh refinement = false
set Adaptive mesh refinement frequency = 100
set Fraction of cells set to coarsen = 0.3
set Fraction of cells set to refine = 0.03
set Maximum number of levels = 12
set Minimum number of levels = 0
set Number of initial adaptive refinements = 0
set Number of initial boundary refinements = 1
set Number of initial global refinements = 8
end
subsection Time stepping parameters
set Adaptive time stepping = false
set Adaptive timestepping barrier = 2
set Final time = 1.0
set Initial time step = 1e-4
set Maximum number of time steps = 10000
set Maximum time step = 7e-2
set Minimum time step = 2e-2
set Start time = 0.0
set Time stepping scheme = BDF2
set Verbose = true
end
```
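As an aside, the `Preconditioner type = AMG` blocks above presumably map onto deal.II's Trilinos AMG preconditioner roughly as in the following sketch; this is an assumption about the wiring, and the actual solver classes in this code base may differ:

```cpp
#include <deal.II/lac/trilinos_precondition.h>
#include <deal.II/lac/trilinos_sparse_matrix.h>

using namespace dealii;

// Sketch: initialize a Trilinos AMG preconditioner with settings that
// mirror the prm entries above (Elliptic = true, Number of cycles = 1,
// Aggregation threshold = 1.0e-4). The matrix is assumed to be an
// already assembled TrilinosWrappers::SparseMatrix.
void setup_amg(TrilinosWrappers::PreconditionAMG &preconditioner,
               const TrilinosWrappers::SparseMatrix &matrix)
{
  TrilinosWrappers::PreconditionAMG::AdditionalData data;
  data.elliptic              = true;   // prm: Elliptic
  data.n_cycles              = 1;      // prm: Number of cycles
  data.aggregation_threshold = 1.0e-4; // prm: Aggregation threshold

  preconditioner.initialize(matrix, data);
}
```

As mentioned earlier in the thread, the point of switching from ILU to AMG is that the iteration counts should stay roughly mesh-independent as the number of degrees of freedom grows.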
> Could you also provide the parameter file? Are you running on `coriolis_plus_master` or `coriolis_acceleration`?
I did not save that specific parameter file. Nonetheless, one should be able to reproduce it from the terminal output in my previous comment.
I already deleted `coriolis_plus_master`. Only `coriolis_acceleration` is valid from here on.
> I started a production run on `coriolis_acceleration` with the following input file:
@sebglane Do notice: there is an infinite loop in the benchmark request class when running in 3D! I am finishing up a couple of commits and will look for the loop afterwards.
I just did a test with a smaller 3D problem

```
Number of initial active cells = 8448
Spatial discretization:
Number of velocity degrees of freedom = 242046
Number of pressure degrees of freedom = 11158
Number of temperature degrees of freedom = 80682
Number of total degrees of freedom = 333886
```
and the benchmark request was computed with no trouble. I assumed there was an endless loop because, on the run I referred to in my previous comment, hours went by without the benchmark finishing its computation. And I mean the benchmark was taking way longer than the actual time loop took.
I am still trying to decipher what is wrong with the benchmark request. I ran the same problem size on the Benz server with 10 cores and it does not finish computing the benchmark data, but when run with 5 cores the code runs perfectly. Either the assembly process scales really badly or a command is not thread-safe? I will code the `WorkStream` approach and hope it fixes this.
Update: The problem starts with 6 cores. I coded a terminal output

```cpp
static unsigned int counting = 0;
std::cout << "Testing (" << Utilities::MPI::this_mpi_process(mpi_communicator)
          << ") :" << counting << std::endl;
counting++;
```

inside the cell iterator of the `compute_global_data` method. All processors reach a count of 1407, i.e. `Testing (1) :1407`, and then they just stop. Nothing else gets printed and the program gets stuck.
Update: With the `WorkStream` approach implemented in 641da09 the benchmark does not seem to get stuck anymore. I tested it with 10 processors on the Benz server.
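For reference, the usual deal.II `WorkStream` pattern for this kind of per-cell reduction looks roughly like the sketch below. The names and the reduced quantity (volume of the locally owned cells, as a stand-in for the mean-velocity computations mentioned above) are illustrative, not the code of 641da09. Only the copier touches shared state, and ghost/artificial cells are skipped explicitly:

```cpp
#include <deal.II/base/quadrature.h>
#include <deal.II/base/work_stream.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/fe/fe_values.h>

using namespace dealii;

template <int dim>
struct ScratchData
{
  ScratchData(const FiniteElement<dim> &fe, const Quadrature<dim> &quadrature)
    : fe_values(fe, quadrature, update_JxW_values)
  {}

  // WorkStream copies the scratch object once per thread, so the copy
  // constructor has to rebuild the (non-copyable) FEValues object.
  ScratchData(const ScratchData &other)
    : fe_values(other.fe_values.get_fe(),
                other.fe_values.get_quadrature(),
                other.fe_values.get_update_flags())
  {}

  FEValues<dim> fe_values;
};

struct CopyData
{
  double cell_volume = 0.0;
};

template <int dim>
double compute_local_volume(const DoFHandler<dim> &dof_handler,
                            const Quadrature<dim> &quadrature)
{
  double volume = 0.0;

  // Worker: runs in parallel, writes only into its own CopyData.
  auto worker = [](const typename DoFHandler<dim>::active_cell_iterator &cell,
                   ScratchData<dim> &scratch,
                   CopyData         &copy)
  {
    copy.cell_volume = 0.0;
    if (!cell->is_locally_owned()) // skip ghost and artificial cells
      return;
    scratch.fe_values.reinit(cell);
    for (unsigned int q = 0; q < scratch.fe_values.n_quadrature_points; ++q)
      copy.cell_volume += scratch.fe_values.JxW(q);
  };

  // Copier: WorkStream runs it sequentially, so accumulating is safe.
  auto copier = [&volume](const CopyData &copy)
  { volume += copy.cell_volume; };

  WorkStream::run(dof_handler.begin_active(),
                  dof_handler.end(),
                  worker,
                  copier,
                  ScratchData<dim>(dof_handler.get_fe(), quadrature),
                  CopyData());

  // Each MPI rank now holds the volume of its own cells; a final
  // Utilities::MPI::sum over the communicator gives the global value.
  return volume;
}
```

One would call this with, e.g., `QGauss<dim>(fe.degree + 1)` and then reduce the per-rank results with `Utilities::MPI::sum`.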
> I started a production run on `coriolis_acceleration` with the following input file:
What were the results of this run? I doubt the benchmark data was computed, but the graphical output would be a first indication that everything is alright with the implementation.
I have deactivated all post-processing and only write vtk output. The run has finished and produced approx. 45 GB of data. A few snapshots are available here.
I get a file-not-found error from the link.
I moved the data to `tubcloud/Shared/ChristensenResults/LongRun`. I hope you can access them.
They look good. The overall physics seems to be correctly reproduced. We need to verify the accuracy now. How long did the simulation take?
I think we can close this issue now.
It ran from Sunday to Wednesday on 16 MPI processes.
Please close this issue.
I will upload more snapshots and the log files to the cloud.
The `Christensen.cc` file does not reproduce the results of the benchmark. The fields computed during the first step differ somewhat from the snapshots you provided. The first row belongs to your `VeryShortRun` snapshots, while the second row corresponds to my latest run on my laptop. The columns are the velocity, the pressure and the temperature field, respectively.

At the 9th step one can already see the angular motion in the velocity field of `VeryShortRun`, while in mine it is nowhere to be seen.

On the 5608th step of my latest simulation they look like this, although the distribution of the field does not really change after the 200th time step. It looks as if the Coriolis term is indeed present, as one can (barely) appreciate in the pressure field, but it is apparently too weak.