Yixiao-Zhang opened this issue 6 months ago
Let me add a bit of context here:

@Yixiao-Zhang is running a non-hydrostatic simulation with an immersed boundary and finds that the code crashes with the PCG solver. The crash is not necessarily connected to the PCG solver; it might be caused by the simulation setup or other issues. Since the pressure solver used is experimental (from this branch), as a way to assess where the crash comes from, I suggested using another method to see whether the crash also happens, which would either validate the experimental pressure solver or not.

@Yixiao-Zhang, optimizing GPU preconditioners is quite a difficult task, as demonstrated by the preconditioners slowing down the simulation, and it is probably not a good use of time to try to speed up solvers that we are not sure we want to use. Since this simple attempt to get a simulation that runs (with another, correct solver) up to the crashing point does not work, I would suggest just trying a (slightly non-correct) FFT pressure solver to see if the simulation still crashes. If it does not, we can assume the crash comes from the PCG solver and try to catch the bug. This is probably a better way to "optimize" the solver, since we know that the PCG solver preconditioned with the FFT is faster than these other methods.
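To make the suggestion concrete, here is a minimal sketch of what running with the (slightly non-correct) FFT solver could look like; it reuses the grid and parameter values from the script posted further down in this thread and simply omits the `pressure_solver` keyword, so `NonhydrostaticModel` falls back to the default FFT-based Poisson solver that ignores the immersed boundary.

```julia
# Sketch: same setup as the script below, but without passing a pressure_solver,
# so the default FFT-based Poisson solver (which ignores the immersed boundary) is used.
model_fft = NonhydrostaticModel(;
    grid = ib_grid,                 # the ImmersedBoundaryGrid built in the script below
    timestepper = :RungeKutta3,
    advection = WENO(),
    tracers = (:b,),
    coriolis = FPlane(f = 1e-4),
    buoyancy = BuoyancyTracer(),
    closure = ScalarDiffusivity(ν = 1e-4, κ = 1e-4),
    # no pressure_solver keyword here
)
```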
BTW, we can also use this issue to track the speed of the immersed PCG solver, if you have some results.
Shouldn't we try the new PCG solver with the FFT-based preconditioner? This works pretty well. You can limit max_iterations to 2 or 3 if you want.
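For illustration only, a sketch of what capping the iteration count might look like with the branch's ImmersedPoissonSolver; the solver_method value and the maximum_iterations keyword below are guesses at the experimental interface (the script later in this thread only uses solver_method, reltol, and verbose), so the actual names may differ.

```julia
# Hypothetical sketch: the solver_method value and the maximum_iterations keyword are
# assumptions about the experimental branch's ImmersedPoissonSolver interface.
pressure_solver = ImmersedPoissonSolver(
    ib_grid,
    solver_method = :PreconditionedConjugateGradient,  # PCG with the FFT-based preconditioner (assumed name)
    maximum_iterations = 3,                            # cap the iterations, as suggested above (assumed name)
    reltol = 1e-8,
)
```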
Thank you for your replies!
The FFT-based preconditioner was the first one that I tried. It produced suspicious zonal jets in my simulations. This figure shows the difference between the two solvers. It seems to me that these zonal jets are numerical artifacts.
I think the FFT-based preconditioner is the simulation on the right, which, if I remember correctly, crashes after a while. The FFT solver on the right is just an FFT-based Poisson solver that does not account for the immersed boundary but runs stably (am I correct?)

The reason to try another method was indeed to check whether the crash is caused by the PCG solver or by another setting.
> Thank you for your replies!
> The FFT-based preconditioner was the first one that I tried. It produced suspicious zonal jets in my simulations. This figure shows the difference between the two solvers. It seems to me that these zonal jets are numerical artifacts.
Can you clarify --- is the simulation on the right with the FFT-based direct solver, or is it with a preconditioned conjugate gradient solver that uses an FFT as a preconditioner?
My suggestion is to use a preconditioned conjugate gradient solver, with the FFT-based solver as a preconditioner (not as the direct solver).
As for the blow up, I think the problem happens for very small time-steps? Perhaps try it without adaptive time-stepping for now, and also cap max_iterations at a small number.
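For concreteness, a minimal sketch of running without adaptive time-stepping: build the Simulation with a fixed Δt and do not attach a TimeStepWizard callback (the numbers are placeholders).

```julia
# Fixed time step: no TimeStepWizard callback is attached, so Δt never changes.
simulation = Simulation(model; Δt = 1800.0, stop_iteration = 1000)

# (Adaptive stepping would instead add something like
#  simulation.callbacks[:wizard] = Callback(TimeStepWizard(cfl = 0.7), IterationInterval(10)),
#  which is what we are avoiding here.)

run!(simulation)
```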
> The FFT solver on the right is just an FFT-based Poisson solver that does not account for the immersed boundary but runs stably (am I correct?)
You mean left
I made a simple script for testing, and it takes 3 minutes to run on my PC (on either CPU or GPU). This is a 2D simulation initialized with a lateral buoyancy gradient; the top is tilted. The figure shows the comparison between the default solver and the HeptadiagonalIterativeSolver. The default FFT solver produces pixelated patterns near the top boundary and deep zonal jets in the ocean interior.

Besides, I tried the HeptadiagonalIterativeSolver with the FFT-based solver as a preconditioner. It did not crash for this script and produced almost the same result as the HeptadiagonalIterativeSolver with no preconditioner.
```julia
using Printf
using Oceananigans
using Oceananigans.Models.NonhydrostaticModels: ImmersedPoissonSolver

# ---------------------------------------------------------------------- #
# Define Parameters

# Numerical technique
const arch = CPU()
const time_stepper = :RungeKutta3
const advection = WENO()

# Grid
const Nx = 1
const Ny = 200
const Nz = 50
const Lx = 100.0e3
const Ly = 200.0e3
const Lz = 50.0e3
const Δz = Lz / 2  # elevation difference at the top

# Time Stepping
const Δt = 1800.0

# Physical Parameters
const diffusivity = 1.0e-4
const Pr = 1.0
const f₀ = 1.0e-4
const Δb = 1.0e-6  # buoyancy difference at the top

# Output
const output_interval = 1
const deflatelevel = 4

# ---------------------------------------------------------------------- #
# Define Utils

# Height at Top
@inline function z_top(y::R) where {R<:Real}
    return Lz - (Δz / Ly) * y
end

# Viscosity
const viscosity = Pr * diffusivity

# Initial Fields
@inline function b_initial(x::R, y::R, z::R) where {R<:Real}
    ϵ = 100 * eps(R)
    return (Δb / Ly) * y + randn() * ϵ
end

# ---------------------------------------------------------------------- #
# Define the Simulation

# Grid
ib_grid = begin
    underlying_grid = RectilinearGrid(
        arch,
        size = (Nx, Ny, Nz),
        x = (-Lx / 2, Lx / 2),
        y = (0.0, Ly),
        z = (0.0, Lz),
        topology = (Periodic, Bounded, Bounded),
        halo = (4, 4, 4),
    )

    @inline function is_ib(x::R, y::R, z::R) where {R<:Real}
        return z > z_top(y)
    end

    ImmersedBoundaryGrid(
        underlying_grid,
        GridFittedBoundary(is_ib)
    )
end

# Coriolis
coriolis = FPlane(f₀)

# Buoyancy
buoyancy = BuoyancyTracer()

# Closure
closure = ScalarDiffusivity(ν = viscosity, κ = diffusivity)

# Pressure Solver
pressure_solver = ImmersedPoissonSolver(
    ib_grid,
    solver_method = :HeptadiagonalIterativeSolver,
    reltol = 1e-8,
    verbose = false
)

# Model
model = NonhydrostaticModel(;
    grid = ib_grid,
    timestepper = time_stepper,
    advection = advection,
    tracers = (:b, ),
    coriolis = coriolis,
    buoyancy = buoyancy,
    closure = closure,
    pressure_solver = pressure_solver,
)

# Initial Value
set!(model, b = b_initial)

# Simulation
simulation = Simulation(model; Δt = Δt, stop_iteration = 1000)

# Set Output
output_fields = merge(model.velocities, model.tracers)
simulation.output_writers[:output_3d] = NetCDFOutputWriter(
    model,
    output_fields,
    filename = "output.nc",
    schedule = IterationInterval(output_interval),
    overwrite_existing = true,
    deflatelevel = deflatelevel,
)

# Set Progress Function
function print_simulation_progress(simulation)
    model = simulation.model
    i, t = model.clock.iteration, model.clock.time
    @info Printf.@sprintf("Iteration: %d, Time: %.2f Earth Days", i, t / 86400.0)
    return nothing
end

simulation.callbacks[:progress] = Callback(
    print_simulation_progress,
    IterationInterval(100),
)

# ---------------------------------------------------------------------- #
# Run the Simulation
run!(simulation)
```
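In case it helps with comparing the two runs, here is a minimal sketch (assuming the NCDatasets.jl package) of loading the NetCDF output from two runs of the script above and measuring the difference in the meridional velocity; the filenames are hypothetical and assume the two runs were saved under different names.

```julia
using NCDatasets

# Load the final snapshots of the meridional velocity from two runs of the script
# above, assuming their outputs were saved as "output_fft.nc" and "output_pcg.nc"
# (hypothetical filenames).
ds_fft = NCDataset("output_fft.nc")
ds_pcg = NCDataset("output_pcg.nc")

v_fft = ds_fft["v"][:, :, :, end]
v_pcg = ds_pcg["v"][:, :, :, end]

# A crude measure of how much the two solutions differ
@show maximum(abs, skipmissing(v_fft .- v_pcg))

close(ds_fft)
close(ds_pcg)
```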
Thank you for your reply, @glwagner!
> Can you clarify --- is the simulation on the right with the FFT-based direct solver, or is it with a preconditioned conjugate gradient solver that uses an FFT as a preconditioner?
On the left is the FFT-based direct solver; on the right is the PCG solver with the FFT-based solver as a preconditioner.
> My suggestion is to use a preconditioned conjugate gradient solver, with the FFT-based solver as a preconditioner (not as the direct solver).
That is what the right panel shows, if I am not mistaken. But the simulation crashed after thousands of iterations. I heard that the PCG solver in Oceananigans has not been widely tested, so that is why I turned to the HeptadiagonalIterativeSolver.
> As for the blow up, I think the problem happens for very small time-steps? Perhaps try it without adaptive time-stepping for now, and also cap max_iterations at a small number.
I am doing more testing on this. It is a different issue though. I will open a new issue if I can find a simple way to reproduce the blow up.
> That is what the right panel shows, if I am not mistaken. But the simulation crashed after thousands of iterations. I heard that the PCG solver in Oceananigans has not been widely tested, so that is why I turned to the HeptadiagonalIterativeSolver.
Both of those solvers actually use the preconditioned conjugate gradient method. It's also not true --- the PreconditionedConjugateGradientSolver has been validated. I'm not even sure it's possible to use the FFT-based preconditioner with the heptadiagonal solver; they have different interfaces. Maybe you worked on that.
It's not obvious how to generalize the HeptadiagonalIterativeSolver to support the Distributed architecture, and it's also likely more difficult to optimize for immersed boundary methods using an active cells map.
We shouldn't waste our time with the HeptadiagonalIterativeSolver. If the PreconditionedConjugateGradientSolver has issues, we should fix them. It's a waste of energy to work on both.
I am trying to run simulations with the ImmersedPoissonSolver in this branch, as suggested by @simone-silvestri. It turned out that using the ImmersedPoissonSolver makes the simulation tens of times slower than the original simulation without an immersed boundary condition. Besides, using the AsymptoticInverse or ILUFactorization preconditioner makes the simulation even slower.

Here are the details of my runs. I did three runs; the domain size is 170 x 1320 x 50. In the first run I used the ImmersedPoissonSolver with no preconditioner, in the second run the AsymptoticInverse preconditioner, and in the third run the ILUFactorization preconditioner. I ran each simulation on an NVIDIA V100 GPU (on MIT Satori) for 4 hours and calculated the number of iterations per unit time. As a result, the speeds of these three simulations are 134, 62, and 29 iterations per hour, respectively. The full scripts for the three runs and the corresponding output logs are given here:
- AsymptoticInverse preconditioner: run_a2.tar.gz
- ILUFactorization preconditioner: run_a3.tar.gz

Version info:
- dev/Oceananigans: this branch