CliMA / Oceananigans.jl

๐ŸŒŠ Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
https://clima.github.io/OceananigansDocumentation/stable
MIT License

Offline differential operators do not correctly apply boundary conditions #3224

Closed hdrake closed 1 year ago

hdrake commented 1 year ago

Presently, applying differential operators to fields offline (as opposed to diagnosing them online via an OutputWriter) yields erroneous results: the derivatives appear to naively use the output's halo-region values (which seem to be filled with zeros by default) rather than values overwritten to satisfy the boundary conditions.

One consequence is that the Nusselt number calculation in the horizontal_convection.jl example script is meaningless: it is dominated by spuriously large buoyancy gradients in the boundary-adjacent cells.

@ikeshwani and I demonstrate this bug in this horizontal_diffusion.jl script, in which we turn off advection in the horizontal_convection.jl example and numerically integrate the solution to equilibrium. We compare timeseries of the volume-integrated buoyancy dissipation rates calculated online versus those calculated offline (as in the horizontal_convection.jl example). The results show that the online calculation correctly asymptotes to the numerical solution of the equilibrium boundary value problem while the offline calculation is erroneous and effectively yields a Nusselt number that is more than 6 times too high.

[Figure "equilibration_ratio": timeseries of volume-integrated buoyancy dissipation computed online vs. offline]

The bug is also evident by comparing snapshots of the two buoyancy dissipation rate fields. The dissipation rates computed offline clearly do not satisfy the no-flux boundary conditions on the boundaries.

[Screenshot: snapshots of the buoyancy dissipation rate computed online vs. offline]

This bug is present in the live main Oceananigans.jl branch (circa v0.86.0), as is evident from the movie of the buoyancy dissipation rate field in the horizontal_convection.jl example documentation and verified locally.

[Screenshot: buoyancy dissipation rate field from the horizontal_convection.jl documentation example]

I am referring to this as a bug because it is contrary to the expected behavior of halos containing the necessary information for satisfying boundary conditions, as discussed in the horizontal convection documentation example: https://github.com/CliMA/Oceananigans.jl/blob/a226b3efa7db7426ccee03884d610035314955e4/examples/horizontal_convection.jl#L143-L147

hdrake commented 1 year ago

This bug is distinct from https://github.com/CliMA/Oceananigans.jl/issues/3158, because it does not affect only immersed boundaries, and I think it is more fundamental than https://github.com/CliMA/Oceananigans.jl/issues/3082.

hdrake commented 1 year ago

Maybe there is just a missing call to fill_halo_regions!? See https://github.com/CliMA/Oceananigans.jl/issues/2442?

glwagner commented 1 year ago

Hmm yes, the halo regions are essentially never filled by default. During time-stepping they are filled at the conclusion of a time-step within update_state!. But offline, you'll generally have to manually call fill_halo_regions!...
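To make that concrete, here is a sketch (untested, with a hypothetical filename and variable name) of manually filling halo regions before computing an offline derivative. Note that whether a field loaded from disk carries the original boundary conditions depends on how it was saved:

```julia
# Sketch: fill halo regions of a loaded field before differentiating offline.
using Oceananigans
using Oceananigans.BoundaryConditions: fill_halo_regions!

bt = FieldTimeSeries("horizontal_convection.jld2", "b")  # hypothetical file
bn = bt[end]                     # Field at the last saved time
fill_halo_regions!(bn)           # overwrite halo values to satisfy the BCs
∂b∂z = compute!(Field(∂z(bn)))   # boundary-adjacent derivatives now meaningful
```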

For offline diagnostics, a better approach for the horizontal convection example is to save halo regions using with_halos=true when building the output writer, I think.

We've discussed making with_halos = true the default. The problem is that having halo regions is inconvenient for people who do not use FieldTimeSeries. So this is a trade-off; we could favor users of FieldTimeSeries by making with_halos = true the default.
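For what it's worth, users who don't want halos in their post-processing arrays can always strip them off after the fact; a minimal sketch:

```julia
# Sketch: extracting halo-free data from a field that carries halos.
using Oceananigans

grid = RectilinearGrid(size=(4, 4, 4), extent=(1, 1, 1))
c = CenterField(grid)
interior(c)   # 4×4×4 view of the interior, halos excluded
parent(c)     # full underlying array, halos included
```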

Usability suggestions like places where we should fill halo regions by default are also welcome! We have debated adding that to set! for example, but @simone-silvestri has pointed out that for very large, distributed problems, one needs to be cautious about calling fill_halo_regions! as it triggers communication. But for applications which are guaranteed to only involve post-processing, I don't think this is an issue...

hdrake commented 1 year ago

a better approach for the horizontal convection example is to save halo regions using with_halos=true when building the output writer

Maybe I am misunderstanding you, but isn't that already what is done in the horizontal convection example? https://github.com/CliMA/Oceananigans.jl/blob/cca182a11bcd1881e20316fc80ac7782286a8bfe/examples/horizontal_convection.jl#L153-L157

It seems like this is indeed saving fields with their halo regions, but erroneously filling them with zeros rather than the correct values? Is the intended behavior that output writers automatically call fill_halo_regions! before saving when with_halos=true? That would be an intuitive enough API, even if the default remained with_halos=false, but I think that's separate from this issue.

glwagner commented 1 year ago

Maybe I am misunderstanding you, but isn't that already what is done in the horizontal convection example?

No, you are understanding me! I think you're on to something.

Is the intended behavior that output writers automatically call fill_halo_regions! before saving when with_halos=true?

Ah, no, this is not the default. However, for prognostic fields, the halos are filled within update_state!:

https://github.com/CliMA/Oceananigans.jl/blob/cca182a11bcd1881e20316fc80ac7782286a8bfe/src/Models/NonhydrostaticModels/update_nonhydrostatic_model_state.jl#L24C1-L24C1

However, halos are not filled for diagnostic fields.

We probably don't want to make filling halos default, since filling halo regions is expensive and useful only for a subset of experiments. However, we could add a keyword argument to JLD2OutputWriter, something like fill_halos = true. (Thinking about this with a little more coffee, it probably doesn't make sense to add something like this, because generally speaking the halo values for diagnostic fields are not useful except for periodic boundary conditions; only prognostic fields can have specified / meaningful boundary conditions.)

I wonder if this is a bug in FieldTimeSeries. Are you using FieldTimeSeries to compute the diagnostics offline? Perhaps the halo data is saved correctly, but is not loaded correctly.

glwagner commented 1 year ago

@navidcy might be interested in this bug

glwagner commented 1 year ago

This illustrates the bug (slightly overpowered, because I was trying to figure out what's going wrong):

using Oceananigans
using GLMakie

grid = RectilinearGrid(size=3, halo=3, z=(0, 1), topology=(Flat, Flat, Bounded))

c_bottom_bc = ValueBoundaryCondition(1)
c_bcs = FieldBoundaryConditions(bottom=c_bottom_bc)
closure = ScalarDiffusivity(ฮบ=1)

model = HydrostaticFreeSurfaceModel(; grid, closure,
                                    tracers = :c,
                                    buoyancy = nothing,
                                    boundary_conditions=(; c=c_bcs))

simulation = Simulation(model, ฮ”t=1e-2, stop_iteration=100)

filename = "simple_tracer_output_test.jld2"
simulation.output_writers[:c] = JLD2OutputWriter(model, model.tracers; filename,
                                                 schedule = IterationInterval(1),
                                                 overwrite_existing = true,
                                                 with_halos = true)

# Evaluate c on boundary
using Oceananigans.Operators: โ„‘zแตƒแตƒแถ 

function show_boundary_c(sim)
    c = sim.model.tracers.c
    cb = โ„‘zแตƒแตƒแถ (1, 1, 1, grid, c)
    @info string("Iter: ", iteration(sim), ", c(z=0): ", cb)
    return nothing
end

simulation.callbacks[:show] = Callback(show_boundary_c)

run!(simulation)

ct = FieldTimeSeries(filename, "c")

t = ct.times
grid = ct.grid
Nt = length(t)
cb = zeros(Nt)
for n = 1:Nt
    cn = ct[n]
    cb[n] = โ„‘zแตƒแตƒแถ (1, 1, 1, grid, cn)
end

fig = Figure()
ax = Axis(fig[1, 1], xlabel="Iteration", ylabel="c")
lines!(ax, ct[1, 1, 0, :], label="c[0]")
lines!(ax, cb, label="c(z=0)")
lines!(ax, ct[1, 1, 1, :], label="c[1]")
axislegend(ax)
display(fig)

giving

[Plot: halo value c[0], boundary-interpolated c(z=0), and interior value c[1] vs. iteration]

Yellow is c interpolated to the boundary, which should always be 1 (as show_boundary_c shows is true online). Blue is the halo value, which should not be 0.

glwagner commented 1 year ago

As an aside, I think this issue illustrates that users are indeed interested in being able to evaluate gradients across boundaries. This is important because @simone-silvestri proposed a change that would make this impossible (eg it has been proposed we do not fill halos for Value and Gradient boundary conditions, and instead evaluate the associated fluxes in the same way we do for immersed boundaries --- because this has performance advantages for very large models).

glwagner commented 1 year ago

Inspecting the file manually shows that the data is indeed correct in there, so the problem appears to be with FieldTimeSeries:

julia> using JLD2

julia> file = jldopen(filename)
JLDFile /Users/gregorywagner/Desktop/simple_tracer_output_test.jld2 (read-only)
 โ”œโ”€๐Ÿ“‚ grid
 โ”‚  โ”œโ”€๐Ÿ”ข Nx
 โ”‚  โ”œโ”€๐Ÿ”ข Ny
 โ”‚  โ”œโ”€๐Ÿ”ข Nz
 โ”‚  โ”œโ”€๐Ÿ”ข Hx
 โ”‚  โ”œโ”€๐Ÿ”ข Hy
 โ”‚  โ”œโ”€๐Ÿ”ข Hz
 โ”‚  โ”œโ”€๐Ÿ”ข Lx
 โ”‚  โ””โ”€ โ‹ฏ (14 more entries)
 โ””โ”€ โ‹ฏ (3 more entries)

julia> c_data = file["timeseries/c/0"][:]
9-element Vector{Float64}:
 0.0
 0.0
 2.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

julia> c_data = file["timeseries/c/1"][:]
9-element Vector{Float64}:
 0.0
 0.0
 1.82
 0.18
 0.0
 0.0
 0.0
 0.0
 0.0

The halo value across the bottom boundary is the third one down from the top. At iteration 0, its value is 2 (so that interpolating between 0 and 2 returns 1). At iteration 1, its value is 1.82, showing that it is changing correctly in time.

glwagner commented 1 year ago

Data is loaded into FieldTimeSeries by

https://github.com/CliMA/Oceananigans.jl/blob/cca182a11bcd1881e20316fc80ac7782286a8bfe/src/OutputReaders/field_time_series.jl#L146

which calls

https://github.com/CliMA/Oceananigans.jl/blob/cca182a11bcd1881e20316fc80ac7782286a8bfe/src/OutputReaders/field_time_series.jl#L200-L205

The data seems to be loaded into the intermediate Field:

https://github.com/CliMA/Oceananigans.jl/blob/cca182a11bcd1881e20316fc80ac7782286a8bfe/src/OutputReaders/field_time_series.jl#L237

so the problem may be

https://github.com/CliMA/Oceananigans.jl/blob/cca182a11bcd1881e20316fc80ac7782286a8bfe/src/OutputReaders/field_time_series.jl#L205

Voila...

https://github.com/CliMA/Oceananigans.jl/blob/cca182a11bcd1881e20316fc80ac7782286a8bfe/src/Fields/set!.jl#L43-L55

glwagner commented 1 year ago

Here's a more minimal reproducer of the core issue:

using Oceananigans
grid = RectilinearGrid(size=1, z=(0, 1), topology=(Flat, Flat, Bounded))
a = CenterField(grid)
b = CenterField(grid)
parent(a) .= 1
set!(b, a)

then

julia> parent(a)[:]
7-element Vector{Float64}:
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0

julia> parent(b)[:]
7-element Vector{Float64}:
 0.0
 0.0
 0.0
 1.0
 0.0
 0.0
 0.0
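For reference, a hypothetical helper illustrating the behavior one would want here (a sketch of the idea only, not the actual patch):

```julia
# Hypothetical sketch: when source and destination fields share a grid and
# location, copy the full parent array (halos included), not just the interior.
set_including_halos!(u, v) = (parent(u) .= parent(v); u)
```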
glwagner commented 1 year ago

#3225 seems to do the trick, and the above example yields the plot

[Plot: the same diagnostic after the fix]

which is correct: the value of c interpolated onto the boundary is constant at 1, and both the halo region and the first interior point change in time.

simone-silvestri commented 1 year ago

As an aside, I think this issue illustrates that users are indeed interested in being able to evaluate gradients across boundaries. This is important because @simone-silvestri proposed a change that would make this impossible (eg it has been proposed we do not fill halos for Value and Gradient boundary conditions, and instead evaluate the associated fluxes in the same way we do for immersed boundaries --- because this has performance advantages for very large models).

I agree, but this patch-up will not work for immersed boundaries anyway. I still advocate for (maybe not now but later down the line) a general line of thought that ensures consistency between immersed boundaries and "regular" boundaries (a la MITgcm) treating them always the same way. As an example, this issue could have been brought up for immersed boundaries, which would have required a (definitely more lengthy) rework of boundaries in abstract operations but would have solved the issue in both boundaries and immersed boundaries.

simone-silvestri commented 1 year ago

I am actually not even sure that it would be possible to do easily in this case

hdrake commented 1 year ago

I am actually not even sure that it would be possible to do easily in this case

Just to be clear, this is just for offline diagnostics, right? Online diagnostic operations on prognostic fields still correctly feel the boundary conditions?

simone-silvestri commented 1 year ago

Yeah, that's right. I was referring to eliminating halo filling (except for periodic boundary conditions) and baking Value and Gradient into the abstract operations, which would allow boundary conditions to be respected in offline diagnostics on immersed boundaries too.

Flux does not work anyway, because fill_halo_regions! assumes a zero-flux boundary condition, so abstract operations in boundary regions are still quite brittle for offline diagnostics.

simone-silvestri commented 1 year ago

Also, I am unsure what a Flux boundary condition entails regarding offline diagnostics.

glwagner commented 1 year ago

Flux does not work anyway because fill_halo assumes a zero flux boundary condition, so for offline diagnostics abstract operations on boundary regions are still quite brittle

Not sure what you mean. FluxBoundaryCondition has no implications for offline diagnostics. With FluxBoundaryCondition, values on / across boundaries are not defined. That's the easiest case --- it's not brittle at all!

glwagner commented 1 year ago

As an aside, I think this issue illustrates that users are indeed interested in being able to evaluate gradients across boundaries. This is important because @simone-silvestri proposed a change that would make this impossible (eg it has been proposed we do not fill halos for Value and Gradient boundary conditions, and instead evaluate the associated fluxes in the same way we do for immersed boundaries --- because this has performance advantages for very large models).

I agree, but this patch-up will not work for immersed boundaries anyway.

What do you mean? What patch-up?

I still advocate for (maybe not now but later down the line) a general line of thought that ensures consistency between immersed boundaries and "regular" boundaries (a la MITgcm) treating them always the same way.

I agree I think that would be nice. It means that operators need to know about boundary conditions though, which is a major refactor...

As an example, this issue could have been brought up for immersed boundaries, which would have required a (definitely more lengthy) rework of boundaries in abstract operations but would have solved the issue in both boundaries and immersed boundaries.

We support values and gradients on non-immersed boundaries, but we do not support evaluating them across immersed boundaries. We have to support what we claim / say that we support, that is the only issue.

glwagner commented 1 year ago

I think to support inserting Value or Gradient directly into abstract operations would essentially entail an independent implementation from the current AbstractOperation, because while this is certainly feasible on the CPU, I suspect we will run into limitations on the GPU fairly quickly. I think if people are interested in direct numerical simulation in complex domains that would benefit from that kind of thing then this is a worthwhile endeavor and could even be prototyped in an independent repository (magic of Julia).

Supporting correct boundary evaluation for non-immersed boundaries is straightforward via rules for filling halo regions. Thus, despite the trade-offs, it makes sense to provide such a "bonus" feature: it's enabling for quite a few applications without a great cost (at least not yet, because we don't have a user interface or great support for distributed computations). Support for operations across immersed boundaries is a more complex endeavor. Because I do not think we should regard Value / Gradient boundary conditions as a "core" feature (this package is oriented toward ocean modeling from large eddy simulation up to global scales; direct numerical simulation is not our core application), the trade-off points toward not supporting this.

Especially due to finite resources for software development, many of our decisions are compromises. We don't aim to be perfect, we aim to be good.

hdrake commented 1 year ago

Thus because I do not think we should regard Value / Gradient boundary conditions as a "core" feature (this package is oriented towards ocean modeling from large eddy simulation up to global scales --- direct numerical simulation is not our core application) the trade-off points towards not supporting this.

Point taken, but I think there are still Oceananigans-relevant applications where Value / Gradient boundary conditions on non-immersed boundaries are useful enough to be a "core" feature (e.g. imposing observed SST patterns rather than observed air-sea heat fluxes), but there is always the workaround of strongly restoring boundary-adjacent sponge regions. I think this is what many ocean modelers do to implement such boundary conditions anyway.
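A sketch of that sponge-region workaround using Oceananigans' Relaxation forcing (the rate, mask, and target below are all assumed illustration values, not recommendations):

```julia
# Sketch: restoring a boundary-adjacent region toward a target temperature.
using Oceananigans

target_T(x, y, z, t) = 20.0     # hypothetical target surface temperature
mask(x, y, z) = exp(z / 10)     # strongest near the surface (z ≤ 0)
T_sponge = Relaxation(rate = 1/3600, mask = mask, target = target_T)

# used as, e.g., NonhydrostaticModel(; grid, forcing = (; T = T_sponge), ...)
```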

I can't really conceive of any reasons why one would want Value / Gradient BCs on the immersed boundary though, and agree it is not alone worth a major refactor.

glwagner commented 1 year ago

Point taken, but I think there are still Oceananigans-relevant applications where Value / Gradient boundary conditions on non-immersed boundaries are useful enough to be a "core" feature (e.g. imposing observed SST patterns rather than observed air-sea heat fluxes), but there is always the workaround of strongly restoring boundary-adjacent sponge regions. I think this is what many ocean modelers do to implement such boundary conditions anyway.

True! I'm not aware that has ever been done, but since it's not difficult to support (notwithstanding @simone-silvestri's concerns about parallel performance) it's interesting to allow it --- agree.

For future readers I want to point out that SST restoring (and similar models) is typically implemented as a FluxBoundaryCondition using a piston velocity model, rather than with ValueBoundaryCondition (which implies a flux mediated by some diffusivity / viscosity, possibly derived from a parameterization). (FluxBoundaryCondition is mathematically identical to restoring in the surface grid point, though distributing the restoring over some depth would be a slightly different model.) It could be an interesting project to explore using a parameterization-derived diffusivity together with ValueBoundaryCondition to predict surface fluxes, though; I'm not sure what the implications would be.
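A sketch of that piston-velocity formulation (the piston velocity and target value are hypothetical, and the sign should be checked against Oceananigans' flux conventions):

```julia
# Sketch: SST restoring expressed as a field-dependent FluxBoundaryCondition
# with a piston velocity: flux ∝ piston_velocity * (T_surface - T_target).
using Oceananigans

@inline restoring_flux(x, y, t, T, p) = p.piston_velocity * (T - p.T_target)

T_top_bc = FluxBoundaryCondition(restoring_flux,
                                 field_dependencies = :T,
                                 parameters = (piston_velocity = 1e-5,  # m/s, assumed
                                               T_target = 20.0))       # assumed

T_bcs = FieldBoundaryConditions(top = T_top_bc)
```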