CliMA / Oceananigans.jl

🌊 Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
https://clima.github.io/OceananigansDocumentation/stable
MIT License
966 stars 190 forks source link

Bug when adding the interior of fields with non-trivial indices in OutputWriters #3260

Open navidcy opened 1 year ago

navidcy commented 1 year ago

I stumbled on this trying to save the free surface for a HydrostaticFreeSurfaceModel. Here's an example:

using Oceananigans

model = HydrostaticFreeSurfaceModel(grid = RectilinearGrid(size = (5, 5, 4), x = (-1e3, 1e3), y = (-1e3, 1e3), z = (-1e3, 0)))

ηᵢ(x, y, z) = 1 * exp(-(x^2 + y^2) / 2 * (2e2)^2)

set!(model, η = ηᵢ)

simulation = Simulation(model, Δt=100, stop_time = 1000)

simulation.output_writers[:surface] = JLD2OutputWriter(model, (η=model.free_surface.η,),
                                                       schedule = TimeInterval(200),
                                                       filename = "surface",
                                                       overwrite_existing = true)
ERROR: BoundsError: attempt to access 11×11×1 Array{Float64, 3} at index [4:8, 4:8, 4:8]
Stacktrace:
  [1] throw_boundserror(A::Array{Float64, 3}, I::Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}})
    @ Base ./abstractarray.jl:744
  [2] checkbounds
    @ ./abstractarray.jl:709 [inlined]
  [3] view
    @ ./subarray.jl:177 [inlined]
  [4] offset_windowed_data(data::OffsetArrays.OffsetArray{Float64, 3, Array{Float64, 3}}, Loc::Tuple{DataType, DataType, DataType}, grid::RectilinearGrid{Float64, Periodic, Periodic, Bounded, Float64, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, ...
    @ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:245
  [5] view(f::Field{Center, Center, Face, Nothing, RectilinearGrid{Float64, Periodic, Periodic, Bounded, Float64, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, OffsetArrays.OffsetVector{Float64, ...
    @ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:312
  [6] #Field#13
    @ ~/Oceananigans.jl/src/Fields/field.jl:179 [inlined]
  [7] Field
    @ ~/Oceananigans.jl/src/Fields/field.jl:179 [inlined]
  [8] construct_output(user_output::Field{Center, Center, Face, Nothing, RectilinearGrid{Float64, Periodic, Periodic, Bounded, Float64, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, ...
    @ Oceananigans.OutputWriters ~/Oceananigans.jl/src/OutputWriters/output_construction.jl:46
  [9] (::Oceananigans.OutputWriters.var"#28#29"{Tuple{Colon, Colon, Colon}, Bool, HydrostaticFreeSurfaceModel{Oceananigans.TimeSteppers.QuasiAdamsBashforth2TimeStepper{Float64, NamedTuple{(:u, :v, :η, :T, :S), Tuple{Field{Face, Center, Center, Nothing, ...
    @ Oceananigans.OutputWriters ./none:0
 [10] iterate
    @ ./generator.jl:47 [inlined]
 [11] merge(a::NamedTuple{(), Tuple{}}, itr::Base.Generator{Tuple{Symbol}, Oceananigans.OutputWriters.var"#28#29"{Tuple{Colon, Colon, Colon}, Bool, HydrostaticFreeSurfaceModel{Oceananigans.TimeSteppers....
    @ Base ./namedtuple.jl:303
 [12] NamedTuple(itr::Base.Generator{Tuple{Symbol}, Oceananigans.OutputWriters.var"#28#29"{Tuple{Colon, Colon, Colon}, Bool, HydrostaticFreeSurfaceModel{Oceananigans.TimeSteppers.QuasiAdamsBashforth2TimeStepper{Float64, ...
    @ Base ./namedtuple.jl:123
 [13] JLD2OutputWriter(model::HydrostaticFreeSurfaceModel{Oceananigans.TimeSteppers.QuasiAdamsBashforth2TimeStepper{Float64, NamedTuple{(:u, :v, :η, :T, :S), Tuple{Field{Face, Center, Center, Nothing, RectilinearGrid{Float64, Periodic, Periodic, Bounded, Float64, Float64, ....
    @ Oceananigans.OutputWriters ~/Oceananigans.jl/src/OutputWriters/jld2_output_writer.jl:179
 [14] top-level scope
    @ REPL[10]:1

However, if I include halos in the output everything seems OK...

simulation.output_writers[:surface] = JLD2OutputWriter(model, (η=model.free_surface.η,),
                                                              schedule = TimeInterval(200),
                                                              with_halos = true,
                                                              filename = "surface",
                                                              overwrite_existing = true)
JLD2OutputWriter scheduled on TimeInterval(3.333 minutes):
├── filepath: ./surface.jld2
├── 1 outputs: η
├── array type: Array{Float64}
├── including: [:grid, :coriolis, :buoyancy, :closure]
└── max filesize: Inf YiB
navidcy commented 6 months ago

cc @josuemtzmo

navidcy commented 6 months ago

@josuemtzmo is running on the same issue. Is there a quick fix to this so they can at least use it temporarily to output free surface elevation?

josuemtzmo commented 6 months ago

Thanks @navidcy!

navidcy commented 6 months ago

I actually saw the linked PR and from its title I inferred that if you give with_halos = true to the output writer then it should work; could you try this out? Then the horizontal halos of the free surface will be saved so you might have to take care when post-processing...

josuemtzmo commented 6 months ago

When using with_halos = true in the JLD2OutputWriter works well, however in NetCDFOutputWriter it crashes with the following error:

[ Info: Initializing simulation...
Iteration: 0000, time: 0 seconds, Δt: 11 seconds, wall time: 0 seconds
ERROR: LoadError: DimensionMismatch: new dimensions (156, 156, 22, 1) must be consistent with array size 24336
Stacktrace:
  [1] (::Base.var"#throw_dmrsa#328")(dims::NTuple{4, Int64}, len::Int64)
    @ Base ./reshapedarray.jl:41
  [2] reshape(a::Array{Float64, 3}, dims::NTuple{4, Int64})
    @ Base ./reshapedarray.jl:45
  [3] setindex_disk!(::NCDatasets.Variable{Float64, 4, NCDataset{Nothing}}, ::Array{Float64, 3}, ::Function, ::Vararg{Any})
    @ DiskArrays /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/DiskArrays/bZBJE/src/diskarray.jl:56
  [4] setindex!
    @ /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/DiskArrays/bZBJE/src/diskarray.jl:229 [inlined]
  [5] setindex!(::CommonDataModel.CFVariable{…}, ::Array{…}, ::Colon, ::Colon, ::Colon, ::UnitRange{…})
    @ CommonDataModel /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/CommonDataModel/pO4st/src/cfvariable.jl:419
  [6] save_output!(ds::NCDataset{…}, output::Field{…}, model::HydrostaticFreeSurfaceModel{…}, ow::NetCDFOutputWriter{…}, time_index::Int64, name::String)
    @ Oceananigans.OutputWriters /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/Oceananigans/17XSY/src/OutputWriters/netcdf_output_writer.jl:479
  [7] write_output!(ow::NetCDFOutputWriter{…}, model::HydrostaticFreeSurfaceModel{…})
    @ Oceananigans.OutputWriters /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/Oceananigans/17XSY/src/OutputWriters/netcdf_output_writer.jl:518
  [8] initialize!(sim::Simulation{…})
    @ Oceananigans.Simulations /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/Oceananigans/17XSY/src/Simulations/run.jl:212
  [9] time_step!(sim::Simulation{…})
    @ Oceananigans.Simulations /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/Oceananigans/17XSY/src/Simulations/run.jl:112
 [10] run!(sim::Simulation{…}; pickup::Bool)
    @ Oceananigans.Simulations /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/Oceananigans/17XSY/src/Simulations/run.jl:97
 [11] run!(sim::Simulation{…})
    @ Oceananigans.Simulations /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/Oceananigans/17XSY/src/Simulations/run.jl:85
 [12] top-level scope
    @ ~/github/single_eddy/scripts/run_model_hydro.jl:167
 [13] include(fname::String)
    @ Base.MainInclude ./client.jl:489
 [14] top-level scope
    @ REPL[4]:1
 [15] top-level scope
    @ /home/datawork-lops-drakkarcom/SIMULATION-OUTPUTS/ICE-CHANEL/.julia/packages/CUDA/rXson/src/initialization.jl:208
in expression starting at /home1/datahome/jmartine/github/single_eddy/scripts/run_model_hydro.jl:167
Some type information was truncated. Use `show(err)` to see complete types.
jagoosw commented 6 months ago

I think this is the same bug as https://github.com/CliMA/Oceananigans.jl/issues/2770

josuemtzmo commented 6 months ago

I'm a bit confused, I managed to solve the issue by changing manually the grid passed to output_indices at: https://github.com/CliMA/Oceananigans.jl/blob/e243e5b44ea0c4570ae427bdce1d68dc29594f24/src/OutputWriters/output_construction.jl#L32-L42 For example, in the example provided by @navidcy:

julia> size(grid)
(5, 5, 4)

That is passed to the output_indices function that outputs the indices (1:5, 1:5, 1:5) or (4:8, 4:8, 4:8) once the halos are taken into account. The way I managed to save $\eta$ in a hacky way was by changing the size(grid) to (5, 5, -1), so it outputs the surface index in z, since the size(eta) is (5, 5, 1) My two ideas on how to fixing this are:

Before I do any of this changes, I don't understand how are the halos (applied in all directions) implemented in a surface such as $\eta$? I guess the same will apply to any flux at the surface right?

Do you have any other way to fix this?

glwagner commented 6 months ago

There is definitely a wrong assumption somewhere in this pipeline.

Maybe we can focus on this:

since it try to access the indexes (4:8, 4:8, 4:5) rather than (4:8, 4:8, 1:0)

First of all I think this means that we are using with_halo=true? Just want to confirm that.

Second there is a typo right @josuemtzmo ? You meant to say that the indices should be (4:8, 4:8, 1:1)?

This is maybe where the incorrect assumption is. If we need (4:8, 4:8, 1:1), then we are looking for the indices of the underlying view --- but not the indices of the Field. Because, the indices of eta are definitely (4:8, 4:8, 4:5). That's the whole point of the windowed fields abstraction is to be able to properly locate a field in the 3D index space. eta is the free surface, so it's indices are at the top of the domain.

@josuemtzmo can you show the whole stack trace of your error so we can see where the indexing issue comes in?

josuemtzmo commented 6 months ago

There is definitely a wrong assumption somewhere in this pipeline.

Maybe we can focus on this:

since it try to access the indexes (4:8, 4:8, 4:5) rather than (4:8, 4:8, 1:0)

I had a typo here, the indexes that it tries to access are (4:8, 4:8, 4:8) as if the field was 3D (See the error below).

First of all I think this means that we are using with_halo=true? Just want to confirm that.

Nope, that is without the halo (with_halo=False), when using with_halo=true the indexes passed are (Colon(), Colon(), Colon()) so there is no issue accessing the indexes of the output.

Second there is a typo right @josuemtzmo ? You meant to say that the indices should be (4:8, 4:8, 1:1)?

When I print the indexes I get (4:8, 4:8, 1:0) but I agree that there is something strange, since I also expected what you said (4:8, 4:8, 1:1).

This is maybe where the incorrect assumption is. If we need (4:8, 4:8, 1:1), then we are looking for the indices of the underlying view --- but not the indices of the Field. Because, the indices of eta are definitely (4:8, 4:8, 4:5). That's the whole point of the windowed fields abstraction is to be able to properly locate a field in the 3D index space. eta is the free surface, so it's indices are at the top of the domain.

That's likely the case, since it seems that we are accessing the indexes of the view.

For example, doing eta.indices I get (Colon(), Colon(), 5:5). However, the only way I manage to make work Field(eta, indices = indices) is using indices = (4:8,4:8,-2). I think the reason it only works by pass a -2 (z index) results from the fact that the view uses -2+3, where 3 is the size of the halo. I've confirmed that changing the halo in the RectilinearGrid, changes the value of the (z index) to pass to access the Field.

julia> Field(eta, indices = (1:5,1:5,-2))
5×5×1 Field{Center, Center, Face} on RectilinearGrid on CPU
├── grid: 5×5×4 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on CPU with 3×3×3 halo
├── boundary conditions: FieldBoundaryConditions
│   └── west: Nothing, east: Nothing, south: Nothing, north: Nothing, bottom: Nothing, top: Nothing, immersed: ZeroFlux
├── indices: (1:5, 1:5, -2:-2)
└── data: 5×5×1 OffsetArray(view(::Array{Float64, 3}, 4:8, 4:8, 1:1), 1:5, 1:5, -2:-2) with eltype Float64 with indices 1:5×1:5×-2:-2
    └── max=1.0, min=0.0, mean=0.04

Meanwhile if I set the indices to (1:5,1:5,2) to access the supposedly eta.indices, I get:

Field(eta, indices = (1:5,1:5,2))
ERROR: BoundsError: attempt to access 11×11×1 Array{Float64, 3} at index [4:8, 4:8, 5:5]
Stacktrace:
 [1] throw_boundserror(A::Array{Float64, 3}, I::Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}})
   @ Base ./abstractarray.jl:737
 [2] checkbounds
   @ ./abstractarray.jl:702 [inlined]
 [3] view
   @ ./subarray.jl:184 [inlined]
 [4] offset_windowed_data(data::OffsetArrays.OffsetArray{…}, Loc::Tuple{…}, grid::RectilinearGrid{…}, indices::Tuple{…})
   @ Oceananigans.Fields ~/github/Oceananigans.jl/src/Fields/field.jl:248
 [5] view(f::Field{Center, Center, Face, Nothing, RectilinearGrid{…}, Tuple{…}, OffsetArrays.OffsetArray{…}, Float64, FieldBoundaryConditions{…}, Nothing, Oceananigans.Fields.FieldBoundaryBuffers{…}}, i::UnitRange{Int64}, j::UnitRange{Int64}, k::Int64)
   @ Oceananigans.Fields ~/github/Oceananigans.jl/src/Fields/field.jl:316
 [6] #Field#15
   @ ~/github/Oceananigans.jl/src/Fields/field.jl:182 [inlined]
 [7] top-level scope
   @ REPL[39]:1
Some type information was truncated. Use `show(err)` to see complete types.

@josuemtzmo can you show the whole stack trace of your error so we can see where the indexing issue comes in? Yes, here it is:

ERROR: LoadError: BoundsError: attempt to access 11×11×1 Array{Float64, 3} at index [4:8, 4:8, 4:8]
Stacktrace:
[1] throw_boundserror(A::Array{Float64, 3}, I::Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}})
@ Base ./abstractarray.jl:737
[2] checkbounds
@ ./abstractarray.jl:702 [inlined]
[3] view
@ ./subarray.jl:184 [inlined]
[4] offset_windowed_data(data::OffsetArrays.OffsetArray{…}, Loc::Tuple{…}, grid::RectilinearGrid{…}, indices::Tuple{…})
@ Oceananigans.Fields ~/github/Oceananigans.jl/src/Fields/field.jl:248
[5] view(f::Field{Center, Center, Face, Nothing, RectilinearGrid{…}, Tuple{…}, OffsetArrays.OffsetArray{…}, Float64, FieldBoundaryConditions{…}, Nothing, Oceananigans.Fields.FieldBoundaryBuffers{…}}, i::UnitRange{Int64}, j::UnitRange{Int64}, k::UnitRange{Int64})
@ Oceananigans.Fields ~/github/Oceananigans.jl/src/Fields/field.jl:316
[6] Field
@ ~/github/Oceananigans.jl/src/Fields/field.jl:182 [inlined]
[7] construct_output(user_output::Field{…}, grid::RectilinearGrid{…}, user_indices::Tuple{…}, with_halos::Bool)
@ Oceananigans.OutputWriters ~/github/Oceananigans.jl/src/OutputWriters/output_construction.jl:50
[8] (::Oceananigans.OutputWriters.var"#28#29"{Tuple{…}, Bool, HydrostaticFreeSurfaceModel{…}})(name::Symbol)
@ Oceananigans.OutputWriters ./none:0
[9] iterate
@ ./generator.jl:47 [inlined]
[10] merge(a::@NamedTuple{}, itr::Base.Generator{Tuple{Symbol}, Oceananigans.OutputWriters.var"#28#29"{Tuple{…}, Bool, HydrostaticFreeSurfaceModel{…}}})
@ Base ./namedtuple.jl:364
[11] NamedTuple(itr::Base.Generator{Tuple{Symbol}, Oceananigans.OutputWriters.var"#28#29"{Tuple{…}, Bool, HydrostaticFreeSurfaceModel{…}}})
@ Base ./namedtuple.jl:151
[12] JLD2OutputWriter(model::HydrostaticFreeSurfaceModel{…}, outputs::@NamedTuple{…}; filename::String, schedule::TimeInterval, dir::String, indices::Tuple{…}, with_halos::Bool, array_type::Type, file_splitting::Oceananigans.OutputWriters.NoFileSplitting, overwrite_existing::Bool, init::typeof(Oceananigans.OutputWriters.noinit), including::Vector{…}, verbose::Bool, part::Int64, jld2_kw::Dict{…})
@ Oceananigans.OutputWriters ~/github/Oceananigans.jl/src/OutputWriters/jld2_output_writer.jl:190
[13] top-level scope
@ ~/github/Oceananigans.jl/test.jl:19
[14] include(fname::String)
@ Base.MainInclude ./client.jl:489
[15] top-level scope
@ REPL[1]:1
in expression starting at /Users/jmtzmo/github/Oceananigans.jl/test.jl:19
Some type information was truncated. Use `show(err)` to see complete types.
josuemtzmo commented 6 months ago

As far as I can tell, I think this is related to the issue @jagoosw mentioned too, where the indices that the output is trying to access does not match the size of the sliced fields, similar to what we are seeing here.

josuemtzmo commented 6 months ago

A simple fix to manage to output $\eta$ will be to pass indices=(:,:,1-grid.Hz) in the JLD2OutputWriter. i.e.:

simulation.output_writers[:surface] = JLD2OutputWriter(model, 
                                                       (η=eta,),
                                                       indices=(:,:,,1-grid.Hz),
                                                       schedule = TimeInterval(200),
                                                       filename = "surface",
                                                       with_halos = false,
                                                       overwrite_existing = true)
glwagner commented 6 months ago

Nope, that is without the halo (with_halo=False), when using with_halo=true the indexes passed are (Colon(), Colon(), Colon()) so there is no issue accessing the indexes of the output.

Sorry, I meant with_halo=false

glwagner commented 6 months ago

A simple fix to manage to output η will be to pass indices=(:,:,1-grid.Hz) in the JLD2OutputWriter. i.e.:

simulation.output_writers[:surface] = JLD2OutputWriter(model, 
                                                       (η=eta,),
                                                       indices=(:,:,,1-grid.Hz),
                                                       schedule = TimeInterval(200),
                                                       filename = "surface",
                                                       with_halos = false,
                                                       overwrite_existing = true)

Good to know but definitely we want to fix the underlying issue