CliMA / Oceananigans.jl

šŸŒŠ Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
https://clima.github.io/OceananigansDocumentation/stable
MIT License
963 stars 190 forks source link

Reductions with `FieldTimeSeries` on an `ImmersedBoundaryGrid` are very slow #3750

Open ali-ramadhan opened 2 weeks ago

ali-ramadhan commented 2 weeks ago

This also makes data_summary and Base.show very slow since showing a FieldTimeSeries prints its min, mean, and max. So it's harder to work with FieldTimeSeries interactively. Seems fine when not on a ImmersedBoundaryGrid.

I'm guessing it's slower because it's masking out the immersed values but I don't know if we expect it to be ~2000x slower than without an immersed boundary. It's those memory allocations...

A quick quality-of-life fix could be to not call data_summary when showing a FieldTimeSeries.

MWE

using Oceananigans

arch = CPU()

L = 1
H = 1

underlying_grid = LatitudeLongitudeGrid(
    arch;
    topology = (Bounded, Bounded, Bounded),
    size = (512, 512, 64),
    latitude = (-L/2, L/2),
    longitude = (-L/2, L/2),
    z = (-H, 0),
    halo = (4, 4, 4)
)

h = L/2
w = L/5 
mount(x, y) = h * exp(-x^2 / 2w^2) * exp(-y^2 / 2w^2)
bottom(x, y) = -H + mount(x, y)

grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))

model = HydrostaticFreeSurfaceModel(; grid)

simulation = Simulation(model, Ī”t=1, stop_iteration=1)

simulation.output_writers[:fields] =
    JLD2OutputWriter(
        model,
        model.velocities;
        filename = "test.jld2",
        schedule = IterationInterval(1),
        overwrite_existing = true
    )

run!(simulation)

u = FieldTimeSeries("test.jld2", "u")
u2 = u[2]

Reduction over the FieldTimeSeries:

julia> @time minimum(u2)
 20.954897 seconds (118.72 M allocations: 130.792 GiB, 25.74% gc time)
0.0

Reduction over the underlying data:

julia> @time minimum(u2.data)
  0.011304 seconds (3 allocations: 1.562 KiB)
0.0

or almost 2000x faster.

Environment

Oceananigans.jl main branch with

Julia Version 1.10.5
Commit 6f3fdf7b362 (2024-08-27 14:19 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 48 Ɨ AMD Ryzen Threadripper 7960X 24-Cores
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 48 virtual cores)
ali-ramadhan commented 2 weeks ago

I thought this might be an issue with Fields themselves, but no:

using Oceananigans

arch = CPU()

L = 1
H = 1

underlying_grid = LatitudeLongitudeGrid(
    arch;
    topology = (Bounded, Bounded, Bounded),
    size = (512, 512, 64),
    latitude = (-L/2, L/2),
    longitude = (-L/2, L/2),
    z = (-H, 0),
    halo = (4, 4, 4)
)

h = L/2
w = L/5 
mount(x, y) = h * exp(-x^2 / 2w^2) * exp(-y^2 / 2w^2)
bottom(x, y) = -H + mount(x, y)

grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))

model = HydrostaticFreeSurfaceModel(; grid)

u = model.velocities.u

then

julia> @time minimum(u)
  0.063563 seconds (344 allocations: 31.789 KiB)
0.0

julia> @time minimum(u.data)
  0.013262 seconds (3 allocations: 1.391 KiB)
0.0

Only ~6x slower.

So maybe the FieldTimeSeries is creating the field from u[2] differently?

I haven't looked into the code much but wanted to at least open this issue.

simone-silvestri commented 2 weeks ago

Reductions on FieldTimeSeries are performed individually for each element by constructing two Fields and reducing one into another. Probably, the construction of the individual field is what is causing the loss in performance? We do not necessarily need to do that, we can just wrap the data in a ConditionalOperation.

glwagner commented 2 weeks ago

Just a thought that we probably want reductions on field time series to be performant anyways. So it's better that we call data summary because then more people will be annoyed that they are slow => more pressure to fix it šŸ˜†