CliMA / ClimaAtmos.jl

ClimaAtmos.jl is a library for building atmospheric circulation models that is designed from the outset to leverage data assimilation and machine learning tools. We welcome contributions!
Apache License 2.0

`extrema` gives different results on `Fields` vs parent arrays #2993

Open akshaysridhar opened 4 months ago

akshaysridhar commented 4 months ago

When running the box_density_current.yml example and querying the extrema of a scalar field (in this instance, density), the Field and its parent array give different results:

julia> simulation.integrator.u.c.ρ |> extrema
(0.0f0, 1.1547949f0)
julia> parent(simulation.integrator.u.c.ρ) |> extrema
(0.65239275f0, 1.1547949f0)

The latter is the expected result, so I suspect there is a bug in our minimum method for Fields.

For reference, typeof returns:

julia> typeof(simulation.integrator.u.c.ρ)
ClimaCore.Fields.Field{ClimaCore.DataLayouts.VIJFH{Float32, 4, SubArray{Float32, 5, CUDA.CuArray{Float32, 5, CUDA.Mem.DeviceBuffer}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}}, ClimaCore.Spaces.ExtrudedFiniteDifferenceSpace{ClimaCore.Grids.ExtrudedFiniteDifferenceGrid{ClimaCore.Grids.SpectralElementGrid2D{ClimaCore.Topologies.Topology2D{ClimaComms.SingletonCommsContext{ClimaComms.CUDADevice}, ClimaCore.Meshes.RectilinearMesh{ClimaCore.Meshes.IntervalMesh{ClimaCore.Domains.IntervalDomain{ClimaCore.Geometry.XPoint{Float32}, Nothing}, LinRange{ClimaCore.Geometry.XPoint{Float32}, Int64}}, ClimaCore.Meshes.IntervalMesh{ClimaCore.Domains.IntervalDomain{ClimaCore.Geometry.YPoint{Float32}, Nothing}, LinRange{ClimaCore.Geometry.YPoint{Float32}, Int64}}}, Vector{CartesianIndex{2}}, Matrix{Int64}, CUDA.CuArray{Tuple{Int64, Int64, Int64, Int64, Bool}, 1, CUDA.Mem.DeviceBuffer}, Vector{Tuple{Int64, Int64, Int64, Int64, Bool}}, CUDA.CuArray{Tuple{Int64, Int64}, 1, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Tuple{Bool, Int64, Int64}, 1, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, @NamedTuple{}, CUDA.CuArray{Tuple{Int64, Int64}, 1, CUDA.Mem.DeviceBuffer}}, ClimaCore.Quadratures.GLL{4}, ClimaCore.Geometry.CartesianGlobalGeometry, ClimaCore.DataLayouts.IJFH{ClimaCore.Geometry.LocalGeometry{(1, 2), ClimaCore.Geometry.XYPoint{Float32}, Float32, StaticArraysCore.SMatrix{2, 2, Float32, 4}}, 4, CUDA.CuArray{Float32, 4, CUDA.Mem.DeviceBuffer}}, ClimaCore.DataLayouts.IJFH{Float32, 4, CUDA.CuArray{Float32, 4, CUDA.Mem.DeviceBuffer}}, ClimaCore.DataLayouts.IFH{ClimaCore.Geometry.SurfaceGeometry{Float32, ClimaCore.Geometry.UVVector{Float32}}, 4, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}}, @NamedTuple{}}, 
ClimaCore.Grids.FiniteDifferenceGrid{ClimaCore.Topologies.IntervalTopology{ClimaComms.SingletonCommsContext{ClimaComms.CUDADevice}, ClimaCore.Meshes.IntervalMesh{ClimaCore.Domains.IntervalDomain{ClimaCore.Geometry.ZPoint{Float32}, Tuple{Symbol, Symbol}}, LinRange{ClimaCore.Geometry.ZPoint{Float32}, Int64}}, @NamedTuple{bottom::Int64, top::Int64}}, ClimaCore.Geometry.CartesianGlobalGeometry, ClimaCore.DataLayouts.VF{ClimaCore.Geometry.LocalGeometry{(3,), ClimaCore.Geometry.ZPoint{Float32}, Float32, StaticArraysCore.SMatrix{1, 1, Float32, 1}}, CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, ClimaCore.Grids.Flat, ClimaCore.Geometry.CartesianGlobalGeometry, ClimaCore.DataLayouts.VIJFH{ClimaCore.Geometry.LocalGeometry{(1, 2, 3), ClimaCore.Geometry.XYZPoint{Float32}, Float32, StaticArraysCore.SMatrix{3, 3, Float32, 9}}, 4, CUDA.CuArray{Float32, 5, CUDA.Mem.DeviceBuffer}}}, ClimaCore.Grids.CellCenter}}

Doing the same for the parent array gives:

julia> typeof(parent(simulation.integrator.u.c.ρ))
SubArray{Float32, 5, CUDA.CuArray{Float32, 5, CUDA.Mem.DeviceBuffer}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}

akshaysridhar commented 4 months ago

Following checks with @charleskawczynski, this appears to be an issue with a custom GPU method implementation in our DataLayouts.
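
For illustration only, here is a minimal CPU-only sketch of one way such a mismatch could arise. It is not a confirmed diagnosis. It assumes (hypothetically) that a reduction ends up reading the full backing array instead of the view over a single component. The array sizes and the names `backing` and `ρ_view` are made up for the example; only Base Julia is used, not ClimaCore or CUDA.

```julia
# Stand-in for the backing array with dims (V, I, J, F, H) and F = 2 components.
backing = zeros(Float32, 3, 2, 2, 2, 4)

# Component 1 plays the role of density; component 2 is left at zero.
backing[:, :, :, 1, :] .= 1.0f0

# A view selecting only component 1 with a UnitRange in the 4th dimension,
# mirroring the SubArray type reported above.
ρ_view = view(backing, :, :, :, 1:1, :)

# Reduction restricted to the view: only density values are seen.
println(extrema(ρ_view))          # (1.0f0, 1.0f0)

# Reduction over the whole backing array: zeros from component 2 leak in.
println(extrema(parent(ρ_view)))  # (0.0f0, 1.0f0)
```

If something similar happens in the GPU reduction path, the maximum would agree while the minimum picks up a spurious 0.0f0, which matches the pattern reported above. Whether this is the actual cause would need to be checked against the DataLayouts implementation.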

charleskawczynski commented 4 months ago

Hi @akshaysridhar, could you please add a reproducer, like:

# julia --project=examples
ENV["CLIMACOMMS_DEVICE"] = "CUDA";
empty!(ARGS)
push!(ARGS, "--config_file", "path/to/config")
using Revise; include(joinpath("examples", "hybrid", "driver.jl"))

?

charleskawczynski commented 2 months ago

bump! @akshaysridhar, was this with 1 gpu? or multiple? (do you happen to remember)

akshaysridhar commented 2 months ago

> bump! @akshaysridhar, was this with 1 gpu? or multiple? (do you happen to remember)

This was a single GPU run (clima A100) @charleskawczynski