Closed milankl closed 2 months ago
I've just @propagate_inbounds
like
Base.@propagate_inbounds Base.getindex(S::SpectralVariable4D, i::Integer, j::Integer, k::Integer, t::Integer) = S.data[k, t][i, j]
which I believe is correct because then all boundschecks are just done for the underlying vectors/matrices in there, or not at all with @inbounds
. But I haven't tested any of that yet manually, or included some CIs
The advantage I see when GridVariable3D
is a vector of vectors is that a lot of functionality is layer-based because it's a hydrostatic model. Then you can do vor[k]
or layer(vor, k)
(or whatever) and we'd get a vector for that layer back without allocations or the need to handle subarrays or views.
It also means that you can more easily reuse the internal functions and expose them to the user. If our spectral transforms are only defined between GridVariable3D
and SpectralVariable3D
we'd need to wrap and unwrap those. Maybe not that much of a problem, but still additional burden on the interface
This PR is also primarily thought for the upcoming GPU version, right?
Yes, but also generally to index variables simply as vor[ij,k]
so that we can write our functions based on some high-level indexing logic but then swap out the underlying data structure simply by redefining getindex
and setindex!
The advantage I see when
GridVariable3D
is a vector of vectors is that a lot of functionality is layer-based because it's a hydrostatic model. Then you can dovor[k]
orlayer(vor, k)
(or whatever) and we'd get a vector for that layer back without allocations or the need to handle subarrays or views.
I think avoiding allocations from indexing the N-dimensional isn't really that hard, if it's properly tested. I admit, that I am also sometimes a bit confused when Julia allocates something and when not, but we can definitely set it up in a way that that's not the case, and could also set up convenience functions like
layer(vor, k) = view(vor, :, k)
etc for that.
The more I think about it, the more I like the idea of just having N-dimensional Grid and LowerTriangularMatrix. It makes sure everything is contigous in memory, which (at least for GPU) I am almost certain will be an advantage for the performance, especially if we want to write kernels over the layer dimension as well. It would also make the code basis smaller and introduce less new types. You'd just have AbstractGrid
and LowerTriangularMatrix
which are new N-dimensional array types, and then PrognosticVariables
has only those types as fields.
So here I've defined
GridVariable3D
(widely used)GridVariable4D
(not used and I don't currently see a use case)SpectralVariable3D
(widely used)SpectralVariable4D
(4th dimension only of length 2 for timesteps)meaning that performance-related, we'd really only care about the 3D cases, the SpectralVariable4D
could just be a 2-element Vector{SpectralGrid3D}
which gets unpacked depending on the leapfrog time step.
We could write everything for the 3D structs using [ij, k]
(grids) and [l, m, k]
as well as [lm, k]
(spectral) for indexing, including the transforms. I believe that could be a general interface and then we can still play around with how exactly to setup the memory layout underneath. It's just a massive restructuring of all kernels which I'd like to do only once. Especially with all the places that rely on the current diagnostic_variables.layers[k].grid_variables.vor[ij]
syntax
It's just a massive restructuring of all kernels which I'd like to do only once.
Absolutely! That's also why I was directly asking whether this is primarily for the GPU version, because this is something that should work for it as well then.
We could write everything for the 3D structs using [ij, k] (grids) and [l, m, k] as well as [lm, k] (spectral) for indexing, including the transforms. [...] Especially with all the places that rely on the current diagnostic_variables.layers[k].grid_variables.vor[ij] syntax
I was thinking that exactly this is actually a major part of making the model GPU compatible, as the old structure isn't really ideally suited for it, and possibly should be done together with writing the actual GPU kernels. At least I think it would also save a lot of work to only do it once it that regard as well.
Yes, it should be reasonably future-proof restructuring for GPU and I believe we can still do the current multi-threading by writing the kernels as @floop for k in eachlayer(grid)
or similar, which moves the @floop
macros further towards to kernels, which is fine I believe.
I think we need to make a better informed decision by benchmarking. Once we just use the indexing as outlined above, it's just a matter of redefining getindex
, setindex!
. But to avoid any pitfalls we should test it. It's a bit of work to restructure the transforms, but maybe we can start with simpler 3D kernels to get a feel for how much difference it actually makes.
Yes, the beauty of Julia is that you can really manipulate all of those things and implement it in this very generic way.
I am just not sure if we (or you) really want to invest this time now, or not wait until we tackle the full GPU implementation. And for the full GPU implementation I'd still go bottom to top and start with the most low-level parts, because this allows us to actually test things on GPU as we go.
I am just not sure if we (or you) really want to invest this time now, or not wait until we tackle the full GPU implementation.
Sure, when do you want to start :wink:
Just had a call with @maximilian-gelbrecht we concluded the following
SpeedyWeather
should use [ij,k,t]
(grid) [l,m,k,t]
or [lm,k,t]
(spectral) indexing to have an abstract interface of a variable, e.g. prognostic_variables.vor
that would be independent of the underlying data structure.struct FullGaussianND{Eltype, Ndims, ArrayType} <: AbstractArray{Eltype,Ndims}
data::ArrayType{Eltype,Ndims}
nlat_half::Int # resolution parameter for grids
rings::ArrayType{UnitRange{Int},1} # precalculated vector for indexing rings, e.g. [1:20, 21:44,...]
end
# current FullGaussianGrid would be a subtype
FullGaussianGrid{T} = FullGaussianND{T, 1, Vector}
getindex
, setindex
and maybe some simple kernels as benchmarks.SpeedyTransforms
kernels till we know how we want it with respect to a GPU implementationNew LowerTriangularArray
being implemented at #503 which would replace SpectralVariable3D
/SpectralVariable4D
here. However, the changes to the PrognosticVariables
and DiagnosticVariables
should still be implemented just with those new array types!
I'd really like instead of this (currently)
humid = simulation.diagnostic_variables.layers[end].grid_variables.humid_grid
this to work
humid = simulation.diagnostic_variables.grid.humid[:,end]
returning surface humidity as ::AbstractGrid
superceded by #525
This introduces
GridVariable3D
,GridVariable4D
,SpectralVariable3D
,SpectralVariable4D
, four types that extend the horizontal grids and horizontalLowerTriangularMatrices
for the spherical harmonics in 1-2 dimensions. At the moment we still have a layer-oriented structure which puts the layer on the outside, i.e. we have dimensions currently structured like (x, y for grids, technically xy as they only have one index, or equivalently l, m for spectral)but I think we want for the future (an GPUs)
Now one can create a
GridVariable3D
(88 grid points, 5 layers) asa
GridVariable4D
similarly (88 grid points, 5 layers, 2 timesteps)a
SpectralVariable3D
similarly (3x3 LowerTriangularMatrix with 2 layers)or a
SpectralVariable4D
3x3 LowerTriangularMatrix with 2 layers and 1 timestep