omlins / ParallelStencil.jl

Package for writing high-level code for parallel high-performance stencil computations that can be deployed on both GPUs and CPUs
BSD 3-Clause "New" or "Revised" License
301 stars 31 forks source link

Make CellArrays mutable #110

Open TakeTwiceDailey opened 11 months ago

TakeTwiceDailey commented 11 months ago

I found the documentation on the @CellType macro to be quite interesting and potentially useful for me.

For example, if I am trying to solve a system of tensor equations, then having this ability to store the elements of a symmetric tensor on the grid like the given example @CellType SymmetricTensor2D fieldnames=(xx, zz, xz) would be useful.

But I found that (as far as I can tell from the docs and my testing) there is no way to make the cell type mutable, which is necessary when the tensor makes up the state vector of my system of equations, and thus must be modified at each time step. Since this ultimately comes from StaticArrays.jl, there are mutable types available in there, so this functionality could be added.

Also, it would be nice to have the CellType act as the tensor it is meant to represent, while still only storing the needed values. For example, defining @CellType SymmetricTensor2D fieldnames=(xx, zz, xz) I could then overload getindex

@inline function Base.getindex(A::SymmetricTensor2D,μ::Int64,ν::Int64)
    (μ,ν) == (1,1) && return A.xx
    (μ,ν) == (2,2) && return A.zz
    (μ,ν) == (1,2) && return A.xz
    (μ,ν) == (2,1) && return A.xz
end

I don't know if this is the best way to go about this, although it does work. It might be nice to be able to define an index mapping in @CellType so that general tensors can index however they are needed to.

omlins commented 10 months ago

@TakeTwiceDailey :

first of all, to be sure that we are on the same page, note that @CellType and the celldims key word of the allocator functions rely on the package CellArrays which defines the type CellArray.

CellArray creates the StaticArray cells on the fly when accessed. Mutable StaticArrays cannot be created in GPU kernels as they would require heap allocation (besides, performance would certainly be terrible in this use case). Thus, only immutable types are used. Nevertheless, CellArrays' content is mutable. The following example shows three ways of content mutation (the example does not use ParallelStencil, but the same applies also in this case):

using CellArrays, StaticArrays

function main_CellArrays()
    nel      = 2
    celldims = (3, 3)
    Cell     = SMatrix{celldims..., Float64, prod(celldims)}
    C        = CPUCellArray{Cell}(undef, nel) 
    C.data  .= 0.0

    # The aims is to change te values of each 3x3 matrix
    # 1 - with a double loop for each element: it works
    for e=1:nel
        for i=1:celldims[1]
            for j=1:celldims[2]
                field(C,i,j)[e] = 1.0
            end
        end
    end
    @show C

    # 2 - assigning cell by cell
    for e=1:nel
        C[e] = Cell(ones(celldims))
    end
    @show C

    # 3 - assiging all data at once
    C.data .= 9.0
    @show C
end
main_CellArrays()

Check out the documentation of CellArrays for more information.

Also, it would be nice to have the CellType act as the tensor it is meant to represent, while still only storing the needed values. For example, defining @CellType SymmetricTensor2D fieldnames=(xx, zz, xz) I could then overload getindex (...)

I think i think it should work to create an alternative indexing on top of the indexing created by @CellType. If you can work something out that is robust and performance, then we would be happy for a pull request in CellArrays.jl for this additional functionality on top of the existing functionality.

TakeTwiceDailey commented 10 months ago

@omlins

Thanks, I understand much better how CellArrays is meant to work now.

I find too often I try to reinvent the wheel. When in doubt, someone smarter in Julia than I has thought of these things.

For my purposes, storing the state vector of my system of equations in a StructArray from StructArrays.jl with elements of type Tensor from Tensorial.jl has all of the functionality I want, and the storage and access efficiency has been optimized behind the scenes.

omlins commented 10 months ago

@TakeTwiceDailey: StructArray from StructArrays.jl performs very badly on GPU (and I also ran into some errors on GPU). This is actually the very reason why I created the package CellArrays (I had first experimented with StructArrays and did some performance tests on GPU). If you only want to run on CPU, then both performance and functionality of StructArray should be good.

TakeTwiceDailey commented 10 months ago

@omlins Oh, I see.

I would certainly like to run on a GPU at some point, but I have many problems to surmount before what I want to do is working on the CPU.

If I were to run this on the GPU, we would not only need to make CellArrays act like tensors, we would need to make @einsum work properly on them as well. I'm afraid I am not a software developer in any sense, so a lot of this machinery is way over my head.