Help wanted on creating something similar to `StrideArray`

YichengDWu commented 1 year ago

Thank you for creating this package, and I hope to receive some guidance from you. Recently, I have developed CuTe.jl, which abstracts the shape and stride(s) of an array into a structure called Layout. This abstraction allows us to perform various algebraic operations on the Layout. My goal is to implement something akin to StrideArray, with the distinction that the shape and stride are already parameterized by the Layout, and thus, the data only needs to be a one-dimensional vector. Moreover, the length of the vector should also be part of the type parameters. Do you have any suggestions or advice?

Also, you are welcome the have a glance at the source code of CuTe.jl for any apparent mistakes.

chriselrod commented 1 year ago

It sounds like you've done most of the work -- what's left?

It's trivial to go from strides to indices, and not much harder to go from size.

If you didn't have all the information as part of the type, I'd encourage balancing sparseness of runtime representation with overspecialization. But that's a non issue when all info is known at compile time.

You can do

struct Layout{ActuallyLotsOfParams,Length} end

# or maybe <: DenseArray{T,N} instead
abstract type AbstractLayoutArray{T,N,L<:Layout} <: AbstractArray{T,N} end

# view that doesn't own memory
struct PArray{T,N,L} <: AbstractLayoutArray{T,N,L}
    p::Ptr{T}
    l::L # sizeof(L) == 0, so doesn't really matter; could just as easily `L()`
end
# I'm writing `l` as a field though in case you wanted to support some dynamic information like sizes or strides in the future

# view that owns memory
# operations on `OwningArray` that interact with the pointer `p` are going to
# have to `GC.@preserve` the field `m` to protect it.
struct OwningPArray{T,N,L,M} <: AbstractLayoutArray{T,N,L}
    p::Ptr{T}
    l::L
    m::M
end
# Or, just have it wrap the PArray
struct OwningPArrayv2{T,N,L,M} <: AbstractLayoutArray{T,N,L}
    p::PArray{T,N,L}
    m::M
end

# no `l::L` field here, as we aren't going to allow this to be dynamically sized.
# for dynamic sizing, use the `OwningPArray` with something like `M===Vector{T}` or `M===Vector{UInt8}`. 
# This array also owns memory.
# The advantage of the mutable struct here is we have 1-less pointer indirection, making it more efficient.
# use `pointer_from_objref` to get a pointer.
mutable struct LArray{T,N,Len,L} <: AbstractLayoutArray{T,N,L}
    data::NTuple{Len,T}
end

Then define all your usual getindex/setindex!, etc making use of your layouts. A view of an LAarray or OwningPArray would return another OwningPArray, while a non-owning PArray would return another PArray. You can GC.@preserve either owning representation, and then use a non-owning PArray to pass through non-inlined function boundaries. Both the owning and non-owning PArray variants are convenient if you want to manually manage memory, e.g. carve up a big block and doll out pieces of it. This can be more convenient than managing reuse caches, like some packages do.

As an aside, I prefer having slicing with getindex return views. Someone can copy a view if they want a copy.

YichengDWu commented 1 year ago

Thank you for your reply. Please allow me to provide more information before I carefully read your code. Ultimately, I want to create a CuTeArray that looks like this:

struct CuTeArray{T, N, A<:Engine{T}, L<:Layout{N}} <: AbstractArray{T, N}
    engine::E
    layout::L
end

Here, the engine represents a one-dimensional vector for the underlying data. I would like to create two types of engines, one owning and one non-owning. Below is some draft code:

abstract type Engine{T} <: DenseVector{T} end

struct ViewEngine{T, P<:Ref{T}} <: Engine{T} # non-owning
    ptr::P
    len::Int
end

mutable struct ArrayEngine{T, L} <: Engine{T}  # owning, must have a static size
    data::NTuple{L, T}
end

I am not sure how to proceed. Upon generally reading the source code of StrideArrayCore, it seems I should write the following code:

@inline Base.unsafe_convert(::Type{Ptr{T}}, A::ArrayEngine{T}) where {T} = Base.unsafe_convert(Ptr{T}, pointer_from_objref(A))
@inline Base.pointer(A::ArrayEngine{T}) where {T} = Base.unsafe_convert(Ptr{T}, pointer_from_objref(A))

chriselrod commented 1 year ago

Yes. You can implement getindex and setindex! with pointer and GC.@preserve.

The non-owning of course does not need GC.@preserve, as it wouldn't do anything anyway. There should be a GC.@preserve somewhere protecting the backed memory (unless the GC isn't tracking the memory, e.g. you have a Libc.malloced pointer).

YichengDWu commented 1 year ago

This is my first attempt, and I'm not sure if I made any mistakes or did something unnecessary.

abstract type Engine{T} <: DenseVector{T} end

@inline Base.IndexStyle(::Type{<:Engine}) = IndexLinear()
@inline Base.elsize(::Engine{T}) where {T} = sizeof(T)

struct ViewEngine{T, P} <: Engine{T} # non-owning
    ptr::P
    len::Int
end

@inline function ViewEngine(ptr::Ptr{T}, len::Int) where {T}
    return ViewEngine{T, typeof(ptr)}(ptr, len)
end

@inline function ViewEngine(A::AbstractArray)
    p = LayoutPointers.memory_reference(A)[1] # not sure what this does
    return ViewEngine(p, length(A))
end

@inline Base.pointer(A::ViewEngine) = getfield(A, :ptr)
@inline function Base.unsafe_convert(p::Type{<:Ref{T}}, A::ViewEngine{T}) where {T}
    return Base.unsafe_convert(p, pointer(A))
end

@inline Base.size(A::ViewEngine) = tuple(getfield(A, :len))
@inline Base.length(A::ViewEngine) = getfield(A, :len)

@inline function Base.getindex(A::ViewEngine{T}, i::Integer) where {T}
    @boundscheck checkbounds(A, i)
    return unsafe_load(pointer(A), i)
end

@inline function Base.setindex!(A::ViewEngine{T}, val, i::Integer) where {T}
    @boundscheck checkbounds(A, i)
    unsafe_store!(pointer(A), val, i)
    return val
end

@inline ManualMemory.preserve_buffer(::ViewEngine) = nothing

mutable struct ArrayEngine{T, L} <: DenseVector{T}  # owning
    data::NTuple{L, T}
    @inline ArrayEngine{T, L}(::UndefInitializer) where {T, L} = new{T, L}()
    @inline function ArrayEngine{T}(::UndefInitializer, ::StaticInt{L}) where {T, L}
        return ArrayEngine{T, L}(undef)
    end
end

@inline function Base.unsafe_convert(::Type{Ptr{T}}, A::ArrayEngine{T}) where {T}
    return Base.unsafe_convert(Ptr{T}, pointer_from_objref(A))
end
@inline function Base.pointer(A::ArrayEngine{T}) where {T}
    return Base.unsafe_convert(Ptr{T}, pointer_from_objref(A))
end

@inline Base.size(::ArrayEngine{T, L}) where {T, L} = (L,)
@inline Base.length(::ArrayEngine{T, L}) where {T, L} = L
@inline Base.similar(::ArrayEngine{T, L}) where {T, L} = ArrayEngine{T, L}(undef)
@inline function Base.similar(A::ArrayEngine, ::Type{T}) where {T}
    return ArrayEngine{T}(undef, static(length(A)))
end

@inline function ArrayEngine{T}(f::Function, ::StaticInt{L}) where {T, L}
    A = ArrayEngine{T, L}(undef)
    @inbounds for i in eachindex(A)
        A[i] = f(i)
    end
    return A
end

@inline function ManualMemory.preserve_buffer(A::ArrayEngine)
    return ManualMemory.preserve_buffer(getfield(A, :data))
end

Base.@propagate_inbounds function Base.getindex(A::ArrayEngine,
                                                i::Union{Integer, StaticInt})
    b = ManualMemory.preserve_buffer(A)
    GC.@preserve b begin ViewEngine(A)[i] end
end

Base.@propagate_inbounds function Base.setindex!(A::ArrayEngine, val,
                                                 i::Union{Integer, StaticInt})
    b = ManualMemory.preserve_buffer(A)
    GC.@preserve b begin ViewEngine(A)[i] = val end
end

YichengDWu commented 1 year ago

I'm not quite sure how to properly implement the pointer for the CuTeArray that packs engine. What I'm doing now is making it point to its engine:

struct CuTeArray{T, N, E <: DenseVector{T}, L <: Layout{N}} <: AbstractArray{T, N}
    engine::E
    layout::L
    function CuTeArray(engine::DenseVector{T}, layout::Layout{N}) where {T, N}
        return new{T, N, typeof(engine), typeof(layout)}(engine, layout)
    end
end
@inline function Base.unsafe_convert(::Type{Ptr{T}}, A::CuTeArray{T,N,<:ArrayEngine}) where {T,N}
    return Base.unsafe_convert(Ptr{T}, pointer_from_objref(engine(A)))
end
@inline function Base.pointer(A::CuTeArray{T,N,<:ArrayEngine}) where {N,T}
    return Base.unsafe_convert(Ptr{T}, pointer_from_objref(engine(A)))
end

Is this even correct?

YichengDWu commented 1 year ago

Hi I think that I have got the hang of it so I'm closing this

JuliaSIMD / StrideArrays.jl

Help wanted on creating something similar to `StrideArray` #73