Closed YichengDWu closed 1 year ago
It sounds like you've done most of the work -- what's left?
It's trivial to go from strides to indices, and not much harder to go from size.
If you didn't have all the information as part of the type, I'd encourage balancing sparseness of runtime representation with overspecialization. But that's a non issue when all info is known at compile time.
You can do
struct Layout{ActuallyLotsOfParams,Length} end
# or maybe <: DenseArray{T,N} instead
abstract type AbstractLayoutArray{T,N,L<:Layout} <: AbstractArray{T,N} end
# view that doesn't own memory
struct PArray{T,N,L} <: AbstractLayoutArray{T,N,L}
p::Ptr{T}
l::L # sizeof(L) == 0, so doesn't really matter; could just as easily `L()`
end
# I'm writing `l` as a field though in case you wanted to support some dynamic information like sizes or strides in the future
# view that owns memory
# operations on `OwningArray` that interact with the pointer `p` are going to
# have to `GC.@preserve` the field `m` to protect it.
struct OwningPArray{T,N,L,M} <: AbstractLayoutArray{T,N,L}
p::Ptr{T}
l::L
m::M
end
# Or, just have it wrap the PArray
struct OwningPArrayv2{T,N,L,M} <: AbstractLayoutArray{T,N,L}
p::PArray{T,N,L}
m::M
end
# no `l::L` field here, as we aren't going to allow this to be dynamically sized.
# for dynamic sizing, use the `OwningPArray` with something like `M===Vector{T}` or `M===Vector{UInt8}`.
# This array also owns memory.
# The advantage of the mutable struct here is we have 1-less pointer indirection, making it more efficient.
# use `pointer_from_objref` to get a pointer.
mutable struct LArray{T,N,Len,L} <: AbstractLayoutArray{T,N,L}
data::NTuple{Len,T}
end
Then define all your usual getindex
/setindex!
, etc making use of your layouts.
A view
of an LAarray
or OwningPArray
would return another OwningPArray
, while a non-owning PArray
would return another PArray
.
You can GC.@preserve
either owning representation, and then use a non-owning PArray
to pass through non-inlined function boundaries.
Both the owning and non-owning PArray
variants are convenient if you want to manually manage memory, e.g. carve up a big block and doll out pieces of it. This can be more convenient than managing reuse caches, like some packages do.
As an aside, I prefer having slicing with getindex
return view
s. Someone can copy
a view if they want a copy.
Thank you for your reply. Please allow me to provide more information before I carefully read your code. Ultimately, I want to create a CuTeArray that looks like this:
struct CuTeArray{T, N, A<:Engine{T}, L<:Layout{N}} <: AbstractArray{T, N}
engine::E
layout::L
end
Here, the engine represents a one-dimensional vector for the underlying data. I would like to create two types of engines, one owning and one non-owning. Below is some draft code:
abstract type Engine{T} <: DenseVector{T} end
struct ViewEngine{T, P<:Ref{T}} <: Engine{T} # non-owning
ptr::P
len::Int
end
mutable struct ArrayEngine{T, L} <: Engine{T} # owning, must have a static size
data::NTuple{L, T}
end
I am not sure how to proceed. Upon generally reading the source code of StrideArrayCore, it seems I should write the following code:
@inline Base.unsafe_convert(::Type{Ptr{T}}, A::ArrayEngine{T}) where {T} = Base.unsafe_convert(Ptr{T}, pointer_from_objref(A))
@inline Base.pointer(A::ArrayEngine{T}) where {T} = Base.unsafe_convert(Ptr{T}, pointer_from_objref(A))
Yes. You can implement getindex
and setindex!
with pointer
and GC.@preserve
.
The non-owning of course does not need GC.@preserve
, as it wouldn't do anything anyway. There should be a GC.@preserve
somewhere protecting the backed memory (unless the GC isn't tracking the memory, e.g. you have a Libc.malloc
ed pointer).
This is my first attempt, and I'm not sure if I made any mistakes or did something unnecessary.
abstract type Engine{T} <: DenseVector{T} end
@inline Base.IndexStyle(::Type{<:Engine}) = IndexLinear()
@inline Base.elsize(::Engine{T}) where {T} = sizeof(T)
struct ViewEngine{T, P} <: Engine{T} # non-owning
ptr::P
len::Int
end
@inline function ViewEngine(ptr::Ptr{T}, len::Int) where {T}
return ViewEngine{T, typeof(ptr)}(ptr, len)
end
@inline function ViewEngine(A::AbstractArray)
p = LayoutPointers.memory_reference(A)[1] # not sure what this does
return ViewEngine(p, length(A))
end
@inline Base.pointer(A::ViewEngine) = getfield(A, :ptr)
@inline function Base.unsafe_convert(p::Type{<:Ref{T}}, A::ViewEngine{T}) where {T}
return Base.unsafe_convert(p, pointer(A))
end
@inline Base.size(A::ViewEngine) = tuple(getfield(A, :len))
@inline Base.length(A::ViewEngine) = getfield(A, :len)
@inline function Base.getindex(A::ViewEngine{T}, i::Integer) where {T}
@boundscheck checkbounds(A, i)
return unsafe_load(pointer(A), i)
end
@inline function Base.setindex!(A::ViewEngine{T}, val, i::Integer) where {T}
@boundscheck checkbounds(A, i)
unsafe_store!(pointer(A), val, i)
return val
end
@inline ManualMemory.preserve_buffer(::ViewEngine) = nothing
mutable struct ArrayEngine{T, L} <: DenseVector{T} # owning
data::NTuple{L, T}
@inline ArrayEngine{T, L}(::UndefInitializer) where {T, L} = new{T, L}()
@inline function ArrayEngine{T}(::UndefInitializer, ::StaticInt{L}) where {T, L}
return ArrayEngine{T, L}(undef)
end
end
@inline function Base.unsafe_convert(::Type{Ptr{T}}, A::ArrayEngine{T}) where {T}
return Base.unsafe_convert(Ptr{T}, pointer_from_objref(A))
end
@inline function Base.pointer(A::ArrayEngine{T}) where {T}
return Base.unsafe_convert(Ptr{T}, pointer_from_objref(A))
end
@inline Base.size(::ArrayEngine{T, L}) where {T, L} = (L,)
@inline Base.length(::ArrayEngine{T, L}) where {T, L} = L
@inline Base.similar(::ArrayEngine{T, L}) where {T, L} = ArrayEngine{T, L}(undef)
@inline function Base.similar(A::ArrayEngine, ::Type{T}) where {T}
return ArrayEngine{T}(undef, static(length(A)))
end
@inline function ArrayEngine{T}(f::Function, ::StaticInt{L}) where {T, L}
A = ArrayEngine{T, L}(undef)
@inbounds for i in eachindex(A)
A[i] = f(i)
end
return A
end
@inline function ManualMemory.preserve_buffer(A::ArrayEngine)
return ManualMemory.preserve_buffer(getfield(A, :data))
end
Base.@propagate_inbounds function Base.getindex(A::ArrayEngine,
i::Union{Integer, StaticInt})
b = ManualMemory.preserve_buffer(A)
GC.@preserve b begin ViewEngine(A)[i] end
end
Base.@propagate_inbounds function Base.setindex!(A::ArrayEngine, val,
i::Union{Integer, StaticInt})
b = ManualMemory.preserve_buffer(A)
GC.@preserve b begin ViewEngine(A)[i] = val end
end
I'm not quite sure how to properly implement the pointer for the CuTeArray that packs engine
. What I'm doing now is making it point to its engine:
struct CuTeArray{T, N, E <: DenseVector{T}, L <: Layout{N}} <: AbstractArray{T, N}
engine::E
layout::L
function CuTeArray(engine::DenseVector{T}, layout::Layout{N}) where {T, N}
return new{T, N, typeof(engine), typeof(layout)}(engine, layout)
end
end
@inline function Base.unsafe_convert(::Type{Ptr{T}}, A::CuTeArray{T,N,<:ArrayEngine}) where {T,N}
return Base.unsafe_convert(Ptr{T}, pointer_from_objref(engine(A)))
end
@inline function Base.pointer(A::CuTeArray{T,N,<:ArrayEngine}) where {N,T}
return Base.unsafe_convert(Ptr{T}, pointer_from_objref(engine(A)))
end
Is this even correct?
Hi I think that I have got the hang of it so I'm closing this
Thank you for creating this package, and I hope to receive some guidance from you. Recently, I have developed CuTe.jl, which abstracts the shape and stride(s) of an array into a structure called Layout. This abstraction allows us to perform various algebraic operations on the Layout. My goal is to implement something akin to StrideArray, with the distinction that the shape and stride are already parameterized by the Layout, and thus, the data only needs to be a one-dimensional vector. Moreover, the length of the vector should also be part of the type parameters. Do you have any suggestions or advice?
Also, you are welcome the have a glance at the source code of CuTe.jl for any apparent mistakes.