meggart / DiskArrays.jl

Other
72 stars 13 forks source link

[WIP] issue DiskArrays.jl #131 #132

Closed Alexander-Barth closed 4 months ago

Alexander-Barth commented 10 months ago

In this PR, fixes the following issue

a1 = _DiskArray(zeros(5,5,10));
size(a1[[1,2],[2,3],:])
# previous output (2, 3, 2)
# now  (2,2,3)

also it is now type stable

ulia> using NCDatasets; v = NCDataset("/tmp/sample2.nc")["data"].var;
[ Info: Precompiling NCDatasets [85f8d34a-cbdd-5861-8df4-14fed0d494ab]

julia> foo(v) = v[:,:,[1]];

julia> @code_warntype foo(v)
MethodInstance for foo(::NCDatasets.Variable{Float32, 3, NCDataset{Nothing}})
  from foo(v) @ Main REPL[2]:1
Arguments
  #self#::Core.Const(foo)
  v::NCDatasets.Variable{Float32, 3, NCDataset{Nothing}}
Body::Array{Float32, 3}
1 ─ %1 = Main.:(:)::Core.Const(Colon())
│   %2 = Main.:(:)::Core.Const(Colon())
│   %3 = Base.vect(1)::Vector{Int64}
│   %4 = Base.getindex(v, %1, %2, %3)::Array{Float32, 3}
└──      return %4

julia> @which v[:,:,[1]]
getindex(a::NCDatasets.Variable, i...)
     @ NCDatasets ~/.julia/dev/DiskArrays/src/diskarray.jl:211

It avoids also reading too much of data.

I had to set the need_batch = false to avoid infinite recursion. Any advice would be appreciated.

The implementation in NCDatasets is added to current implementation of the special case:

batchgetindex(a::TA,indices::Vararg{Union{Int,Colon,AbstractRange{<:Integer},Vector{Int}},N}) where TA <: AbstractArray{T,N} where {T,N}

to that we do not remove any functionality.

meggart commented 4 months ago

I think the issues mentioned here have been fixed in other PRs

Alexander-Barth commented 4 months ago

Thanks Fabian!