rafaqz / DimensionalData.jl

Named dimensions and indexing for Julia arrays and other data
https://rafaqz.github.io/DimensionalData.jl/stable/
MIT License

error showing DimArray #670

Open bjarthur opened 3 months ago

bjarthur commented 3 months ago

I'm getting an "ERROR: ArgumentError: Chunk sizes must be strictly positive" error when a DimArray is shown, yet the chunk sizes appear to be positive.

julia> using DimensionalData, YAXArrays, YAXArrayBase, Zarr, DiskArrays

julia> A = rand(X(6), Y(5), Z(4), metadata = Dict{String, Any}())
╭───────────────────────────╮
│ 6×5×4 DimArray{Float64,3} │
├───────────────────── dims ┤
  ↓ X, → Y, ↗ Z
├───────────────── metadata ┤
  Dict{String, Any}()
└───────────────────────────┘
[:, :, 1]
 0.784275  0.125566   0.397801  0.807615   0.393778
 0.255679  0.32185    0.919381  0.932538   0.974425
 0.997114  0.577818   0.808486  0.0384444  0.56319
 0.589664  0.661931   0.38077   0.037413   0.0355789
 0.440041  0.0455827  0.601652  0.165645   0.619262
 0.810021  0.187474   0.585673  0.509805   0.624613

julia> B = yaxconvert(YAXArray, A)
╭───────────────────────────╮
│ 6×5×4 YAXArray{Float64,3} │
├───────────────────── dims ┤
  ↓ X, → Y, ↗ Z
├───────────────── metadata ┤
  Dict{String, Any}()
├──────────────── file size ┤ 
  file size: 960.0 bytes
└───────────────────────────┘

julia> C = setchunks(B, (2,5,4))
╭───────────────────────────╮
│ 6×5×4 YAXArray{Float64,3} │
├───────────────────── dims ┤
  ↓ X, → Y, ↗ Z
├───────────────── metadata ┤
  Dict{String, Any}()
├──────────────── file size ┤ 
  file size: 960.0 bytes
└───────────────────────────┘

julia> savecube(C, "foo.zarr", driver=:zarr)
╭───────────────────────────╮
│ 6×5×4 YAXArray{Float64,3} │
├───────────────────── dims ┤
  ↓ X, → Y, ↗ Z
├───────────────── metadata ┤
  Dict{String, Any}()
├──────────────── file size ┤ 
  file size: 960.0 bytes
└───────────────────────────┘

julia> D = Cube("foo.zarr")
╭───────────────────────────╮
│ 6×5×4 YAXArray{Float64,3} │
├───────────────────────────┴─────────────────────── dims ┐
  ↓ X Sampled{Int64} 1:1:6 ForwardOrdered Regular Points,
  → Y Sampled{Int64} 1:1:5 ForwardOrdered Regular Points,
  ↗ Z Sampled{Int64} 1:1:4 ForwardOrdered Regular Points
├─────────────────────────────────────────────── metadata ┤
  Dict{String, Any} with 2 entries:
  "name"       => "layer"
  "_FillValue" => 1.0e32
├────────────────────────────────────────────── file size ┤ 
  file size: 960.0 bytes
└─────────────────────────────────────────────────────────┘

julia> E = DimArray(D.data, dims(D))
╭───────────────────────────╮
│ 6×5×4 DimArray{Float64,3} │
├───────────────────────────┴─────────────────────── dims ┐
  ↓ X Sampled{Int64} 1:1:6 ForwardOrdered Regular Points,
  → Y Sampled{Int64} 1:1:5 ForwardOrdered Regular Points,
  ↗ Z Sampled{Int64} 1:1:4 ForwardOrdered Regular Points
└─────────────────────────────────────────────────────────┘
[:, :, 1]
Error showing value of type DimArray{Float64, 3, Tuple{X{DimensionalData.Dimensions.Lookups.Sampled{Int64, StepRange{Int64, Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}, Y{DimensionalData.Dimensions.Lookups.Sampled{Int64, StepRange{Int64, Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}, Z{DimensionalData.Dimensions.Lookups.Sampled{Int64, StepRange{Int64, Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}}, Tuple{}, ZArray{Float64, 3, Zarr.BloscCompressor, DirectoryStore}, DimensionalData.NoName, DimensionalData.Dimensions.Lookups.NoMetadata}:
ERROR: ArgumentError: Chunk sizes must be strictly positive
Stacktrace:
  [1] RegularChunks
    @ ~/.julia/packages/DiskArrays/bZBJE/src/chunks.jl:26 [inlined]
  [2] subsetchunks(r::DiskArrays.RegularChunks, subs::UnitRange{Int64})
    @ DiskArrays ~/.julia/packages/DiskArrays/bZBJE/src/chunks.jl:50
  [3] #87
    @ ./none:0 [inlined]
  [4] iterate
    @ ./generator.jl:47 [inlined]
  [5] grow_to!(dest::Vector{DiskArrays.ChunkType}, itr::Base.Generator{Base.Iterators.Filter{…}, DiskArrays.var"#87#89"{…}})
    @ Base ./array.jl:907
  [6] collect
    @ ./array.jl:831 [inlined]
  [7] eachchunk_view
    @ ~/.julia/packages/DiskArrays/bZBJE/src/subarray.jl:32 [inlined]
  [8] eachchunk
    @ ~/.julia/packages/DiskArrays/bZBJE/src/subarray.jl:27 [inlined]
  [9] (::DiskArrays.var"#66#71")(ar::DiskArrays.SubDiskArray{Float64, 2, ZArray{…}, Tuple{…}, false})
    @ DiskArrays ~/.julia/packages/DiskArrays/bZBJE/src/broadcast.jl:56
 [10] _all(f::DiskArrays.var"#66#71", itr::Vector{AbstractMatrix{Float64}}, ::Colon)
    @ Base ./reduce.jl:1288
 [11] all(f::Function, a::Vector{AbstractMatrix{Float64}}; dims::Function)
    @ Base ./reducedim.jl:1023
 [12] common_chunks(::Tuple{Int64, Int64}, ::SubArray{Float64, 2, Matrix{…}, Tuple{…}, false}, ::Vararg{Any})
    @ DiskArrays ~/.julia/packages/DiskArrays/bZBJE/src/broadcast.jl:56
 [13] copyto!(dest::SubArray{…}, bc::Base.Broadcast.Broadcasted{…})
    @ DiskArrays ~/.julia/packages/DiskArrays/bZBJE/src/broadcast.jl:37
 [14] materialize!
    @ ./broadcast.jl:914 [inlined]
 [15] materialize!
    @ ./broadcast.jl:911 [inlined]
 [16] _copyto!(dest::Matrix{…}, Rdest::CartesianIndices{…}, src::DiskArrays.SubDiskArray{…}, Rsrc::CartesianIndices{…})
    @ DiskArrays ~/.julia/packages/DiskArrays/bZBJE/src/array.jl:64
 [17] copyto!
    @ ~/.julia/packages/DiskArrays/bZBJE/src/array.jl:17 [inlined]
 [18] _print_matrix(io::IOContext{…}, A::DiskArrays.SubDiskArray{…}, lookups::Tuple{…})
    @ DimensionalData ~/.julia/packages/DimensionalData/vXseP/src/array/show.jl:289
 [19] print_matrix
    @ ~/.julia/packages/DimensionalData/vXseP/src/array/show.jl:240 [inlined]
 [20] print_array(io::IOContext{…}, mime::MIME{…}, A::DimArray{…})
    @ DimensionalData ~/.julia/packages/DimensionalData/vXseP/src/array/show.jl:208
 [21] show_after(io::IOContext{…}, mime::MIME{…}, A::DimArray{…})
    @ DimensionalData ~/.julia/packages/DimensionalData/vXseP/src/array/show.jl:76
 [22] show(io::IOContext{…}, mime::MIME{…}, A::DimArray{…})
    @ DimensionalData ~/.julia/packages/DimensionalData/vXseP/src/array/show.jl:17
 [23] (::REPL.var"#55#56"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:273
 [24] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:569
 [25] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:259
 [26] display
    @ ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:278 [inlined]
 [27] display(x::Any)
    @ Base.Multimedia ./multimedia.jl:340
 [28] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{…})
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:0
 [29] (::REPL.var"#57#58"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:284
 [30] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:569
 [31] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:282
 [32] (::REPL.var"#do_respond#80"{…})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:911
 [33] (::REPL.var"#98#108"{…})(::REPL.LineEdit.MIState, ::Any, ::Vararg{…})
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:1248
 [34] #invokelatest#2
    @ ./essentials.jl:892 [inlined]
 [35] invokelatest
    @ ./essentials.jl:889 [inlined]
 [36] (::VimBindings.var"#8#10"{REPL.var"#98#108"{…}, String})(s::REPL.LineEdit.MIState, p::REPL.LineEditREPL)
    @ VimBindings ~/.julia/dev/VimBindings/src/lineeditalt.jl:69
 [37] prompt!(term::REPL.Terminals.TTYTerminal, prompt::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ VimBindings ~/.julia/dev/VimBindings/src/lineeditalt.jl:34
 [38] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/LineEdit.jl:2651
 [39] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:1312
 [40] (::REPL.var"#62#68"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ~/.julia/juliaup/julia-1.10.2+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:386
Some type information was truncated. Use `show(err)` to see complete types.

julia> DiskArrays.eachchunk(E.data)
3×1×1 DiskArrays.GridChunks{3, Tuple{DiskArrays.RegularChunks, DiskArrays.RegularChunks, DiskArrays.RegularChunks}}:
[:, :, 1] =
 (1:2, 1:5, 1:4)
 (3:4, 1:5, 1:4)
 (5:6, 1:5, 1:4)

bjarthur commented 3 months ago

julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 12 × Apple M2 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
  JULIA_PROJECT = @.
  JULIA_EDITOR = vi

(jl_a3hv6U) pkg> st
Status `/private/var/folders/s5/8d629n5d7nsf37f60_91wzr40000gq/T/jl_a3hv6U/Project.toml`
  [0703355e] DimensionalData v0.26.3
⌅ [3c3547ce] DiskArrays v0.3.23
  [90b8fcef] YAXArrayBase v0.6.1
  [c21b50f5] YAXArrays v0.5.4
  [0a941bbe] Zarr v0.9.2
Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated`

(jl_a3hv6U) pkg> st --outdated
Status `/private/var/folders/s5/8d629n5d7nsf37f60_91wzr40000gq/T/jl_a3hv6U/Project.toml`
⌅ [3c3547ce] DiskArrays v0.3.23 (<v0.4.1): DiskArrayTools, Zarr

rafaqz commented 3 months ago

DD doesn't know anything about chunks...

If you can't reproduce a bug with a basic DimArray or other objects defined here, take the issue to the extending package first, in this case YAXArrays.jl.

The problem is in the internal object, not DD.

lazarusA commented 3 months ago

I guess the only way will be to bring the data into memory first:

E = DimArray(D.data[:,:,:], dims(D))

felixcremer commented 3 months ago

This is not a YAXArrays issue, because YAXArrays is only used to construct the data and to open the underlying Zarr data. The object that fails is a normal DimArray with a DiskArray as its data. It looks more like an issue with calling copyto! on a DiskArray, which happens in the print_matrix function of DD, but I am wondering why this is not a problem with NetCDF or GDAL files.

It might be that YAXArrays is constructing a slightly broken zarr file here.

Do you also run into this problem on actual data that you get from somewhere else or is it only a problem when you save the data with YAXArrays?

rafaqz commented 3 months ago

Either DiskArrays.jl is not working with some Base method that works on normal arrays, or YAX is constructing something broken. I had assumed it was YAX making a broken disk array.

felixcremer commented 2 months ago

I looked into it a bit more, and it seems to be an issue with DiskArrays not handling the copyto! case when one of the indices has length zero. This happens when the terminal is too small to show anything but the show method still tries to retrieve some data. I will open a DiskArrays.jl issue.
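
For comparison, here is a minimal sketch of the zero-length case using a plain in-memory `Array` instead of the Zarr-backed array from this issue. The variable names and the `1:0` range are assumptions chosen for illustration. It shows only that Base's `copyto!` accepts an empty slice without error; it does not reproduce the DiskArrays failure itself.

```julia
# Plain in-memory array with the same shape as the arrays in this issue.
A = rand(6, 5, 4)

# A slice whose second index is an empty range, standing in for the case
# where there is no room to display any columns (illustrative assumption).
v = view(A, :, 1:0, 1)

# Destination with the matching 6×0 shape.
dest = Matrix{Float64}(undef, 6, 0)

# For ordinary arrays this is a no-op: there is nothing to copy.
copyto!(dest, v)

println(size(v))     # size of the empty slice
println(size(dest))  # destination is unchanged and still empty
```

With a regular `Array` the copy completes and simply moves no data, which is the behavior the show method relies on.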