Open sairus7 opened 1 year ago
Have you considered JLD or JLD2 which focus on serializing Julia types to HDF5?
How should the tuple be represented in the HDF5 file?
HDF5's type model has no tuple type.
But for this particular case (a homogeneous ntuple) I'd expect ntuples to be encoded as statically sized arrays.
However, it throws errors both with StaticArrays and with arrays of arrays:
using StaticArrays, HDF5
ntup = (a = 5, b = 6) # assumed definition (named tuple); missing from the snippet as posted
vec = [5,6]
svec = SVector(5, 6)
h5open("test.h5", "w") do h
write_dataset(h, "ntup", [ntup]) # works fine
write_dataset(h, "vec", [vec]) # errors
write_dataset(h, "svec", [svec]) # errors
end
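As an interim workaround that does not need any changes to HDF5.jl, a vector of homogeneous tuples can be viewed as an ordinary 2-D array and written like any other matrix. The sketch below only shows the conversion in both directions using Base's `reinterpret(reshape, ...)` (Julia 1.6 or later); the sample data and variable names are illustrative, and nobody in this thread proposed this layout.

```julia
# Sample data: three homogeneous 2-tuples.
tups = [(5, 6), (7, 8), (9, 10)]

# View the tuples as a 2×3 integer matrix (one tuple per column), then copy
# into a plain Matrix{Int}, which can be stored as a regular 2-D dataset.
mat = collect(reinterpret(reshape, Int, tups))

# After reading the matrix back, reinterpret each column as a tuple again.
back = collect(reinterpret(reshape, NTuple{2,Int}, mat))
```

The trade-off is that the file then holds a plain integer matrix, so the "one element is a tuple" meaning lives only in the reading code rather than in the HDF5 datatype.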
The clearest path for me would be the ntuple case. In the following example, I add a method so that HDF5.jl can figure out the correct HDF5 type that corresponds with it.
julia> import HDF5.hdf5_type_id
julia> hdf5_type_id(::Type{NTuple{N,T}}) where {N,T} = HDF5.API.h5t_array_create(hdf5_type_id(T), 1, [N])
hdf5_type_id (generic function with 17 methods)
julia> datatype(NTuple{10, Int})
HDF5.Datatype: H5T_ARRAY {
[10] H5T_STD_I64LE
}
julia> datatype((1,2,3))
HDF5.Datatype: H5T_ARRAY {
[3] H5T_STD_I64LE
}
We might be able to support SizedArrays from StaticArrays through a package extension, but we would need to figure out how to distinguish a user who wants to write an array of elements from one who wants to write a single element that is itself an array.
Thanks, now I can write ntuples. Is there a similar way to declare the target type when reading back from HDF5?
using HDF5
import HDF5.hdf5_type_id
hdf5_type_id(::Type{NTuple{N,T}}) where {N,T} = HDF5.API.h5t_array_create(hdf5_type_id(T), 1, [N])
tup = (5, 6)
h5open("test.h5", "w") do h
write_dataset(h, "tup", [tup, tup, tup])
end
# reads back vector of vectors
d = h5open("test.h5", "r") do h
read_dataset(h, "tup")
end
You can do this:
julia> out = Vector{typeof(tup)}(undef, 3)
3-element Vector{Tuple{Int64, Int64}}:
(7277816999743324160, 7205759405420183552)
(7205759405386629120, 8358680910027030528)
(8286623315989102592, 8358680910094139392)
julia> # reads back vector of tuples
d = h5open("test.h5", "r") do h
read_dataset(h["tup"], datatype(tup), out)
end
julia> out
3-element Vector{Tuple{Int64, Int64}}:
(5, 6)
(5, 6)
(5, 6)
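The read pattern above can be wrapped in a small helper so the caller does not have to preallocate the buffer by hand. This is only a sketch: `read_ntuples` is a hypothetical name, not part of HDF5.jl, and it assumes the `hdf5_type_id` method for `NTuple` from earlier in this thread has been defined and that `size` on the dataset reports the number of stored elements.

```julia
using HDF5
import HDF5: hdf5_type_id

# Mapping from this thread: a homogeneous NTuple becomes a 1-D H5T_ARRAY type.
hdf5_type_id(::Type{NTuple{N,T}}) where {N,T} =
    HDF5.API.h5t_array_create(hdf5_type_id(T), 1, [N])

# Hypothetical helper: read dataset `name` from `path` as an array of tuples
# of type T, sizing the output buffer from the dataset itself.
function read_ntuples(path, name, ::Type{T}) where {T<:NTuple}
    h5open(path, "r") do h
        dset = h[name]
        out = Array{T}(undef, size(dset)...)
        # Same three-argument call as in the reply above: dataset, memory type, buffer.
        read_dataset(dset, datatype(T), out)
        return out
    end
end

# Usage: write three tuples to a temporary file, then read them back.
tup = (5, 6)
path = tempname() * ".h5"
h5open(path, "w") do h
    write_dataset(h, "tup", [tup, tup, tup])
end
d = read_ntuples(path, "tup", typeof(tup))
```

Passing the tuple type explicitly keeps the choice with the caller, since a default `read_dataset(h, "tup")` returns a vector of vectors as noted earlier.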
This example works with named tuples but fails with ordinary tuples.
Shows error: