Closed by LTLA 1 year ago
Sounds good. Would it be ok if the user specified the layout ("CSC" or "CSR") instead of the class to return ("CSC_H5SparseMatrixSeed" or "CSR_H5SparseMatrixSeed")?
Done in HDF5Array 1.29.3 (see 41fe4b17c7822a1d29f0bf03d89c79aabce94bcc).
Does that work?
Thanks, that looks great. Need to update my local R libs to check it out, but it's pretty much what I was thinking anyway.
Looking at the H5SparseMatrixSeed source code, it seems like it would be straightforward to allow users to specify dim and ans_class, rather than extracting them from the file. This would allow the H5SparseMatrixSeed constructor to work with any compressed sparse matrix stored in a HDF5 group that has data, indices and indptr, provided that the user can specify the dimensions and the row/column layout. (Of course, if these are not supplied, then they can be automatically inferred.)
This request is motivated by the desire to avoid the H5AD formats, which are very confusing to explain in an R context, e.g., a matrix labelled as a csr_matrix inside the file is instead a CSC matrix in R. If my application already knows that a matrix is CSC/R, then I can just pass that knowledge directly to the constructor, rather than doing a mental double transposition to trick H5SparseMatrixSeed into doing the right thing. (One transposition for data writers to label the CSC matrix as csr, and then another transposition for data readers - not necessarily using HDF5Array - to undo the transposition to load csr as CSC.)
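To make the double transposition concrete, here is a minimal, stdlib-only Python sketch. It is not HDF5Array or anndata code: `decode_compressed` and `transpose` are hypothetical helpers written for illustration, and the three lists stand in for the data, indices and indptr datasets of an HDF5 group. It shows that the dimensions plus the layout are enough to interpret the triplet, and that the same triplet read as CSR with dimensions (2, 3) is the transpose of what you get when reading it as CSC with dimensions (3, 2).

```python
def decode_compressed(data, indices, indptr, dim, layout):
    """Expand a compressed sparse triplet into a dense list of rows.

    Hypothetical helper for illustration only. `layout` is "CSR" (indptr
    walks the rows, indices hold column positions) or "CSC" (indptr walks
    the columns, indices hold row positions). `dim` is (nrow, ncol).
    """
    if layout not in ("CSR", "CSC"):
        raise ValueError("layout must be 'CSR' or 'CSC'")
    nrow, ncol = dim
    # The "major" axis is the one that indptr is compressed along.
    n_major = nrow if layout == "CSR" else ncol
    if len(indptr) != n_major + 1:
        raise ValueError("indptr length does not match dim and layout")
    dense = [[0] * ncol for _ in range(nrow)]
    for major in range(n_major):
        for k in range(indptr[major], indptr[major + 1]):
            minor = indices[k]
            if layout == "CSR":
                dense[major][minor] = data[k]
            else:
                dense[minor][major] = data[k]
    return dense


def transpose(rows):
    """Transpose a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*rows)]


# Made-up group contents: the 2 x 3 matrix [[1, 0, 2], [0, 3, 0]] in CSR.
data = [1, 2, 3]
indices = [0, 2, 1]
indptr = [0, 2, 3]

# Same three arrays, two interpretations supplied by the caller.
as_csr = decode_compressed(data, indices, indptr, dim=(2, 3), layout="CSR")
as_csc = decode_compressed(data, indices, indptr, dim=(3, 2), layout="CSC")

# Relabelling the layout (and swapping the dimensions) is a transposition.
assert as_csc == transpose(as_csr)
```

Because the arrays themselves never change, a reader that is told the dimensions and layout directly does not need the label stored in the file at all, which is the point of the request above.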