Closed LTLA closed 1 month ago
None of this is really expected to work properly at the moment. There's a long road before we can get there (see https://github.com/Bioconductor/DelayedArray/blob/devel/TODO). I'll coordinate with you when it's time to make the switch from SparseArraySeed to SVT_SparseArray. In the meantime, please ignore extract_sparse_array()
and read_block_as_sparse()
.
Oh. I started using it in BioC-devel a few weeks ago. Seemed okay.
More specifically, beachmat now uses extract_sparse_array()
to extract SVT_SparseMatrix
objects for abstract R matrices that the new tatami interface doesn't otherwise know how to handle. SingleR - the only package currently updated to use the new interface - seems pretty happy with it, so I assumed we were ready to go.
Guess I'll just leave it in there for now. Looks mature enough that it'll be in good shape for the next BioC release.
The migration from SparseArraySeed to SVT_SparseArray/COO_SparseArray objects happened in S4Arrays 1.5.3 and DelayedArray 0.31.5.
Currently,
read_sparse_block()
returns the oldSparseArraySeed
, so I figured I would try updating some of my packages to the new SVT world. It seemsSparseArray::extract_sparse_array()
orSparseArray::read_block_as_sparse()
are the replacements, but I ran into a few hiccups as I tried to switch alabaster.matrix to use them.1.
read_sparse_block()
works with not justDelayedArray
s, but also the underlying delayed "layers". This is no longer the case with eitherextract_sparse_array()
orread_block_as_sparse()
.This is an issue as I sometimes have to "unwrap" delayed objects if I know a more efficient operation can be applied in the presence of certain delayed operations, avoiding the need to use general-purpose block processing.
The current workaround is to just re-wrap everything in a
DelayedArray
before callingextract_sparse_array()
. But it would be nice to have everything just work off the bat, which would save me a little bit of boilerplate.2. The new functions discard structural elements with a value of zero.
This causes some back-compatibility problems with old code that previously expected zeros to be counted, e.g., I have a piece of code that uses
blockApply()
(which callsread_sparse_block()
IIUC) and thus counts the zeros, causing mismatching numbers of zeros when interacting with new code that usesextract_sparse_array()
and omits those zeros.More conceptually; it is not impossible to imagine an application that encodes some information in the explicit presence of zeros, e.g., a sparse matrix of log-fold changes for SNPs where an explicit structural zero value indicates no difference and an absent entry indicates that the SNP was not measured in a particular sample. So perhaps it would be better to hold onto to zeros.
FWIW the various sparse Matrix classes preserve zeros, so being consistent with them would be worthwhile if nothing else.
Session information
``` R version 4.3.0 Patched (2023-05-04 r84398) Platform: aarch64-apple-darwin22.3.0 (64-bit) Running under: macOS Ventura 13.2.1 Matrix products: default BLAS: /Users/luna/Software/R/R-4-3-branch/lib/libRblas.dylib LAPACK: /Users/luna/Software/R/R-4-3-branch/lib/libRlapack.dylib; LAPACK version 3.11.0 locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 time zone: America/Los_Angeles tzcode source: internal attached base packages: [1] stats4 stats graphics grDevices utils datasets methods [8] base other attached packages: [1] HDF5Array_1.29.3 rhdf5_2.45.0 DelayedArray_0.27.7 [4] SparseArray_1.1.10 S4Arrays_1.1.4 IRanges_2.35.2 [7] S4Vectors_0.39.1 MatrixGenerics_1.13.0 matrixStats_1.0.0 [10] BiocGenerics_0.47.0 Matrix_1.5-4.1 alabaster.matrix_1.1.3 [13] testthat_3.1.9 loaded via a namespace (and not attached): [1] jsonlite_1.8.7 compiler_4.3.0 crayon_1.5.2 [4] brio_1.1.3 Rcpp_1.0.10 rhdf5filters_1.13.3 [7] lattice_0.21-8 R6_2.5.1 XVector_0.41.1 [10] alabaster.base_1.1.0 desc_1.4.2 rprojroot_2.0.3 [13] pillar_1.9.0 rlang_1.1.1 utf8_1.2.3 [16] pkgload_1.3.2 cli_3.6.1 withr_2.5.0 [19] magrittr_2.0.3 Rhdf5lib_1.23.0 zlibbioc_1.47.0 [22] grid_4.3.0 alabaster.schemas_1.1.2 lifecycle_1.0.3 [25] vctrs_0.6.3 waldo_0.5.1 glue_1.6.2 [28] fansi_1.0.4 tools_4.3.0 ```