I'm trying to convert a H5SparseMatrixSeed-containing DelayedMatrix to a dgCMatrix. This calls:
selectMethod(coerce, c("DelayedMatrix", "dgCMatrix") )
## Method Definition:
##
## function (from, to = "dgCMatrix", strict = TRUE)
## as(as(from, "SparseArraySeed"), "dgCMatrix")
## <bytecode: 0x55555e41dde0>
## <environment: namespace:DelayedArray>
##
## Signatures:
## from to
## target "DelayedMatrix" "dgCMatrix"
## defined "Array" "dgCMatrix"
Well, okay. But then this calls:
selectMethod(coerce, c("DelayedMatrix", "SparseArraySeed") )
## Method Definition:
##
## function (from, to = "SparseArraySeed", strict = TRUE)
## .BLOCK_dense2sparse(from)
## <bytecode: 0x55555e4180f0>
## <environment: namespace:DelayedArray>
##
## Signatures:
## from to
## target "DelayedMatrix" "SparseArraySeed"
## defined "DelayedArray" "SparseArraySeed"
Directly coercing H5SparseMatrixSeeds to SparseArraySeeds isn't much better:
selectMethod(coerce, c("H5SparseMatrixSeed", "SparseArraySeed") )
## Method Definition:
##
## function (from, to = "SparseArraySeed", strict = TRUE)
## dense2sparse(from)
## <bytecode: 0x55555e420978>
## <environment: namespace:DelayedArray>
##
## Signatures:
## from to
## target "H5SparseMatrixSeed" "SparseArraySeed"
## defined "ANY" "SparseArraySeed"```
All in all, this makes it painfully slow to load a H5SparseMatrix into memory after applying some operations on it (e.g., like slapping on dimnames, which is commonly what we get out of SummarizedExperiment::assay()).
The solution is to simply define the missing sparse-to-sparse methods - something like the below might work:
setMethod("coerce", c("H5SparseMatrixSeed", "SparseArraySeed"), function(from) {
extract_sparse_array(from, list(NULL, NULL))
})
setMethod("coerce", c("DelayedMatrix", "SparseArraySeed"), function(from) {
if (is_sparse(from)) {
extract_sparse_array(from, list(NULL, NULL))
} else {
.BLOCK_dense2sparse(from)
}
})
# Ideally we should be able to do something like this, to bypass the intermediate
# SparseArraySeed object and directly return a dgCMatrix. This would enable
# very efficient conversions from CSC_H5SparseMatrixSeed representations,
# while also adjusting for the common occurrence of assay() adding dimnames.
setMethod("coerce", c("DelayedMatrix", "dgCMatrix"), function(from) {
if (isPristine(from, ignore.dimnames=TRUE)) {
output <- as(from@seed, "dgCMatrix")
dimnames(output) <- dimnames(from)
output
} else {
as(as(from, "SparseArraySeed"), "dgCMatrix")
}
})
Session info
```
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /usr/local/lib/R/lib/libRblas.so
LAPACK: /usr/local/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] HDF5Array_1.21.0 rhdf5_2.37.0 DelayedArray_0.19.1
[4] IRanges_2.27.0 S4Vectors_0.31.0 MatrixGenerics_1.5.1
[7] matrixStats_0.59.0 BiocGenerics_0.39.1 Matrix_1.3-3
loaded via a namespace (and not attached):
[1] compiler_4.1.0 tools_4.1.0 rhdf5filters_1.5.0 grid_4.1.0
[5] lattice_0.20-44 Rhdf5lib_1.15.2
```
I'm trying to convert a
H5SparseMatrixSeed
-containingDelayedMatrix
to adgCMatrix
. This calls:Well, okay. But then this calls:
Directly coercing
H5SparseMatrixSeed
s toSparseArraySeed
s isn't much better:All in all, this makes it painfully slow to load a
H5SparseMatrix
into memory after applying some operations on it (e.g., like slapping on dimnames, which is commonly what we get out ofSummarizedExperiment::assay()
).The solution is to simply define the missing sparse-to-sparse methods - something like the below might work:
Session info
``` R version 4.1.0 (2021-05-18) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 18.04.5 LTS Matrix products: default BLAS: /usr/local/lib/R/lib/libRblas.so LAPACK: /usr/local/lib/R/lib/libRlapack.so locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats4 stats graphics grDevices utils datasets methods [8] base other attached packages: [1] HDF5Array_1.21.0 rhdf5_2.37.0 DelayedArray_0.19.1 [4] IRanges_2.27.0 S4Vectors_0.31.0 MatrixGenerics_1.5.1 [7] matrixStats_0.59.0 BiocGenerics_0.39.1 Matrix_1.3-3 loaded via a namespace (and not attached): [1] compiler_4.1.0 tools_4.1.0 rhdf5filters_1.5.0 grid_4.1.0 [5] lattice_0.20-44 Rhdf5lib_1.15.2 ```