Closed LTLA closed 2 months ago
Right, you need to use nzwhich() instead of nzcoo():
z[nzwhich(y)] <- nzvals(y)
identical(z, t(basic))
# [1] TRUE
nzcoo() and nzdata() are slot accessors for COO_SparseArray objects.
nzwhich() and nzvals() are generic functions whose behavior is independent of the internal representation of the sparse array, e.g. they work on SVT_SparseArray, dgCMatrix, and lgCMatrix objects.
Only when a COO_SparseArray object x is normalized (i.e. its nzcoo slot is strictly ordered and its nzdata slot contains no zeros) will nzwhich(x, arr.ind=TRUE) and nzvals(x) return the same things as nzcoo(x) and nzdata(x).
H.
Or use nzdata() instead of nzvals():
z[nzcoo(y)] <- nzdata(y)
identical(z, t(basic))
# [1] TRUE
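To illustrate why the two accessor pairs must not be mixed, here is a minimal base-R sketch. It does not call SparseArray at all: the names dims, coo, vals, idx and nz are hypothetical stand-ins, with coo/vals playing the role of an unnormalized nzcoo()/nzdata() pair and idx/nz playing the role of nzwhich()/nzvals(). It assumes, as the discussion above suggests, that the failing assignment combined coordinates in storage order with values in canonical order.

```r
# Hypothetical COO triplets stored in arbitrary (non-normalized) order.
dims <- c(3L, 4L)
coo  <- rbind(c(3L, 2L), c(1L, 1L), c(2L, 4L))  # stand-in for nzcoo()
vals <- c(30, 10, 20)                           # stand-in for nzdata()

# Reference dense array built from the matching coordinate/value pair.
ref <- array(0, dims)
ref[coo] <- vals

# Canonical (column-major) view of the same nonzeros.
idx <- which(ref != 0)  # stand-in for nzwhich(): linear indices, sorted
nz  <- ref[idx]         # stand-in for nzvals(): values in that same order

# Consistent pairing: indices and values share one ordering.
ok <- array(0, dims)
ok[idx] <- nz
identical(ok, ref)
# [1] TRUE

# Mixed pairing: coordinates in storage order, values in canonical order.
mixed <- array(0, dims)
mixed[coo] <- nz
identical(mixed, ref)
# [1] FALSE
```

Either consistent pairing reproduces the array; the mixed one only works by coincidence, namely when the stored coordinates already happen to be in canonical order.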
Ah, okay. I'll switch whichNonZero() to just use nzwhich() and nzvals() under the hood, then.
That said, some of my tests still fail:
library(SparseArray)
library(DelayedArray)
stuff <- Matrix::rsparsematrix(1000, 1000, density=0.01)
wrapped <- DelayedArray(stuff)
nzwhich(wrapped)
## Error in extract_sparse_array(x@seed, index) : NOT IMPLEMENTED YET!
## In addition: Warning message:
## In which(is_nonzero | is.na(is_nonzero), arr.ind = arr.ind, useNames = FALSE) :
## 'useNames' is ignored when 'x' is a DelayedArray object or derivative
I would have expected nzwhich() to work on any sparse array-ish object. Interestingly, it behaves as expected for a dense DelayedArray, albeit with a noisy warning:
dwrapped <- DelayedArray(as.matrix(stuff))
str(nzwhich(dwrapped))
## int [1:10000] 63 133 338 452 563 579 731 844 912 1398 ...
## Warning message:
## In which(is_nonzero | is.na(is_nonzero), arr.ind = arr.ind, useNames = FALSE) :
## 'useNames' is ignored when 'x' is a DelayedArray object or derivative
Hmm, so it seems that two components are missing to make nzwhich() work on a sparse DelayedArray object: (1) an extract_sparse_array() method for DelayedNaryIsoOp objects, and (2) a which() method for SVT_SparseArray objects. Working on it now.
In the end I went for a dedicated nzwhich() method for DelayedArray objects. It should be slightly more efficient than relying on the default nzwhich() method:
library(DelayedArray)
set.seed(2009)
stuff <- Matrix::rsparsematrix(1000, 1000, density=0.01)
wrapped <- DelayedArray(stuff)
str(nzwhich(wrapped))
# int [1:10000] 76 124 157 250 554 812 985 1123 1298 1320 ...
identical(nzwhich(wrapped), nzwhich(stuff))
# [1] TRUE
This is in DelayedArray 0.31.7 (see https://github.com/Bioconductor/DelayedArray/commit/cf7427b72cd6df7715b0d962927869a6da2fee15).
The 2 methods I mentioned above (extract_sparse_array(<DelayedNaryIsoOp>) and which(<SVT_SparseArray>)) are still missing but that will have to wait for now.
Looks like nzvals() does some work in .normalize_COO_SparseArray that may not match up to nzcoo()'s output.
Session information
```
R version 4.4.0 Patched (2024-05-20 r86569)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/R-4-4-branch/lib/libRblas.so
LAPACK: /home/luna/Software/R/R-4-4-branch/lib/libRlapack.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] SparseArray_1.5.16    S4Arrays_1.5.3        IRanges_2.39.1
[4] abind_1.4-5           S4Vectors_0.43.1      MatrixGenerics_1.17.0
[7] matrixStats_1.3.0     BiocGenerics_0.51.0   Matrix_1.7-0

loaded via a namespace (and not attached):
[1] zlibbioc_1.51.1 compiler_4.4.0  tools_4.4.0     XVector_0.45.0
[5] crayon_1.5.3    grid_4.4.0      lattice_0.22-6
```