Bioconductor / DelayedArray

A unified framework for working transparently with on-disk and in-memory array-like datasets
https://bioconductor.org/packages/DelayedArray

Subsetting with "missing" arguments #65

Closed allisonvuong closed 4 years ago

allisonvuong commented 4 years ago

Hi,

Matrix subsetting in base R supports "missing" subsetting arguments, but it seems that DelayedMatrix does not.

For example,

library(DelayedArray)
m <- matrix(runif(300000), nrow=10000, ncol=30)
M <- DelayedArray(m)

foo <- function(object, i, j) {
    object[i, j]
}

foo(m, j=!logical(ncol(m)))
foo(M, j=!logical(ncol(M)))

Is it possible to support this? I think functions like limma::subsetListOfArrays rely on this kind of subsetting when subsetting EList objects.

Best, Allison

LTLA commented 4 years ago

This seems to be an issue caused by S4 dispatch. You will see the same effect with sparse matrices:

library(Matrix)
n <- rsparsematrix(nrow=10000, ncol=30, density=0.1)
foo(n, j=!logical(ncol(n)))
## Error in object[i, j] : argument "i" is missing, with no default

My guess is that the S4 dispatch mechanism needs to know the class of i to choose an appropriate method, but i was never supplied, which leads to the observed error. S4 dispatch does support missing arguments, but it seems that only immediate missingness is supported; missingness is not propagated through symbols passed down from a calling function. For example:

library(SingleCellExperiment)
example(SingleCellExperiment, echo=FALSE)

bar <- function(x, i) reducedDim(x, i)
reducedDim(sce) # no problems
bar(sce)
## Error in reducedDim(x, i) : argument "i" is missing, with no default
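
The general distinction between asking whether an argument is missing and actually evaluating it can be seen without any S4 machinery. Here is a minimal base-R sketch; the function names are made up for illustration, and it only mimics the guess above (that something in the dispatch path forces i), it does not show what the S4 internals actually do:

```r
# Hypothetical helper names; base R only, no S4 involved.
sees_missing <- function(i) missing(i)  # inspects missingness, never forces i
needs_class  <- function(i) class(i)    # has to evaluate i to get its class

# Both wrappers pass their own (possibly missing) argument down by symbol.
outer_ok  <- function(i) sees_missing(i)
outer_err <- function(i) needs_class(i)

outer_ok()
## [1] TRUE

# Forcing the passed-down symbol raises the familiar "missing" error;
# capture the message rather than stopping.
res <- tryCatch(outer_err(), error = function(e) conditionMessage(e))
res
```

So missing() can look through the passed-down symbol, but anything that evaluates the argument fails.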

The immediate fix in your example is to handle the missingness in foo:

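foo <- function(object, i, j) {
    # Branch explicitly; ordinary matrices match indices by position,
    # so passing only j by name via do.call() would act like object[j].
    if (missing(i) && missing(j)) return(object[, ])
    if (missing(i)) return(object[, j])
    if (missing(j)) return(object[i, ])
    object[i, j]
}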
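
One thing worth checking when wrapping `[` is how the index arguments are matched. For ordinary matrices, ?Extract notes that argument names are ignored and only positional matching is used, so a lone named j is treated as a single vector index. A minimal base-R sketch, using a small made-up matrix:

```r
# Small example matrix (made up for illustration); columns are 1:2, 3:4, 5:6.
m <- matrix(1:6, nrow = 2)

# Column subsetting with an immediately missing i.
by_position <- m[, 2]
by_position
## [1] 3 4

# A lone named j is NOT column subsetting on an ordinary matrix:
# the name is ignored and this is the same as m[2].
by_name <- m[j = 2]
by_name
## [1] 2

# The same happens when the call is built with do.call().
via_do_call <- do.call("[", list(x = m, j = 2))
identical(via_do_call, by_name)
## [1] TRUE
```

This only concerns ordinary matrices; it says nothing about how a particular S4 method for `[` matches its arguments.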

Not that it really matters because limma can't handle non-ordinary matrices anyway, so you might as well just coerce it to an ordinary matrix and be done with it.

Session information

```
R version 4.0.0 Patched (2020-05-01 r78341)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/R-4-0-branch-dev/lib/libRblas.so
LAPACK: /home/luna/Software/R/R-4-0-branch-dev/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] DelayedArray_0.15.6 IRanges_2.23.10     S4Vectors_0.27.12
[4] BiocGenerics_0.35.4 matrixStats_0.56.0  Matrix_1.2-18
[7] edgeR_3.31.4        limma_3.45.7

loaded via a namespace (and not attached):
[1] compiler_4.0.0  tools_4.0.0     Rcpp_1.0.4.6    grid_4.0.0
[5] locfit_1.5-9.4  lattice_0.20-41
```

allisonvuong commented 4 years ago

Hi,

Okay, thank you!

Best, Allison

hpages commented 4 years ago

Thanks @LTLA for your help with this. Closing now.