Bioconductor / MatrixGenerics

S4 Generic Summary Statistic Functions that Operate on Matrix-Like Objects
https://bioconductor.org/packages/MatrixGenerics
12 stars 1 forks source link

Problem with the generic for 'rowLogSumExps()' #28

Closed rcastelo closed 2 years ago

rcastelo commented 2 years ago

Hi, I'm facing a problem that I think has to do with the entry in MatrixGenerics for rowLogSumExps(). See the following definition:

library(MatrixGenerics)
showMethods(rowLogSumExps, includeDefs=TRUE)
Function: rowLogSumExps (package MatrixGenerics)
lx="ANY"
function (lx, rows = NULL, cols = NULL, na.rm = FALSE, ..., useNames = NA) 
{
    MatrixGenerics:::.load_next_suggested_package_to_search(x)
    callGeneric()
}

lx="matrix_OR_array_OR_table_OR_numeric"
function (lx, rows = NULL, cols = NULL, na.rm = FALSE, ..., useNames = NA) 
{
    .local <- function (lx, rows = NULL, cols = NULL, na.rm = FALSE, 
        dim. = dim(lx), ..., useNames = NA) 
    {
        matrixStats::rowLogSumExps(lx, rows = rows, cols = cols, 
            na.rm = na.rm, dim. = dim., ..., useNames = useNames)
    }
    .local(lx, rows, cols, na.rm, ..., useNames = useNames)
}

The method for class ANY takes lx as argument, but the associated code tries to call a function with an non-existing object x, which should have been lx. The context in which I'm finding this is in a call to rowLogSumExps() that works interactively, but within a vignette I get this seemingly unrelated error (I mean unrelated to the code in the vignette):

Error in MatrixGenerics:::.load_next_suggested_package_to_search(x) : 
  Failed to find a rowMaxs() method for dgeMatrix objects.
Calls: <Anonymous> ... callGeneric -> eval -> eval -> rowMaxs -> rowMaxs -> <Anonymous>

Execution halted
const-ae commented 2 years ago

Hi Robert,

I have made a pull request that should fix the issue. I have simply replaced the call to the helper function make_default_method_def() with the correct method body for the rowLogSumExps() with lx = ANY function. @PeteHaitch, @hpages, could you take a look and merge the PR if you think that my fix is reasonable?

Best, Constantin

const-ae commented 2 years ago

@rcastelo, I am a bit confused about the "unrelated" error that you have posted:

Failed to find a rowMaxs() method for dgeMatrix objects.

To me, this sounds like some other problem as none of the xxxMatrixStats packages that I am aware of supports dgeMatrix objects. How did you come to the conclusion that these two problems are related?

rcastelo commented 2 years ago

Dear Constantin, thanks for your prompt response. After some more investigation, I found out that you're right that the error referring to rowMaxs() and dgeMatrix objects is unrelated to the original question. However, I think this could be a minimal reproducible example of the original problem:

library(Matrix)
library(sparseMatrixStats)
m <- Matrix(sample(1:100, size=9, replace=TRUE), nrow=3, ncol=3)
m
3 x 3 Matrix of class "dgeMatrix"
     [,1] [,2] [,3]
[1,]   29   50   97
[2,]   21   82   33
[3,]   43   82    6
rowLogSumExps(m)
Error in MatrixGenerics:::.load_next_suggested_package_to_search(x) : 
  object 'x' not found

Tracing the execution confirms this:

trace("rowLogSumExps", browser, browser, signature="dgeMatrix")
[1] "rowLogSumExps"
> rowLogSumExps(m)
Tracing rowLogSumExps(m) on entry 
Called from: eval(expr, p)
Browse[1]> 
debug: {
    MatrixGenerics:::.load_next_suggested_package_to_search(x)
    callGeneric()
}
Browse[2]> lx
3 x 3 Matrix of class "dgeMatrix"
     [,1] [,2] [,3]
[1,]   29   50   97
[2,]   21   82   33
[3,]   43   82    6
Browse[2]> x
Error: object 'x' not found
Browse[2]> 
debug: MatrixGenerics:::.load_next_suggested_package_to_search(x)
Browse[2]> 
Error in paste0("Failed to find a ", genericName, "() method ", "for ",  : 
  object 'x' not found
Tracing rowLogSumExps(m) on exit 
Browse[3]> 

One more question, I guess that, as you say, none of the xxxMatrixStats functions/methods support dgeMatrix objects because they are not sparse, but in such a case, shouldn't the code fall back to the corresponding non-sparse function/method? I'm saying this thinking about the case of rowMaxs().

const-ae commented 2 years ago

Hi Robert,

thanks for the reproducible example. Yes, you are right there is a bug in rowLogSumExps() and fix_28 addresses that. However, support for dgeMatrix objects is a different problem, which would require going through each function and adding a conversion of dgeMatrix to matrix. There was some talk about conversions with as.matrix() back when the package was developed (https://github.com/Bioconductor/MatrixGenerics/issues/4#issuecomment-537716552).

See the following reprex for an overview:

library(Matrix)
library(MatrixGenerics)

m <- matrix(rpois(n = 3 * 5, lambda = 0.2), nrow = 3, ncol = 5)
rowLogSumExps(m)
#> [1] 1.904832 1.609438 1.609438
# The following only works if you install the pull request
# devtools::install_github("const-ae/MatrixGenerics@fix_28")
sp_mat <- as(m, "dgCMatrix")
rowLogSumExps(sp_mat)
#> [1] 1.904832 1.609438 1.609438
# dgeMatrix is currently not supported. But we should consider adding support by converting it with as.matrix()
dge_mat <- as(m, "dgeMatrix")
rowLogSumExps(dge_mat)
#> Error in MatrixGenerics:::.load_next_suggested_package_to_search(lx): Failed to find a rowLogSumExps() method for dgeMatrix objects.
rowVars(dge_mat) # Failure is not specific to `rowLogSumExps`
#> Error in MatrixGenerics:::.load_next_suggested_package_to_search(x): Failed to find a rowVars() method for dgeMatrix objects.

Created on 2022-06-08 by the reprex package (v2.0.1)

rcastelo commented 2 years ago

I see, thanks for the pointers. I'll take care then to track whenever a sparse matrix becomes dense (e.g., when taking logs) and then call the appropriate API. Please close this issue when your PR fixing the bug before is merged.

PeteHaitch commented 2 years ago

Thanks for bringing this to our attention, @rcastelo.

@const-ae: https://github.com/Bioconductor/MatrixGenerics/pull/29 looks good to me, but by that logic I think similar fixes also need to be made to some other functions that don't take x as their first argument:

@hpages: Could you please confirm, because I don't properly understand MatrixGenerics:::.load_next_suggested_package_to_search().

We should also take the example that @rcastelo provided to add unit tests for this behaviour, although I'm not sure if these tests belong in sparseMatrixStats or MatrixGenerics?

PeteHaitch commented 2 years ago

Thanks @rcastelo and @const-ae. I've pushed @const-ae's fix to BioC. Should be available from BiocManager::install() as v1.9.1 in 1-2 days. I will also backport this to the current release branch.