bnprks / BPCells

Scaling Single Cell Analysis to Millions of Cells
https://bnprks.github.io/BPCells
Other
150 stars 17 forks source link

respect `matrix_type` when coerced into matrix #77

Closed Yunuuuu closed 6 months ago

Yunuuuu commented 7 months ago
library(BPCells)
mock_matrix <- function(ngenes, ncells) {
    cell.means <- 2^stats::runif(ngenes, 2, 10)
    cell.disp <- 100 / cell.means + 0.5
    cell.data <- matrix(stats::rnbinom(ngenes * ncells,
        mu = cell.means,
        size = 1 / cell.disp
    ), ncol = ncells)
    rownames(cell.data) <- sprintf("Gene_%s", formatC(seq_len(ngenes),
        width = 4, flag = 0
    ))
    colnames(cell.data) <- sprintf("Cell_%s", formatC(seq_len(ncells),
        width = 3, flag = 0
    ))
    cell.data
}
tmpdir <- tempdir()
mat <- mock_matrix(30, 20)
path <- normalizePath(tempfile(tmpdir = tmpdir), mustWork = FALSE)
obj <- write_matrix_dir(mat = as(mat, "dgCMatrix"), dir = path)
#> Warning: Matrix compression performs poorly with non-integers.
#> • Consider calling convert_matrix_type if a compressed integer matrix is intended.
#> This message is displayed once every 8 hours.
storage.mode(as.matrix(obj))
#> Warning: Converting to a dense matrix may use excessive memory
#> This message is displayed once every 8 hours.
#> [1] "double"
obj <- convert_matrix_type(obj, "uint32_t")
storage.mode(as.matrix(obj))
#> [1] "integer"

Created on 2024-03-02 with reprex v2.0.2 ~
~
~
~

bnprks commented 7 months ago

Thanks @Yunuuuu, this looks like a nice quality-of-life improvement. One additional request would be if you can add a basic test in test-matrix_utils.R that just confirms that the typing works properly on demo 3x5 matrices? (The generate_dense_matrix helper function will probably be useful to you there). The cases I'd want covered are:

Then confirming everything is the same with expect_identical which is type-sensitive

Yunuuuu commented 7 months ago

Hi, @bnprks. I have added the unit test. Another opinion, do you think should we keep the matrix data type when coercing a dense matrix into the IterableMatrix by adding a new method instead of coercing into dgCMatrix firstly?

bnprks commented 6 months ago

Hi @Yunuuuu, sorry for the delay getting back to this. I've just looked through your changes, made a couple of final edits to tests, and merged back into main.

Your idea about adding a specific method to convert directly from a dense matrix to IterableMatrix is good. I'm not sure how commonly BPCells users will want to do that, but I'd be happy to review a separate pull request if you want to implement it.