bnprks / BPCells

Scaling Single Cell Analysis to Millions of Cells
https://bnprks.github.io/BPCells

as(iterablematrix, 'IterableMatrix') #23

Closed brgew closed 1 year ago

brgew commented 1 year ago

Hi Ben,

Me again. I am seeing BPCells from a different perspective suddenly. I am wondering if it makes sense to wrap our dgCMatrix instances in IterableMatrix wrappers and use the wrapped matrices in place of the dgCMatrices. I suppose it depends on having all of the expected methods. I will run some tests.

So I wondered what happens in the case that I try to wrap an IterableMatrix in an IterableMatrix using R commands like

library(Matrix)
library(BPCells)

v1 <- c(1,2,3,4,5,6)
v2 <- c(7,8,9,10,11,12)

m1 <- matrix(v1, nrow=2)
m2 <- matrix(v2, nrow=2)

sm1 <- as(m1, 'dgCMatrix')
sm2 <- as(m2, 'dgCMatrix')

bpm1 <- as(sm1, 'IterableMatrix')
bpm2 <- as(bpm1, 'IterableMatrix')

> str(bpm1)
Formal class 'Iterable_dgCMatrix_wrapper' [package "BPCells"] with 4 slots
  ..@ mat      :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. ..@ i       : int [1:6] 0 1 0 1 0 1
  .. .. ..@ p       : int [1:4] 0 2 4 6
  .. .. ..@ Dim     : int [1:2] 2 3
  .. .. ..@ Dimnames:List of 2
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. .. ..@ x       : num [1:6] 1 2 3 4 5 6
  .. .. ..@ factors : list()
  ..@ dim      : int [1:2] 2 3
  ..@ transpose: logi FALSE
  ..@ dimnames :List of 2
  .. ..$ : NULL
  .. ..$ : NULL
> str(bpm2)
Formal class 'IterableMatrix' [package "BPCells"] with 3 slots
  ..@ dim      : int [1:2] 2 3
  ..@ transpose: logi FALSE
  ..@ dimnames :List of 2
  .. ..$ : NULL
  .. ..$ : NULL

It appears that bpm2 lost the wrapped matrix (the mat slot that bpm1 carries), whereas I expected it to be identical to bpm1.

The Matrix package behaves more in line with my expectation, for example:

v <- c(1,2,3,4,5,6)
m <- matrix(v, nrow=2)
sm1 <- as(m, 'dgCMatrix')
sm2 <- as(sm1, 'dgCMatrix')

str(sm1)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:6] 0 1 0 1 0 1
  ..@ p       : int [1:4] 0 2 4 6
  ..@ Dim     : int [1:2] 2 3
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:6] 1 2 3 4 5 6
  ..@ factors : list()

str(sm2)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:6] 0 1 0 1 0 1
  ..@ p       : int [1:4] 0 2 4 6
  ..@ Dim     : int [1:2] 2 3
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:6] 1 2 3 4 5 6
  ..@ factors : list()

Of course, I can test whether or not the input matrix is already an 'IterableMatrix' too.
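Such a check could look roughly like the sketch below. The helper name wrap_once and the classes GuardBase and GuardWrapper are hypothetical stand-ins, not part of BPCells, and are used only so the example runs with the methods package alone.

```r
library(methods)

# Hypothetical helper (not a BPCells function): coerce only when the
# object is not already an instance of the target class or a subclass.
wrap_once <- function(x, target) {
  if (is(x, target)) x else as(x, target)
}

# Toy stand-ins for a base class and a wrapper subclass with an extra slot.
setClass("GuardBase", representation(dim = "integer"))
setClass("GuardWrapper", contains = "GuardBase",
         representation(payload = "numeric"))

# Toy coercion so that a plain numeric vector can be converted to the base class.
setAs("numeric", "GuardBase",
      function(from) new("GuardBase", dim = c(length(from), 1L)))

w <- new("GuardWrapper", dim = c(2L, 3L), payload = c(1, 2, 3, 4, 5, 6))

kept    <- wrap_once(w, "GuardBase")          # already a GuardBase: returned untouched
wrapped <- wrap_once(c(1, 2, 3), "GuardBase") # not a GuardBase: coerced via setAs
```

With BPCells loaded, the same guard would presumably be called as wrap_once(x, "IterableMatrix"), though that is not tested here.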

Ever grateful, Brent

bnprks commented 1 year ago

I believe this is the intended default behavior for the as function in R -- when given a subclass of IterableMatrix it will strip the object down to just the fields that are present in the IterableMatrix base class, hence losing information.

I believe if you don't want this behavior you can instead do as(bpm1, "IterableMatrix", strict=FALSE), which returns any object whose class is a subclass of IterableMatrix unmodified.
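A minimal sketch of the difference, using toy S4 classes rather than BPCells itself. ToyBase and ToyWrapper are hypothetical stand-ins for IterableMatrix and its dgCMatrix wrapper, defined here only so the example runs with the methods package that ships with R:

```r
library(methods)

# Hypothetical toy classes: a base class and a subclass that adds one slot.
setClass("ToyBase", representation(dim = "integer"))
setClass("ToyWrapper", contains = "ToyBase",
         representation(payload = "numeric"))

w <- new("ToyWrapper", dim = c(2L, 3L), payload = c(1, 2, 3, 4, 5, 6))

# Default (strict = TRUE): the result is exactly a ToyBase,
# so the extra payload slot is dropped.
strict_obj <- as(w, "ToyBase")

# strict = FALSE: a simple extension of the target class is returned as-is.
relaxed_obj <- as(w, "ToyBase", strict = FALSE)

is(strict_obj, "ToyWrapper")    # FALSE
is(relaxed_obj, "ToyWrapper")   # TRUE
identical(relaxed_obj, w)       # TRUE
```

This mirrors the str() output above, where the strictly coerced bpm2 keeps only the dim, transpose, and dimnames slots.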

brgew commented 1 year ago

Hi Ben,

Thank you again! I apologize for bothering you...

Ever grateful, Brent