Closed pedriniedoardo closed 1 week ago
Hi @pedriniedoardo, can you double-check what the class of ref@graphs$RNA_snn is?
hello @MikeDMorgan
> class(ref@graphs$RNA_snn)
[1] "Graph"
attr(,"package")
[1] "SeuratObject"
buildFromAdjacency expects a matrix as input, i.e. an adjacency matrix. Without delving into Seurat, it's not clear what this SeuratObject/Graph object representation is. Try casting to a dense matrix format to see if you get the same memory spike, then compare to casting to a sparse matrix format.
Thank you for the suggestion @MikeDMorgan !
When I look at the structure of the graph_obj extracted from Seurat (ref@graphs$RNA_snn), it definitely looks like a sparse matrix to me...
> str(graph_obj)
Formal class 'Graph' [package "SeuratObject"] with 7 slots
..@ assay.used: chr "RNA"
..@ i : int [1:8925812] 0 731 3101 3656 5920 7085 11523 12065 12548 16391 ...
..@ p : int [1:132737] 0 35 62 157 202 276 362 433 488 560 ...
..@ Dim : int [1:2] 132736 132736
..@ Dimnames :List of 2
.. ..$ : chr [1:132736] "108_AAACCCAAGATACCAA-1" "108_AAACCCAAGGCCCGTT-1" "108_AAACCCACACAGTACT-1" "108_AAACCCACAGGGACTA-1" ...
.. ..$ : chr [1:132736] "108_AAACCCAAGATACCAA-1" "108_AAACCCAAGGCCCGTT-1" "108_AAACCCACACAGTACT-1" "108_AAACCCACAGGGACTA-1" ...
..@ x : num [1:8925812] 1 0.0811 0.1111 0.0811 0.1429 ...
..@ factors : list()
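The slots i, p, Dim and x are the ones a compressed-column sparse matrix carries, so one way to confirm the suspicion is to ask R about class inheritance instead of reading the str() output. A minimal sketch, using a hypothetical ToyGraph class as a stand-in; it assumes (as the str() output suggests) that the Seurat Graph class extends dgCMatrix. With the real object the equivalent check would be is(graph_obj, "sparseMatrix").

```r
library(Matrix)
library(methods)

# Hypothetical stand-in for the Seurat Graph class: a dgCMatrix plus one extra slot.
setClass("ToyGraph", contains = "dgCMatrix", slots = c(assay.used = "character"))

# Small weighted adjacency matrix (made-up values) with self-loops on the diagonal.
m <- sparseMatrix(i = c(1, 2, 3, 1), j = c(1, 2, 3, 3),
                  x = c(1, 1, 1, 0.25), dims = c(3, 3))
toy <- new("ToyGraph", m, assay.used = "RNA")

# Inheritance checks: does the object already count as a sparse matrix?
inherits_sparse <- is(toy, "sparseMatrix")
inherits_dgc <- is(toy, "dgCMatrix")
print(c(sparseMatrix = inherits_sparse, dgCMatrix = inherits_dgc))
```

If both checks return TRUE for the real object, no dense intermediate should be needed to hand it to code that accepts a sparse matrix.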
I therefore changed the class of the object using as().
> test_sparse <- as(graph_obj, "TsparseMatrix")
> str(test_sparse)
Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:8925812] 0 731 3101 3656 5920 7085 11523 12065 12548 16391 ...
..@ j : int [1:8925812] 0 0 0 0 0 0 0 0 0 0 ...
..@ Dim : int [1:2] 132736 132736
..@ Dimnames:List of 2
.. ..$ : chr [1:132736] "108_AAACCCAAGATACCAA-1" "108_AAACCCAAGGCCCGTT-1" "108_AAACCCACACAGTACT-1" "108_AAACCCACAGGGACTA-1" ...
.. ..$ : chr [1:132736] "108_AAACCCAAGATACCAA-1" "108_AAACCCAAGGCCCGTT-1" "108_AAACCCACACAGTACT-1" "108_AAACCCACAGGGACTA-1" ...
..@ x : num [1:8925812] 1 0.0811 0.1111 0.0811 0.1429 ...
..@ factors : list()
Now, if I print the object, it is definitely a sparse, non-binary matrix with the expected dimensions.
> test_sparse
132736 x 132736 sparse Matrix of class "dgTMatrix"
[[ suppressing 59 column names ‘108_AAACCCAAGATACCAA-1’, ‘108_AAACCCAAGGCCCGTT-1’, ‘108_AAACCCACACAGTACT-1’ ... ]]
[[ suppressing 59 column names ‘108_AAACCCAAGATACCAA-1’, ‘108_AAACCCAAGGCCCGTT-1’, ‘108_AAACCCACACAGTACT-1’ ... ]]
108_AAACCCAAGATACCAA-1 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
108_AAACCCAAGGCCCGTT-1 . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
108_AAACCCACACAGTACT-1 . . 1 . . . . . . . . 0.08108108 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
108_AAACCCACAGGGACTA-1 . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
108_AAACCCACATAATCGC-1 . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
108_AAACCCACATGTTACG-1 . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . 0.3333333 . . . . . . . . . . . . . . . . . . . . . . . .
108_AAACCCAGTAACTAAG-1 . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
108_AAACCCAGTCCCTGTT-1 . . . . . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
108_AAACCCAAGATACCAA-1 . . . . ......
108_AAACCCAAGGCCCGTT-1 . . . . ......
108_AAACCCACACAGTACT-1 . . . . ......
108_AAACCCACAGGGACTA-1 . . . . ......
108_AAACCCACATAATCGC-1 . . . . ......
108_AAACCCACATGTTACG-1 . . . . ......
108_AAACCCAGTAACTAAG-1 . . . . ......
108_AAACCCAGTCCCTGTT-1 . . . . ......
..............................
........suppressing 132677 columns and 132720 rows in show(); maybe adjust options(max.print=, width=)
..............................
[[ suppressing 59 column names ‘108_AAACCCAAGATACCAA-1’, ‘108_AAACCCAAGGCCCGTT-1’, ‘108_AAACCCACACAGTACT-1’ ... ]]
95_TTTGTTGCATCTCATT-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......
95_TTTGTTGCATGAGTAA-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......
95_TTTGTTGGTCCCTCAT-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......
95_TTTGTTGGTGGATACG-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......
95_TTTGTTGGTTGTAAAG-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......
95_TTTGTTGTCACCATAG-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......
95_TTTGTTGTCAGGAACG-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......
95_TTTGTTGTCTACCAGA-1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......
The problem is that if I now pass this matrix, instead of the graph object, to buildFromAdjacency, I still observe the spike in RAM usage.
I therefore looked inside the implementation of buildFromAdjacency and tried to generate the Milo-compatible graph from scratch. I think the problem is at line 72 of the function implementation.
# use igraph if it's square
if(is.square){
    if(!is.binary){
        bin.x <- as(matrix(as.numeric(x > 0), nrow=nrow(x)), "dgCMatrix")
        nn.graph <- graph_from_adjacency_matrix(bin.x, mode="max",
                                                weighted=NULL,
                                                diag=FALSE)
    }
The matrix is square and not binary. Here is the sample sparse matrix.
If I run bin.x <- as(matrix(as.numeric(x > 0), nrow=nrow(x)), "dgCMatrix"), I get the RAM spike.
Do you have any suggestions?
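A rough back-of-the-envelope estimate shows why that one line is so expensive. The sketch below uses only the dimensions and non-zero count from the str() output; it ignores dimnames and object overhead, and the variable names are illustrative.

```r
n <- 132736        # rows and columns, from the Dim slot
nnz <- 8925812     # stored non-zero entries, from the length of the x slot

# A dense numeric matrix stores one 8-byte double per cell.
dense_bytes <- n^2 * 8

# A dgCMatrix stores one double (x) and one integer (i) per non-zero entry,
# plus one integer column pointer (p) per column, plus one.
sparse_bytes <- nnz * (8 + 4) + (n + 1) * 4

dense_gib <- dense_bytes / 1024^3
sparse_mib <- sparse_bytes / 1024^2
cat(sprintf("dense: %.0f GiB, sparse: %.0f MiB\n", dense_gib, sparse_mib))
```

A single dense copy comes to roughly 131 GiB against roughly 103 MiB for the sparse form, so any additional temporary copy made along the way could plausibly account for a peak of the size reported in this thread.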
Yes, the issue is that the coercion goes through a dense matrix because the object from Seurat can't be directly cast into a sparse matrix format. This is an issue with using weighted graphs: Milo assumes an unweighted graph, as you can see here because weighted=NULL.
Because Milo never uses the weights, I would create a sparse binary matrix yourself to avoid the memory spike, and pass that directly into buildFromAdjacency.
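As a sanity check on that suggestion, the sketch below compares the dense binarisation with a purely sparse one on a small made-up matrix. It is only an illustration under stated assumptions: the toy data and variable names are hypothetical, and "CsparseMatrix" is used as the coercion target in place of "dgCMatrix" to avoid deprecation messages in recent versions of the Matrix package.

```r
library(Matrix)

# Toy weighted adjacency matrix (hypothetical values), stored sparsely.
w <- sparseMatrix(i = c(1, 2, 3, 4, 1, 3), j = c(1, 2, 3, 4, 3, 1),
                  x = c(1, 1, 1, 1, 0.08, 0.08), dims = c(4, 4))

# Dense route, mirroring the quoted line: every one of the n^2 cells is materialised.
bin_dense <- as(matrix(as.numeric(w > 0), nrow = nrow(w)), "CsparseMatrix")

# Sparse route: drop explicitly stored zeros, then overwrite the stored weights with 1.
bin_sparse <- drop0(w)
bin_sparse@x <- rep(1, length(bin_sparse@x))

# Both routes should mark exactly the same cells.
same_pattern <- all(as.matrix(bin_dense) == as.matrix(bin_sparse))
print(same_pattern)
```

The drop0() step matters only if the matrix stores explicit zeros, or if negative weights are possible (x > 0 would discard those); for strictly positive weights, overwriting the x slot is enough.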
Thank you very much @MikeDMorgan, your suggestion has solved everything! Here is my solution:
# load the packages used below
library(Matrix)
library(Seurat)
library(miloR)
# load the graph object extracted from Seurat (ref@graphs$RNA_snn)
graph_obj <- readRDS("../../sharespace/graph_obj.rds")
str(graph_obj)
# make it a sparse matrix
graph_obj_fix <- as(graph_obj, "dgCMatrix")
# make it a binary sparse matrix by setting every stored weight to 1
graph_obj_fix2 <- graph_obj_fix
graph_obj_fix2@x <- rep(1, length(graph_obj_fix2@x))
# build the graph from the adjacency matrix, declaring it as binary
test <- buildFromAdjacency(graph_obj_fix2, k = 10, is.binary = TRUE)
# load the original reference dataset
ref <- readRDS(file = "/beegfs/scratch/ric.cosr/ric.cosr/ric.brunelli/BrunelliS_1810_scRNA_wt_mut_mice/data/R_out/RDS/harmonyHO.rds")
# build the milo object
sce3 <- as.SingleCellExperiment(ref)
milo3 <- Milo(sce3)
# add the graph to the object
miloR::graph(milo3) <- miloR::graph(test)
# build the neighbourhoods
milo3 <- makeNhoods(milo3, prop = 0.1, k = 10, d = 30, refined = TRUE)
plotNhoodSizeHist(milo3)
It is all working now! Thank you very much!
Describe the bug: I experience a huge amount of RAM usage when trying to add a custom graph object to the Milo object using the recommended call:
miloR::graph(milo) <- miloR::graph(buildFromAdjacency(sobj@graphs$RNA_snn, k=10))
I fear that if I use a bigger dataset (this one has ~130k cells), the processing might fail. Is there a way to avoid this huge intermediate RAM usage? The input is 100 MB and the output is 60 MB, but during processing the memory usage spikes to 256 GB.
I have tried running buildFromAdjacency using is.binary = T. The RAM usage is very moderate, but the problem is downstream: I cannot get good neighbourhood communities during the makeNhoods run. This is the comparison of the neighbourhood community sizes using the is.binary = T parameter. This is much better. Alternatively, do you have any suggestions for building communities of better size using the is.binary = T parameter? Here is the sample graph_obj I have used. Thank you very much for maintaining the tool!