GreenleafLab / ArchR

ArchR : Analysis of Regulatory Chromatin in R (www.ArchRProject.com)
MIT License

imputation #513

Closed rojinsafavi closed 3 years ago

rojinsafavi commented 3 years ago

Is it possible to retrieve the imputed gene score matrix?

rcorces commented 3 years ago

I've never tried to do this, but I think this is what the imputeMatrix() function is for: https://www.archrproject.com/reference/imputeMatrix.html

@jgranja24 - is that correct?

jgranja24 commented 3 years ago

Hi @rojinsafavi, sorry for the delay. To get the imputed GeneScoreMatrix, you first need to export the GeneScoreMatrix and then impute it, as @rcorces mentioned. Be warned that this converts the matrix to dense format, so the result will be extremely large. First use getMatrixFromProject, then apply imputeMatrix to the assay matrix inside the returned object. It'd be something like this, but I haven't tried it:

seGS <- getMatrixFromProject(ArchRProj)
matGS <- imputeMatrix(assay(seGS), getImputeWeights(ArchRProj))

brianpenghe commented 2 years ago

I tried imputeMatrix. The returned matrix is of class dgeMatrix (dense) and can't be converted to a sparse matrix. I posted the error on Stack Overflow. Is there anything wrong with the output of imputeMatrix?

rcorces commented 2 years ago

@brianpenghe a dgCMatrix is a sparse matrix https://www.rdocumentation.org/packages/Matrix/versions/1.3-4/topics/dgCMatrix-class

brianpenghe commented 2 years ago

https://www.rdocumentation.org/packages/Matrix/versions/1.3-4/topics/dgCMatrix-class

Yes, but imputeMatrix returned a dgeMatrix, not a dgCMatrix. I tried multiple ways to convert the dgeMatrix to a dgCMatrix, but in vain.

Any ideas?

rcorces commented 2 years ago

https://gallery.rcpp.org/articles/sparse-matrix-coercion/

> mat <- getMatrixFromProject(ArchRProj = projHeme5, useMatrix = "PeakMatrix")
ArchR logging to : ArchRLogs/ArchR-getMatrixFromProject-3971641740b3a6-Date-2021-11-30_Time-08-06-45.log
If there is an issue, please report to github with logFile!
2021-11-30 08:06:48 : Organizing colData, 0.053 mins elapsed.
2021-11-30 08:06:48 : Organizing rowData, 0.053 mins elapsed.
2021-11-30 08:06:48 : Organizing rowRanges, 0.053 mins elapsed.
2021-11-30 08:06:48 : Organizing Assays (1 of 1), 0.053 mins elapsed.
2021-11-30 08:06:48 : Constructing SummarizedExperiment, 0.056 mins elapsed.
2021-11-30 08:06:50 : Finished Matrix Creation, 0.081 mins elapsed.

> impute_mat <- imputeMatrix(mat = assay(mat), imputeWeights = getImputeWeights(projHeme5))
Getting ImputeWeights
ArchR logging to : ArchRLogs/ArchR-imputeMatrix-3971645fe6d0bc-Date-2021-11-30_Time-08-08-39.log
If there is an issue, please report to github with logFile!

> str(impute_mat)
Formal class 'dgeMatrix' [package "Matrix"] with 4 slots
  ..@ x       : num [1:1469993400] 0.044051 0.048376 0.000555 0.001322 0.008135 ...
  ..@ Dim     : int [1:2] 143400 10251
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:10251] "scATAC_BMMC_R1#TTATGTCAGTGATTAG-1" "scATAC_BMMC_R1#AAGATAGTCACCGCGA-1" "scATAC_BMMC_R1#GCATTGAAGATTCCGT-1" "scATAC_BMMC_R1#TATGTTCAGGGTTCCC-1" ...
  ..@ factors : list()

> library(Matrix)
> dgc_imput_mat <- as(impute_mat, "dgCMatrix")
> str(dgc_imput_mat)

brianpenghe commented 2 years ago

Yes, I used as(IMatrix, "dgCMatrix") but got:

Error in asMethod(object) : 
  dense_to_Csparse(<LARGE>): cholmod_l_dense_to_sparse failure status=-4

One hypothesis is that my matrix is too big to be converted, but the error message doesn't state the real reason. I checked the contents and there aren't any infinite values. In the end I saved the matrix to a text file and used Python to convert it to a sparse matrix. Thanks for your patience in helping!
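For anyone who hits the same error: one thing that might be worth trying is converting the dense matrix a block of columns at a time, so that no single coercion has to handle the whole matrix. Below is a minimal sketch using only the Matrix package. The helper name dense_to_sparse_chunked and the chunk_size argument are made up for illustration, and this has not been confirmed to avoid the cholmod failure above. Note also that imputed values are mostly non-zero, so a sparse copy may save little memory, and as far as I know Matrix's sparse classes cannot hold more than 2^31 - 1 non-zero entries.

```r
library(Matrix)

# Hypothetical helper: coerce a dense Matrix (e.g. a dgeMatrix) to
# column-compressed sparse form one block of columns at a time,
# then bind the sparse blocks back together.
dense_to_sparse_chunked <- function(mat, chunk_size = 1000L) {
  starts <- seq(1L, ncol(mat), by = chunk_size)
  chunks <- lapply(starts, function(s) {
    e <- min(s + chunk_size - 1L, ncol(mat))
    # drop = FALSE keeps a one-column block as a matrix
    as(mat[, s:e, drop = FALSE], "CsparseMatrix")
  })
  do.call(cbind, chunks)
}

# Toy stand-in for the imputed matrix: a small dense 3 x 4 Matrix
toy <- Matrix(c(0, 1.5, 0, 0, 2, 0, 0, 0, 3, 0, 0, 4), nrow = 3, sparse = FALSE)
sp  <- dense_to_sparse_chunked(toy, chunk_size = 2L)
class(sp)
```

On a real imputed matrix you would pass impute_mat instead of toy and pick a chunk_size that fits in memory.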

A-legac45 commented 3 weeks ago

Unfortunately, we lose the gene names in the matrix; I wanted to add it to my Seurat object as an assay.

Do you have any idea how to do this?
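A possible approach, offered as an untested sketch: the str() output earlier in the thread shows the imputed matrix has NULL row names, so the names have to be copied back from the object returned by getMatrixFromProject. For a GeneScoreMatrix, the gene symbols should be in rowData(seGS)$name. The helper label_imputed below is hypothetical (not part of ArchR), and the commented ArchR/Seurat lines assume a project called ArchRProj and a Seurat object called seurat_obj whose cell names match the ArchR cell names.

```r
# Hypothetical helper: attach gene and cell names to an imputed matrix.
label_imputed <- function(imputed, gene_names, cell_names = colnames(imputed)) {
  stopifnot(length(gene_names) == nrow(imputed),
            length(cell_names) == ncol(imputed))
  # make.unique() guards against duplicated gene symbols
  dimnames(imputed) <- list(make.unique(as.character(gene_names)), cell_names)
  imputed
}

# Toy demonstration with a 2 x 2 matrix that has no row names
toy <- matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2,
              dimnames = list(NULL, c("cellA", "cellB")))
labelled <- label_imputed(toy, c("GATA1", "CD34"))
rownames(labelled)

# With an ArchR project (not run here; assumes the objects named above):
# seGS  <- getMatrixFromProject(ArchRProj, useMatrix = "GeneScoreMatrix")
# matGS <- imputeMatrix(assay(seGS), getImputeWeights(ArchRProj))
# matGS <- label_imputed(matGS, rowData(seGS)$name, colnames(seGS))
# seurat_obj[["ImputedGS"]] <- Seurat::CreateAssayObject(data = as.matrix(matGS))
```

Seurat may rewrite feature names that contain underscores, so check the assay's row names after adding it.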