KrishnaswamyLab / phateR

PHATE dimensionality reduction method implemented in R
GNU General Public License v2.0

Sparse matrix input to PHATE #40

Closed syouligan closed 4 years ago

syouligan commented 4 years ago

Hi there

I was wondering if there is a workflow by which a sparse matrix (or single-cell object) can be supplied as PHATE input in R. I am currently working with a single-cell dataset of 160000 cells x 16000 genes, and converting the data into something that PHATE can handle is not a trivial task resource-wise. Alternatively, one of the PHATE tutorials mentions that PCA values can be supplied to PHATE. Is there an example somewhere showing how to do this?

Thanks

scottgigante commented 4 years ago

Hi @syouligan, phateR should work fine with sparse matrices in R. Here's an example:

> library(Matrix)
> library(phateR)
> data <- Matrix(rbinom(10000, 5, 0.05), nrow = 100, ncol=100, sparse=TRUE)
> data
100 x 100 sparse Matrix of class "dgCMatrix"

[1,] 1 . 1 1 . . . . . . . . 1 . . 1 . . . 2 1 . . . . 1 . . . . . . . . . . 1 . . 2 . . . . 1 1 . . . . . 1 . . . . . . . . ......
[2,] 1 . . . . . . . . . 1 . 2 . 1 . . . 1 . . . . 2 1 1 . 1 . . . . . . 1 . . . 1 . . 1 . . . . . . 2 . . . . . . . . 1 . . ......
[3,] 1 1 . . . . . . . . . . 2 . . . 1 . 2 . . . 1 1 . . . . . . . . . 1 . 1 2 . . . 1 . . . . . . . . . . 1 . . . . . 1 . 1 ......
[4,] . . . . . . . 1 . . 1 1 . 1 1 . . . . 1 1 . . 1 . . . . 1 . 1 . . . 1 1 . . . . . . . . . . . . . . . . . . 1 . . . . 1 ......
[5,] 1 . . . . . . 1 . . . 1 . 1 1 3 . . . . . . . . . . . . . . . . . . . 1 . 1 . . . . . . . . . . . 1 . . . . . . . 1 . . ......
[6,] 1 2 . 1 . . 1 1 . . 1 1 . . 1 2 . . . . 1 . . . . 1 1 . 1 . 1 . . . . . . . . . . . . . . . 1 . . . 1 . . 1 1 . 1 . . . ......
[7,] . . . . . 1 . . . . . . . . . . . . . . . 1 . . . . 1 . 2 . . . . . 1 . . . 1 1 . . . . . . . . . . . 1 . . . . . 1 2 1 ......
[8,] 2 . . . 1 . . . . 1 1 . 1 . . . . . . . 1 . . 1 . . . . . 1 1 . . . . 2 . . . . . . 1 . . 1 1 1 . . 1 . . . 1 . 1 . . 1 ......

 ..............................
 ........suppressing 40 columns and 84 rows in show(); maybe adjust 'options(max.print= *, width = *)'
 ..............................

 [93,] 1 . . . . 2 . . . . 1 . . . 1 . . . 1 . 1 . . . . . . . . . . 1 . . 2 . . . . 1 . 1 . . . . . . . . . . . . . 1 . 1 1 1 ......
 [94,] . . . . . . . . 1 . 2 . . . 1 . 1 . . . . 1 . 2 1 . . . . . . 1 . . . . . 1 1 1 . . . . . . . . . . . . . . 1 1 1 . . 1 ......
 [95,] 1 . . . . 2 1 . . . . . . . . . . . . . 3 . . 2 . . . . 1 . 1 . . . . . . . 1 1 . . . . 2 . . . 1 1 . . . . . . . . . . ......
 [96,] . . . 1 . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . 1 . . . . . . . . . 1 . . . . . . . . 1 . . . ......
 [97,] 1 . 1 . . . . 1 . . . . . . . . . . . 1 1 . . . . . . . 2 1 . . . . . . . . . . . 1 1 1 1 . 1 . . . . . . . . . . . . . ......
 [98,] . . . 1 . . 1 . . . 1 . 1 . . . . . . . 2 1 . 1 . 1 . . . . . . . 1 . 1 . . . . . . . . . . . . . . . . . . . . . . . . ......
 [99,] 1 . . 1 . . 1 . . . . . . . . . . . . . . . . . . . 2 1 1 . 1 . 1 . 1 1 3 . . . . 2 . . 1 1 1 1 . 1 . . . 1 1 . . . . . ......
[100,] . . . . . . . . . . . . . 1 . . . . . . . . . 1 . . . 1 . . . . . . . . . . . . . 1 . . . . . 1 . . . . . . . . . . . . ......
> phate(data)
Calculating PHATE...
  Running PHATE on 100 points and 100 features.
  Calculating graph and diffusion operator...
    Calculating KNN search...
    Calculating affinities...
  Calculating optimal t...
    Automatically selected t = 10
  Calculated optimal t in 0.03 seconds.
  Using 30 significant diffusion components
  Calculating diffusion potential...
  Calculated diffusion potential in 0.03 seconds.
  Calculating metric MDS...
  Calculated metric MDS in 0.08 seconds.
Calculated PHATE in 0.16 seconds.
PHATE embedding with elements
  $embedding : (100, 2)
  $operator : Python PHATE operator
  $params : list with elements (data, knn, decay, t, n.landmark, gamma, ndim, mds.solver, npca, mds.method, knn.dist.method, mds.dist.method)

If the data exists as a sparse matrix in your original data format, then you should be able to do something like this. How are you loading your data?
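If the data starts out dense, a minimal sketch of the conversion looks like the following. It uses only the Matrix package; the simulated matrix and the names `dense_counts` and `sparse_counts` are made up for illustration, and the `phate()` call is left commented out because it needs the Python backend.

```r
library(Matrix)

set.seed(1)
# Simulated count matrix (illustrative only): 200 samples x 300 features, mostly zeros.
dense_counts <- matrix(rbinom(200 * 300, 5, 0.01), nrow = 200, ncol = 300)

# Convert to a sparse matrix; only the non-zero entries are stored.
sparse_counts <- Matrix(dense_counts, sparse = TRUE)

# Compare the memory footprint of the two representations.
dense_bytes <- as.numeric(object.size(dense_counts))
sparse_bytes <- as.numeric(object.size(sparse_counts))
print(c(dense = dense_bytes, sparse = sparse_bytes))

# The sparse matrix can then be passed on directly, as in the session above:
# phate.out <- phateR::phate(sparse_counts)
```

The saving depends on how many entries are zero, so it is worth checking on your own data before relying on it.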

syouligan commented 4 years ago

Hey @scottgigante, I am using a SingleCellExperiment object, but I just realised my mistake. The required input format of samples x genes meant that the data had to be transposed. The base::t() function I was using can't handle a sparse matrix input, so I was having to convert to a dense matrix. But I just realised the Matrix::t() function can. So it wasn't a problem with PHATE at all. Sorry about the confusion and thanks for the help.

> library("Matrix")
> library("phateR")
> class(filtered_exp)
[1] "SingleCellExperiment"
attr(,"package")
[1] "SingleCellExperiment"
> phate.out <- phate(Matrix::t(Matrix(assay(filtered_exp, "logcounts"), sparse = TRUE)))
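For anyone who wants to check the transpose step without a SingleCellExperiment object, here is a small sketch using only the Matrix package. The random matrix is a hypothetical stand-in for the genes x cells assay, and the variable names are illustrative rather than taken from the real dataset.

```r
library(Matrix)

set.seed(1)
# Hypothetical stand-in for the genes x cells assay: 50 genes (rows) x 20 cells (columns).
genes_by_cells <- rsparsematrix(nrow = 50, ncol = 20, density = 0.1)

# Matrix::t() transposes the sparse matrix and returns a sparse result.
cells_by_genes <- Matrix::t(genes_by_cells)

print(dim(cells_by_genes))
print(class(cells_by_genes))
```

The result has cells as rows and genes as columns, which is the samples x genes orientation mentioned above.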

scottgigante commented 4 years ago

Glad to hear it's solved!