KlugerLab / ALRA

Imputation method for scRNA-seq based on low-rank approximation
MIT License

ALRAChooseKPlot throws error #24

Open Rohit-Satyam opened 1 year ago

Rohit-Satyam commented 1 year ago

Hello Developers and Maintainers!! @mojaveazure @rcannood @JunZhao1990 @inoue0426 @linqiaozhi

I ran into an error while trying to use the ALRAChooseKPlot function, as follows:

> Assays(t)
[1] "RNA"        "integrated"
> DefaultAssay(t)
[1] "RNA"
> imput <- SeuratWrappers::RunALRA(t,assay = "RNA",slot = "data",k.only = T)
Chose rank k = 40, WITHOUT performing ALRA
Warning message:
In asMethod(object) :
  sparse->dense coercion: allocating vector of size 1.9 GiB
> ggouts <- ALRAChooseKPlot(imput)
Error in data.frame(x = 2:length(x = d), y = pvals) : 
  arguments imply differing number of rows: 99, 0
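
For anyone reading along, the message itself can be reproduced in base R without Seurat. This is only a sketch of the mechanism, not a diagnosis: `d` and `pvals` below are hypothetical stand-ins, and the assumption is that the plotting function ends up with 100 singular values but an empty `pvals` vector.

```r
# Hypothetical stand-ins; these are NOT read from a Seurat object.
d <- seq(100, 1)        # 100 values, like the singular values kept for the plot
pvals <- numeric(0)     # assumed to be empty, as the "99, 0" in the message suggests

# data.frame() refuses to combine a length-99 column with a length-0 column.
msg <- tryCatch(
  {
    data.frame(x = 2:length(d), y = pvals)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# A defensive length check that would flag the mismatch before plotting.
lengths_match <- length(pvals) == length(d) - 1
print(lengths_match)
```

If that assumption holds, the thing to check is why the values used for the y axis are empty in the object returned with `k.only = T`.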

Also, the k value differs from the one chosen when I run ALRA directly with RunALRA, where k = 29:

> imput <- SeuratWrappers::RunALRA(t,assay = "RNA",slot = "data")
Rank k = 29
Identifying non-zero values
Computing Randomized SVD
Find the 0.001000 quantile of each gene
Thresholding by the most negative value of each gene
Scaling all except for 1433 columns
0.00% of the values became negative in the scaling process and were set to zero
The matrix went from 0.50% nonzero to 17.98% nonzero
Setting default assay as alra
Warning messages:
1: In asMethod(object) :
  sparse->dense coercion: allocating vector of size 1.9 GiB
2: In asMethod(object) :
  sparse->dense coercion: allocating vector of size 1.9 GiB

When I use the functions from your package directly, the suggested k value is different again:

A_norm <- normalize_data(t(as.matrix(GetAssayData(t,slot = 'count',assay = 'RNA'))))
k_choice <- choose_k(A_norm)

library(ggplot2)
library(gridExtra)
df <- data.frame(x=1:100,y=k_choice$d)
g1<-ggplot(df,aes(x=x,y=y),) + geom_point(size=1)  + geom_line(size=0.5)+ geom_vline(xintercept=k_choice$k)   + theme( axis.title.x=element_blank() ) + scale_x_continuous(breaks=seq(10,100,10)) + ylab('s_i') + ggtitle('Singular values')
df <- data.frame(x=2:100,y=diff(k_choice$d))[3:99,]
g2<-ggplot(df,aes(x=x,y=y),) + geom_point(size=1)  + geom_line(size=0.5)+ geom_vline(xintercept=k_choice$k+1)   + theme(axis.title.x=element_blank() ) + scale_x_continuous(breaks=seq(10,100,10)) + ylab('s_{i} - s_{i-1}') + ggtitle('Singular value spacings')
grid.arrange(g1,g2,nrow=1)

[Image: plots of the singular values and the singular value spacings]

I also ran your package's choose_k function on the Seurat-normalized data, and the result differs again. This time, however, k matches the value I get from SeuratWrappers::RunALRA(t,assay = "RNA",slot = "data",k.only = T):

A_norm <- t(as.matrix(GetAssayData(t,slot = 'data',assay = 'RNA')))
Warning message:
In asMethod(object) :
  sparse->dense coercion: allocating vector of size 1.9 GiB
> k_choice <- choose_k(A_norm)
> k_choice$k
[1] 40
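
For readers comparing the k values above, here is a minimal base-R sketch of a spacing-based rank heuristic. It assumes choose_k works roughly like this: take the gaps between consecutive singular values, treat the last gaps as noise, and pick the largest index whose gap is more than `thresh` standard deviations above the noise mean. The function name `choose_k_from_d`, the defaults `noise_start = 80` and `thresh = 6`, and both spectra are assumptions made for illustration, not the package source.

```r
# Sketch only: operates on a vector of singular values, not on a matrix.
choose_k_from_d <- function(d, noise_start = 80, thresh = 6) {
  diffs <- d[-length(d)] - d[-1]                    # gaps between consecutive values
  noise <- diffs[(noise_start - 1):length(diffs)]   # gaps assumed to be pure noise
  num_of_sds <- (diffs - mean(noise)) / sd(noise)   # gap size in noise standard deviations
  list(k = max(which(num_of_sds > thresh)), num_of_sds = num_of_sds)
}

# Hypothetical spectrum 1: 5 well-separated values, then 95 slowly decaying ones.
steps1 <- rep(c(0.01, 0.02, 0.03), length.out = 94)
d1 <- c(100, 80, 60, 40, 20, 3 - c(0, cumsum(steps1)))

# Hypothetical spectrum 2: 7 well-separated values, then 93 slowly decaying ones.
steps2 <- rep(c(0.01, 0.02, 0.03), length.out = 92)
d2 <- c(100, 80, 60, 40, 20, 15, 10, 3 - c(0, cumsum(steps2)))

k1 <- choose_k_from_d(d1)$k
k2 <- choose_k_from_d(d2)$k
print(c(k1, k2))
```

Under these assumptions the chosen k depends entirely on the spectrum of the matrix that is passed in, so matrices normalized in different ways (normalize_data on counts versus the Seurat data slot) need not give the same k.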

driesdewit commented 11 months ago

Hi, I also get this error when running ALRAChooseKPlot:

Error in data.frame(x = 2:length(x = d), y = pvals) : arguments imply differing number of rows: 99, 0

Have you by any chance found a reason or solution?

Thanks in advance!