xuranw / MuSiC

Multi-subject Single Cell Deconvolution
https://github.com/xuranw/MuSiC
GNU General Public License v3.0

Error: music_prop(): Too few common genes #92

Open hasanalanya opened 2 years ago

hasanalanya commented 2 years ago

Hi all and @xuranw,

I was trying to use the music_prop() function to estimate cell type proportions, but it raises a "too few common genes" error. I know this error can be caused by having fewer than 10% common genes, but how can I overcome this issue?

You can see my music_prop() usage below. I would appreciate any help or tips for overcoming this problem.

Est.prop = music_prop(bulk.eset = bulk.eset, sc.eset = singlecell.eset,
                      clusters = "orig.ident",
                      samples = "X", select.ct = NULL, verbose = FALSE)

Best, Hasan

qnmateen commented 2 years ago

Check the gene names in your single-cell data. They might look like 00R_AC107638.2_ENSMUSG00000111425.1; if so, remove the 00R_AC107638.2_ prefix with gsub(), since your count matrix may contain only the ENSMUSG00000111425.1 part.
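
As a minimal sketch of the prefix removal described above, assuming the gene names end in an Ensembl mouse ID (ENSMUSG...) preceded by an underscore: the vector `ids` is a made-up stand-in for something like rownames() of the single-cell object, and the pattern is an assumption that may need adjusting for other naming schemes.

```r
# Hypothetical gene names; in practice these would come from rownames() of your data
ids <- c("00R_AC107638.2_ENSMUSG00000111425.1", "ENSMUSG00000000001.4")

# Keep only the trailing Ensembl ID (with optional version suffix).
# Names that do not match the pattern are returned unchanged.
clean.ids <- sub("^.*_(ENSMUSG[0-9]+(\\.[0-9]+)?)$", "\\1", ids)

print(clean.ids)
```

After cleaning, it is worth checking for duplicated names (for example with anyDuplicated()) before assigning them back as row names.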

CQUyxj commented 1 year ago

Has this problem been solved, @hasanalanya?

xoelmb commented 1 year ago

Can you check this? It re-sorts and subsets your data to the common genes. It fixed my error, but I think it's weird that this is happening.

c.genes <- sort(intersect(rownames(control.mtx), rownames(sce)))

sce <- sce[c.genes,]

control.mtx <- as.matrix(control.mtx[c.genes,])
# case.mtx <- as.matrix(case.mtx[c.genes,])

dim(sce)
dim(control.mtx)
# dim(case.mtx)
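
To see how large the overlap actually is before subsetting, a small check along these lines may help. This is only a sketch: `gene_overlap` is a hypothetical helper, the gene vectors are toy data, and the 10% figure is the one quoted in the original post rather than something verified against the package source here.

```r
# Hypothetical helper: fraction of shared gene names relative to the smaller set
gene_overlap <- function(bulk.genes, sc.genes) {
  common <- sort(intersect(bulk.genes, sc.genes))
  list(common = common,
       fraction = length(common) / min(length(bulk.genes), length(sc.genes)))
}

# Toy inputs standing in for rownames(control.mtx) and rownames(sce)
bulk.genes <- c("Gapdh", "Actb", "Cd3e", "Cd19", "Ptprc")
sc.genes   <- c("Actb", "Ptprc", "Foxp3", "Cd4")

res <- gene_overlap(bulk.genes, sc.genes)
print(res$common)
print(res$fraction)

# Threshold taken from the original post (assumption, not checked against MuSiC)
if (res$fraction < 0.1) warning("Very few shared gene names; check identifier formats")
```

A fraction near zero usually points to mismatched identifier formats (for example symbols versus Ensembl IDs) rather than a genuinely disjoint gene set.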

This happened in relation to #101

RRCinci-Lu commented 1 year ago

Working through the code, the issue I see appearing is that when Yjg.temp is created by pulling a row from Yjg, R drops all of the column names (creating a matrix of just numbers). When the music.iter.ct command runs, it tries to get names from Yjg.temp (which will be null). That makes common.gene in music.iter.ct null, which takes down the program.

Not sure of the best way to fix it. Setting drop = FALSE in the subset operator doesn't solve the problem, as a submatrix is created instead of a named vector of numbers.
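
The base R behaviour behind this can be illustrated with a toy matrix. This sketch does not reproduce MuSiC's internal code; the object names are made up, and it only shows when names() on an extracted slice comes back NULL, which would leave any gene intersection built from it empty.

```r
# Toy matrix with gene names as row names and samples as column names
m <- matrix(1:6, nrow = 3,
            dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

v   <- m[, 1]                # plain extraction: a named vector
sub <- m[, 1, drop = FALSE]  # drop = FALSE: a 3 x 1 matrix

print(names(v))       # gene names are available via names()
print(names(sub))     # NULL: a matrix carries dimnames, not names
print(rownames(sub))  # the gene names are still there as row names

# If the input has no dimnames at all, the extracted vector has no names either
bare <- unname(m)
common <- intersect(names(bare[, 1]), rownames(m))
print(length(common)) # nothing in common
```

This is why drop = FALSE alone does not help code that reads gene names with names(): the names survive only as row names of the submatrix.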

RRCinci-Lu commented 1 year ago

Huh, converting my matrix to an ExpressionSet and back to a matrix using exprs() fixed the problem for me. Not sure why that would work, but I won't look a gift horse in the mouth.

KeitaSaeki commented 1 year ago

> Huh, converting my matrix to an ExpressionSet and back to a matrix using exprs() fixed the problem for me. Not sure why that would work, but I won't look a gift horse in the mouth.

It also worked for me... What a funny solution...

For those who encounter this issue, here is the relevant part of my script:

library(Biobase)
object <- new("ExpressionSet", exprs = as.matrix(bulk))
bulk <- exprs(object)
Est.mouse.bulk = music_prop.cluster(bulk.mtx = bulk, sc.sce = adata_sce,
                                    groups = 'clusterType', group.markers = markers,
                                    clusters = 'leiden_anno', samples = 'CellID',
                                    clusters.type = clusters.type)

Hope it helps.

piyushjo15 commented 10 months ago

Thanks @KeitaSaeki and @RRCinci-Lu, it also worked for me.