JEFworks-Lab / STdeconvolve

Reference-free cell-type deconvolution of multi-cellular spatially resolved transcriptomics data
http://jef.works/STdeconvolve/

How do I view all log2FC genes? #51

Closed M454yuki closed 2 months ago

M454yuki commented 4 months ago

Hi,

I've been following the 10x Visium protocol and it seems to work well, but before I annotate the cell types, I would like to sanity-check the log2FC gene expression. I am trying to pull the log2FC values for all genes, but after following the instructions I can only view the top 6 genes with the following command

"log2topgene2 <- lapply(1:20, function(i) { head(sort(log2(deconGexp[i,]/colMeans(deconGexp[-i,])), decreasing=TRUE)) })"

or

"ps <- lapply(colnames(deconProp), function(celltype) {

celltype <- as.numeric(celltype)

## highly expressed in cell-type of interest
highgexp <- names(which(deconGexp[celltype,] > 3))

## high log2(fold-change) compared to other deconvolved cell-types
log2fc <- sort(log2(deconGexp[celltype,highgexp]/colMeans(deconGexp[-celltype,highgexp])), decreasing=TRUE)
markers <- names(log2fc)[1] ## label just the top gene"

Is there a way to avoid limiting the number of genes and, for example, display the top 200 genes?

bmill3r commented 4 months ago

Hi @M454yuki,

Thanks for reaching out and using STdeconvolve. A few things to keep in mind: First, the deconvolved cell type expression matrix will only have the genes that were overdispersed when filtering the original count matrix for deconvolution. Second, in the code above, the section “highly expressed in cell type of interest” filters out genes below the expression cutoff used (in this case 3). So this could also reduce the number of genes that would be considered highly expressed for a given cell type. Third, at the end, the code is just showing the top gene for each of the cell types.
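A minimal R sketch of the three points above, under stated assumptions: the toy `deconGexp` matrix (cell-types in rows, genes in columns) uses made-up values, and `topLog2fc` with its `n` and `minExpr` arguments is a hypothetical helper, not part of STdeconvolve. It makes the expression cutoff optional and passes `n` explicitly to `head()`, which otherwise returns only 6 elements.

```r
# Toy stand-in for deconGexp (cell-types x genes); values are made up for illustration
set.seed(1)
deconGexp <- matrix(runif(3 * 10, min = 0.1, max = 10), nrow = 3,
                    dimnames = list(1:3, paste0("gene", 1:10)))

# Hypothetical helper: log2 fold-change of one cell-type versus the mean of the others
topLog2fc <- function(gexp, celltype, n = 200, minExpr = NULL) {
  genes <- colnames(gexp)
  if (!is.null(minExpr)) {
    # optional expression cutoff, like the "> 3" filter in the code above
    genes <- genes[gexp[celltype, ] > minExpr]
  }
  fg <- gexp[celltype, genes]
  bg <- colMeans(gexp[-celltype, genes, drop = FALSE])
  log2fc <- setNames(log2(as.numeric(fg) / as.numeric(bg)), genes)
  # head() defaults to n = 6; pass n explicitly to keep more genes
  head(sort(log2fc, decreasing = TRUE), n = n)
}

# Top genes (up to 200) for every cell-type, with no expression cutoff
top <- lapply(seq_len(nrow(deconGexp)), function(i) topLog2fc(deconGexp, i, n = 200))
```

Because the matrix only contains the overdispersed genes kept for deconvolution, asking for 200 genes returns fewer whenever fewer are available.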

Hope this helps, Brendan

M454yuki commented 4 months ago

Hi Brendan,

Thank you for the reply. Yes, I realise that keeping only genes with expression > 3 in these lines:

"highgexp <- names(which(deconGexp[celltype,] > 3))
log2fc <- sort(log2(deconGexp[celltype,highgexp]/colMeans(deconGexp[-celltype,highgexp])), decreasing=TRUE) })"

can lead to some topics showing only negative log2FC genes, which can be an issue when annotating the topic. I've managed to change that rule so that log2FC is computed on all genes instead.

"log2topgene4 <- lapply(1:20, function(i) { sort(log2(deconGexp[i,]/colMeans(deconGexp[-i,])), decreasing=TRUE)[1:200] })"

Here 1:20 is based on the number of topics.

I then picked the top 200 genes from there to do the annotation.
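A hedged sketch of how the per-topic gene lists could be collected into one table for checking against another annotation tool. The toy `deconGexp` values, the `nTop` and `markerTable` names, and the CSV file name are all illustrative assumptions, not part of STdeconvolve. It uses `seq_len(nrow(deconGexp))` rather than a hard-coded `1:20`, and drops non-finite values, on the assumption that zero entries could occur in the matrix.

```r
# Toy stand-in for deconGexp (topics x genes); values are made up for illustration
deconGexp <- rbind(
  "1" = c(geneA = 8, geneB = 1, geneC = 2, geneD = 0),
  "2" = c(geneA = 1, geneB = 6, geneC = 2, geneD = 0),
  "3" = c(geneA = 1, geneB = 1, geneC = 4, geneD = 0)
)
nTop <- 200  # genes to keep per topic

markerTable <- do.call(rbind, lapply(seq_len(nrow(deconGexp)), function(i) {
  log2fc <- log2(deconGexp[i, ] / colMeans(deconGexp[-i, , drop = FALSE]))
  log2fc <- log2fc[is.finite(log2fc)]  # drop 0/0 (NaN) and x/0 (Inf) cases
  log2fc <- head(sort(log2fc, decreasing = TRUE), n = nTop)
  data.frame(topic = rownames(deconGexp)[i], gene = names(log2fc),
             log2fc = unname(log2fc), stringsAsFactors = FALSE)
}))
rownames(markerTable) <- NULL

# write.csv(markerTable, "topic_markers.csv", row.names = FALSE)  # hypothetical file name
```

One design note: indexing with `[1:200]` pads the result with `NA` when a topic has fewer than 200 genes, whereas `head(x, n = 200)` simply returns what is available.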

The reason I wanted to do this was to cross-check against another annotation tool that our collaboration has developed: https://www.immunesinglecell.org/. So far the annotation looks sound.

Otsuka Masayuki, PhD Research Fellow Translational Immunology institute (TII), Singhealth-DukeNUS 20 College Road, The Academia, Level 8 Discovery Tower, Singapore 169856
