statOmics / tradeSeq

TRAjectory-based Differential Expression analysis for SEQuencing data
Other
239 stars 28 forks source link

Association of gene expression with pseudotime #249

Open A-legac45 opened 9 months ago

A-legac45 commented 9 months ago

Hello, I have problem to understand properly the scale of the pseudotime heatmap going from 2 to -2, after selecting genes with a menlogFC >=1 and p.adj <= 0.05

image

The gene expression is the following for example the first square

image

Moreover does there is a way to make the same analysis base on predicted score expression of mutliomic data ?

HectorRDB commented 9 months ago

Hi, In the above heatmap, I assume that each row has been scaled to have mean 0 and variance 1, but I cannot be sure without code. Similarly, the umap without the trajectory is hard to understand.

For multi-omic data, we don't have a tool yet, since there is less consensus on the stat model to use (negative binomial for gene counts). However, you can choose your own model via the "family" argument of FitGam. All downstream tests will work fine.

A-legac45 commented 9 months ago

Thanks for your answer I am not sure about the scaling step I would like to have value on my heatmap that reflect the differential of expression with the higher value where it is express. My code is the following :

I extract a lineage

set.seed(6) fitgam_1 <- extract.DE.fitgam.from.slingshot(SeuratObject, SEURAT_slg, select.trajectory = 1, nknots = 7, integration = FALSE, test = NULL, slot = "data", assays = "RNA")

than perform

fitgam_1_assres <- associationTest(fitgam_1, l2fc = log2(2))

Genes_L1 <- rownames(fitgam_1_assres)[ which(p.adjust(fitgam_1_assres$pvalue, "fdr") <= 0.05) ] length(Genes_L1)

reduce my list of genes in order to be able to plot them

meanLogFC >=2.5

Genes_L1_SUB <- rownames(fitgam_1_assres)[ which(fitgam_1_assres$meanLogFC >= 1 & p.adjust(fitgam_1_assres$pvalue, "fdr") <= 0.05)
]

length(Genes_L1_SUB)

I isolate a list of genes to plot by selecting the one which are different between the two lineage associationTest results :

TEST_L1 <- Reduce(setdiff, list(Genes_L1_SUB, Genes_L3_SUB))

then plot them by using this function:

yhatSmooth <- predictSmooth(fitgam_1, gene = TEST_L1, nPoints = 50, tidy = FALSE) heatSmooth <- pheatmap(t(scale(t(yhatSmooth[, 1:10]))), cluster_cols = FALSE, show_rownames = TRUE, show_colnames = TRUE)

A-legac45 commented 9 months ago

For the multi-omic data, I am not sure to understand properly this part "model via the "family" argument of FitGam". I have multiomic 10x data that means for each cells I have also the predicted expression value according to the opening of the chromatin will it be possible to perform trajectory analysis with this type of data?

A-legac45 commented 9 months ago

Hi, In the above heatmap, I assume that each row has been scaled to have mean 0 and variance 1, but I cannot be sure without code. Similarly, the umap without the trajectory is hard to understand.

For multi-omic data, we don't have a tool yet, since there is less consensus on the stat model to use (negative binomial for gene counts). However, you can choose your own model via the "family" argument of FitGam. All downstream tests will work fine.

Thanks for your answer I am not sure about the scaling step I would like to have value on my heatmap that reflect the differential of expression with the higher value where it is express. My code is the following :

koenvandenberge commented 4 months ago

Hi @A-legac45

Your code indeed confirms what Hector was suggesting: You are scaling the gene expression to zero mean and unit variance. This happens in the following piece of code you provided t(scale(t(yhatSmooth[, 1:10]))), where the scale function scales the fitted gene expression values.

A-legac45 commented 4 months ago

Hi @koenvandenberge

I am confuse because my heatmap do not reflect what I observe for a gene expression how can I handle it properly?

A-legac45 commented 4 months ago
image

For L3

image

And for the first genes LY6C2 wich is red (2) and going to blue (-2) image

koenvandenberge commented 4 months ago

Hi @A-legac45

There used to be a bug in predictSmooth, maybe you can try using the tidy=TRUE argument in predictSmooth and making the heatmap yourself. Alternatively, you could update tradeSeq to the latest version.

A-legac45 commented 4 months ago

So you mean @koenvandenberge I still extract my lineage L3 for example

run fitgam on IT

set.seed(6) fitgam_1 <- extract.DE.fitgam.from.slingshot(SeuratObject, SEURAT_slg, select.trajectory = 1, nknots = 7, integration = FALSE, test = NULL, slot = "data", assays = "RNA")

than perform

fitgam_1_assres <- associationTest(fitgam_1, l2fc = log2(2))

Genes_L1 <- rownames(fitgam_1_assres)[ which(p.adjust(fitgam_1_assres$pvalue, "fdr") <= 0.05) ] length(Genes_L1)

reduce my list of genes in order to be able to plot them

meanLogFC >=2.5

Genes_L1_SUB <- rownames(fitgam_1_assres)[ which(fitgam_1_assres$meanLogFC >= 1 & p.adjust(fitgam_1_assres$pvalue, "fdr") <= 0.05) ]

I isolate a list of genes to plot by selecting the one which are different between the two lineage associationTest results :

TEST_L1 <- Reduce(setdiff, list(Genes_L1_SUB, Genes_L3_SUB))

**AND AT THIS STEP O NEED TO MODIFY MY SCRIPT TO HAVE A TRUE HEATMAP AT LEAST REFLECTING THE REAL EXPRESSION VARIATION OCCURRING DURING THE PSEUDOTIME?**

Then plot them by using this function:

yhatSmooth <- predictSmooth(fitgam_1, gene = TEST_L1, nPoints = 50, tidy = FALSE)

WHAT DOES tidy=TRUE WILL DO?

could you tell me how to make the proper heatmap visualisation ??

heatSmooth <- pheatmap(t(scale(t(yhatSmooth[, 1:10]))), cluster_cols = FALSE, show_rownames = TRUE, show_colnames = TRUE)`

I have another question in my head , is there a may to compare 2 OR 3 lineages according to their position on the umap ?? Like in the square for example

image
A-legac45 commented 4 months ago

thanks for your advise