LauraMCE / lncRNA_BC

This repository contains information about my master's project. The main topic is lincRNAs as biomarkers in breast cancer. The main objective is to identify lincRNA biomarkers by transcriptome analysis.

Modify statistical parameters to implement GSEA analysis on DESeq results #15

Closed LauraMCE closed 4 years ago

LauraMCE commented 4 years ago

Hi! I have an issue that I want to discuss with you, hoping that you can give me some adequate advice.

As you know, I want to find the lincRNAs that are differentially expressed in resistant patients, and I also want to learn how these lincRNAs are involved in resistance processes. The simplest way to answer that is to look at every differentially expressed lincRNA, but most of my differentially expressed lincRNAs are only annotated, with no functional or biological information about them. For that reason, I thought it could be a good idea to perform a gene set enrichment analysis to find out what kinds of processes these lincRNAs could be involved in.

To do that, I was using the SeqGSEA package in DE-only analysis mode in R, but when I run it, this warning appears:

Warning message:
In .local(object, ...) :
  in estimateDispersions: sharingMode=='gene-est-only' will cause inflated numbers of false positives unless you have many replicates.

If I continue running the analysis anyway, it stops at the permutation step and then this appears:

DEpermNBstat <- DENBStatPermut4GSEA(DEG, permuteMat)
There were 50 or more warnings (use warnings() to see the first 50)

And no object is created.

Here you can find my count table for the differential expression analysis.

And here is my script.

I want to know if you could help me with the GSEA analysis: whether you know another R package for GSEA, how to avoid the replicates problem, or another way to perform this analysis.

Thanks!

FernandaDiaz12 commented 4 years ago
Hi Laura.

There are two problems with this issue:

  1. Your script runs with DESeq, but your previous analyses were run with DESeq2, so the results probably won't be comparable. Check this paper.

  2. If the problem is the number of replicates, then you have to change the "Metric for ranking genes" parameter, because the default, Signal2Noise, requires at least three samples for each phenotype.

Check the GSEA documentation.
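
To illustrate why the ranking metric matters with few replicates, here is a minimal Python sketch (the scripts in this thread are in R; Python is used here only to keep the example dependency-free). It compares the plain signal-to-noise formula with a log2 ratio of class means. The gene names, values, and function names are hypothetical, and GSEA's own implementation adjusts the standard deviations in ways that are not reproduced here.

```python
import math
import statistics


def signal_to_noise(a, b):
    """Plain (mean_a - mean_b) / (sd_a + sd_b).

    Assumption: the three-sample guard mirrors the requirement quoted
    above; GSEA's additional sigma adjustments are NOT reproduced.
    """
    if len(a) < 3 or len(b) < 3:
        raise ValueError("signal-to-noise needs at least three samples per phenotype")
    return (statistics.mean(a) - statistics.mean(b)) / (
        statistics.stdev(a) + statistics.stdev(b)
    )


def log2_ratio_of_classes(a, b):
    """log2(mean_a / mean_b): uses class means only, so no spread estimate is needed."""
    return math.log2(statistics.mean(a) / statistics.mean(b))


def rank_genes(expression, cols_a, cols_b):
    """Rank genes from most up in class A to most up in class B."""
    scores = {}
    for gene, values in expression.items():
        a = [values[i] for i in cols_a]
        b = [values[i] for i in cols_b]
        scores[gene] = log2_ratio_of_classes(a, b)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical normalized counts: column 0 = resistant, column 1 = sensitive.
toy_expression = {
    "lincRNA_A": [40.0, 10.0],
    "lincRNA_B": [5.0, 20.0],
    "lincRNA_C": [12.0, 12.0],
}
ranking = rank_genes(toy_expression, cols_a=[0], cols_b=[1])
for gene, score in ranking:
    print(gene, score)
```

With one sample per phenotype, `signal_to_noise` has no standard deviation to work with, while the ratio-based score can still order the genes. A ranking built from so few samples should be treated as exploratory.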

:)

LauraMCE commented 4 years ago
Hi, @FernandaDiaz12! Thank you for the advice. I solved the problem with the GSEA analysis.

First, I had to create my own GCT file with the normalized counts of each sample and then specify the phenotype labels. I ran the GSEA analysis with the ranked option and it worked! The result looks like this:

image
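
For anyone repeating the GCT step, here is a minimal Python sketch of the two text files involved: a GCT expression file and a CLS file with the phenotype labels. The sample names, gene names, counts, and function names are hypothetical placeholders, not the data or script from this project. The same layout can be written from R as tab-delimited text.

```python
def gct_text(sample_names, rows):
    """Build GCT v1.2 content.

    rows: list of (gene_name, description, [one value per sample]).
    """
    lines = [
        "#1.2",                                   # format version line
        f"{len(rows)}\t{len(sample_names)}",      # number of genes, number of samples
        "\t".join(["NAME", "Description", *sample_names]),
    ]
    for name, description, values in rows:
        if len(values) != len(sample_names):
            raise ValueError(f"{name}: expected {len(sample_names)} values")
        lines.append("\t".join([name, description, *(str(v) for v in values)]))
    return "\n".join(lines) + "\n"


def cls_text(labels):
    """Build categorical CLS content from one label per sample, in GCT column order."""
    classes = list(dict.fromkeys(labels))         # unique labels, first-appearance order
    return "\n".join([
        f"{len(labels)} {len(classes)} 1",        # samples, classes, always 1
        "# " + " ".join(classes),                 # class names
        " ".join(labels),                         # label of each sample
    ]) + "\n"


# Hypothetical example: four samples, two phenotypes.
samples = ["S1", "S2", "S3", "S4"]
phenotypes = ["resistant", "resistant", "sensitive", "sensitive"]
counts = [
    ("lincRNA_A", "na", [103.2, 98.7, 12.1, 15.4]),
    ("lincRNA_B", "na", [7.5, 9.1, 44.0, 51.3]),
]

gct = gct_text(samples, counts)
cls = cls_text(phenotypes)
print(gct)
print(cls)
```

The labels in the CLS file must follow the same sample order as the columns of the GCT file, otherwise the phenotypes are assigned to the wrong samples.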

Thank you so much!!!