kokitsuyuzaki / metaSeq

Meta-analysis package for RNA-seq count data

Methodology query #2

Open bpxl22 opened 3 years ago

bpxl22 commented 3 years ago

Hi @kokitsuyuzaki,

Thanks for making this nice R package. I have been trying to understand the steps of the method used in metaSeq, as I want to use it for an analysis. From the documentation, I am having trouble understanding what exactly the meta.oneside.noiseq function computes. The original NOISeq computes the probability that a gene is DE given its expression levels in the conditions considered, and the probability that it is non-DE given the same, which is two-sided. Is metaSeq instead computing the probability that a gene is DE and up-regulated, or DE and down-regulated? Also, how does the tool handle genes with inconsistent expression (up in some studies and down in others)? This is not clear from the vignette, and any help in understanding it would be much appreciated so that I can use the tool.

Thanks in advance for your response.

kokitsuyuzaki commented 3 years ago

We have modified part of the original NOISeq so that it performs a one-sided test. As a result, a NOISeq probability is output separately for the up-regulated direction and for the down-regulated direction. Note that the NOISeq probability is the probability of being a DEG, not a p-value, so do not confuse the two. Genes whose expression does not match (goes up and down in different studies) are not reproducible in either the up-regulated or down-regulated direction, and might be estimated as a low probability when integrated by the Fisher method.

bpxl22 commented 3 years ago

Thanks for your response. If I understand the first part correctly, the method produces two one-sided probabilities (of being a DEG and up-regulated, and of being a DEG and down-regulated) for each gene in each study. So, we get two different lists of probabilities at the end for each study. Both these lists are combined separately, using the Fisher or Stouffer's method. Is my understanding right, or am I missing something?

In the second part, you mention "Genes whose expression does not match (goes up and down in different studies) are not reproducible in either up-regulated or down-regulated direction". I do not quite understand what you mean by "not reproducible". Does metaSeq filter out these genes at some step, or are they retained but always turn out to be insignificant once the probabilities are integrated by Fisher's or Stouffer's method? Any detail on this would be extremely helpful. Thank you.

kokitsuyuzaki commented 3 years ago

"So, we get two different lists of probabilities at the end for each study. Both these lists are combined separately, using the Fisher or Stouffer's method."

Yes, your understanding is correct.
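To make that workflow concrete, here is a minimal sketch in Python (not R, and not metaSeq's own code). It assumes each study has already produced two one-sided, p-value-like scores per gene, one for the up direction and one for the down direction. The gene names, the numbers, and the helper `fisher_combine` are all hypothetical, and how metaSeq converts NOISeq probabilities before combining them is not shown.

```python
import math

def fisher_combine(pvals):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k df under the null."""
    k = len(pvals)
    half_x = -sum(math.log(p) for p in pvals)  # this is X / 2
    # Closed-form chi-square survival function, valid for even df (here 2k).
    return math.exp(-half_x) * sum(half_x ** i / math.factorial(i) for i in range(k))

# Hypothetical one-sided scores per gene across three studies (illustrative only).
# geneA is consistently up; geneB is up in study 1, down in study 2, flat in study 3.
upper = {"geneA": [0.01, 0.02, 0.005], "geneB": [0.01, 0.98, 0.5]}
lower = {"geneA": [0.99, 0.98, 0.995], "geneB": [0.99, 0.02, 0.5]}

# The two lists are combined separately, one result per direction.
combined_upper = {gene: fisher_combine(p) for gene, p in upper.items()}
combined_lower = {gene: fisher_combine(p) for gene, p in lower.items()}

for gene in upper:
    print(gene, "up:", combined_upper[gene], "down:", combined_lower[gene])
```

With these made-up inputs, geneA (consistently up) gets a far smaller combined upper-tail value than geneB (up in one study, down in another), and neither gene looks convincing in the down direction.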

"not reproducible"

I mean that one-tailed NOISeq is direction-aware (up/down), which is why it does not produce false-positive DEGs that are inconsistent in direction yet become significant in the meta-analysis. In widely used DEG methods such as DESeq2 or edgeR, a small p-value can mean either of two things: an up-regulated DEG or a down-regulated DEG. So even if a small p-value in study A indicates an up-regulated DEG and a small p-value in study B indicates a down-regulated DEG, such a gene might still be considered significant after integration by meta-analysis.
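The contrast can be sketched numerically. The Python example below is an illustration only: it uses ordinary p-values and an unweighted Stouffer combination rather than NOISeq probabilities or metaSeq's code, and the gene, its p-values, and the helper `stouffer_combine` are hypothetical.

```python
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution

def stouffer_combine(pvals):
    """Unweighted Stouffer's method: average the z-scores, then convert back to a p-value."""
    z_scores = [_norm.inv_cdf(1.0 - p) for p in pvals]
    z_combined = sum(z_scores) / len(z_scores) ** 0.5
    return 1.0 - _norm.cdf(z_combined)

# Hypothetical gene: up in study A, down in study B, two-sided p = 0.01 in both.
two_sided = [0.01, 0.01]      # direction is lost, both studies simply look "significant"
up_sided = [0.005, 0.995]     # one-sided p-values for the up direction
down_sided = [0.995, 0.005]   # one-sided p-values for the down direction

p_two = stouffer_combine(two_sided)
p_up = stouffer_combine(up_sided)
p_down = stouffer_combine(down_sided)

print("direction-blind:", p_two)
print("up:", p_up, "down:", p_down)
```

Here the direction-blind combination comes out at roughly 0.0005, while both one-sided combinations come out at about 0.5 because the opposite-signed z-scores cancel.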