xzhoulab / iDEA

Differential expression (DE); gene set Enrichment Analysis (GSEA); single cell RNAseq studies (scRNAseq)
GNU General Public License v3.0

How to get SE for LogFC #4

Closed (kvshams closed this issue 3 years ago)

kvshams commented 4 years ago

Hi, this may not be directly relevant to iDEA, but it is required for the input. How do I get the SE for the log2FC? Packages like edgeR give only the logFC and p-value. Is there any way to calculate or fetch the SE for this purpose?

YingMa0107 commented 4 years ago

Hi,

I only just noticed this post, since GitHub does not seem to have sent us a notification email.

Here is an example of how to calculate the SE. Assume you have the results in res_edgeR, which contains the "logfc" and "pvalue" columns:

    pvalue   <- res_edgeR$pvalue                      # the p-value column
    zscore   <- qnorm(pvalue/2.0, lower.tail=FALSE)   # convert the two-sided p-value to a z-score
    beta     <- res_edgeR$logfc                       # the logFC you already have
    se_beta  <- abs(beta/zscore)                      # approximate standard error of beta
    beta_var <- se_beta^2                             # the variance of beta (updated to make this clearer)
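The conversion above can be wrapped in a small helper with a round-trip check. This is only a sketch: `pvalue_to_se` is a hypothetical name, not a function from iDEA or edgeR, the numbers are made up, and the two-column layout (`beta`, `beta_var`, genes as row names) is assumed from the description in this thread. It also assumes the p-values are two-sided and that the test statistic is roughly standard normal.

```r
# Hypothetical helper (not part of iDEA or edgeR): approximate the SE of a
# log fold change from a two-sided p-value, assuming an approximately
# standard normal test statistic.
pvalue_to_se <- function(logfc, pvalue) {
  stopifnot(length(logfc) == length(pvalue))
  zscore <- qnorm(pvalue / 2, lower.tail = FALSE)
  se <- abs(logfc / zscore)
  # p == 1 gives z == 0 (SE = Inf or NaN); p == 0 gives z == Inf (SE = 0).
  # Neither is a usable estimate, so mark those entries as missing.
  se[!is.finite(se) | se == 0] <- NA_real_
  se
}

# Round-trip check on made-up numbers: build p-values from known SEs,
# then recover the SEs from the p-values.
beta    <- c(1.5, -0.8, 0.2)
true_se <- c(0.3, 0.4, 0.25)
p       <- 2 * pnorm(-abs(beta / true_se))
se_hat  <- pvalue_to_se(beta, p)
print(all.equal(se_hat, true_se))

# Assumed input layout: one row per gene, effect size and its variance.
summary_data <- data.frame(
  beta     = beta,
  beta_var = se_hat^2,
  row.names = c("geneA", "geneB", "geneC")  # placeholder gene names
)
```

Genes with a p-value of exactly 0 or 1 come back as `NA` here. Whether to drop them or handle them differently is a judgment call that depends on the upstream DE tool.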

Best, Ying

zh-Bian commented 4 years ago

Hi, I noticed that the second column should be the variance of the coefficient for each gene. I assumed the variance of the coefficient equals sd()/mean(); is that the same as logfc/zscore? I calculated the variance of the coefficient as sd()/mean(), but then idea <- iDEA.louis(idea) fails (Error in { : task 10 failed - "argument is of length zero").

So I wonder whether the error is related to the method I used for the variance of the coefficient.

Best, Bian

YingMa0107 commented 4 years ago

Hi Bian, I don't quite follow your question. Do you mean that you calculated the standard error as sd()/mean() and that this then produced the error? I don't know what sd()/mean() represents here. The standard deviation and the standard error are two different concepts. Could you please try the example code that we provided?
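To make the distinction concrete, here is a minimal sketch on made-up values. It assumes that sd()/mean() refers to the quantity usually called the coefficient of variation, which is neither a standard error nor a variance of an estimated coefficient.

```r
x <- c(4.1, 5.3, 6.0, 4.7, 5.9)      # made-up measurements for illustration

sd_x    <- sd(x)                     # standard deviation: spread of the observations
se_mean <- sd_x / sqrt(length(x))    # standard error of the mean: uncertainty of the estimate
cv_x    <- sd_x / mean(x)            # coefficient of variation: SD relative to the mean

print(c(sd = sd_x, se_mean = se_mean, cv = cv_x))
```

The value needed for the second input column is the squared standard error of the estimated log fold change, not any of the per-sample summaries above.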

Tushar-87 commented 4 years ago

I have a similar question. I have the p-value, n for two independent samples (each larger than 100), and the ratio of means (log fold change). The test used for significance was the Wilcoxon rank-sum test. How can I calculate the standard deviation for each sample?

YingMa0107 commented 4 years ago

Hi @Tushar-87, you can still convert the p-value to a z-score first and then estimate the standard error from the logFC and the z-score, since for large samples the test statistic is approximately normally distributed:

    zscore   <- qnorm(pvalue/2.0, lower.tail=FALSE)
    se       <- abs(logfc/zscore)
    beta_var <- se^2   # this and beta (the logFC) are the required input (updated to make this clearer)
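As a rough illustration for the Wilcoxon case, the sketch below simulates one gene in two groups and applies the same conversion. Everything in it is an assumption made for the example: the data are simulated, the group sizes and means are arbitrary, and the logFC is taken as the difference of group means on a log scale. The directly computed SE of the mean difference is printed only as a sanity check, since the two quantities need not agree closely.

```r
set.seed(1)

# Simulated log-scale expression of a single gene in two groups (illustrative only).
group_a <- rnorm(150, mean = 5.0, sd = 1)
group_b <- rnorm(150, mean = 5.5, sd = 1)

# Two-sided Wilcoxon rank-sum test; with samples this large R uses a normal approximation.
pvalue <- wilcox.test(group_a, group_b)$p.value

# Assumed effect size: difference of means on the log scale.
logfc <- mean(group_b) - mean(group_a)

# Same conversion as above: p-value -> z-score -> approximate SE -> variance.
zscore   <- qnorm(pvalue / 2, lower.tail = FALSE)
se       <- abs(logfc / zscore)
beta_var <- se^2

# Sanity check against the usual SE of a difference in means.
se_direct <- sqrt(var(group_a) / length(group_a) + var(group_b) / length(group_b))
print(c(se_from_pvalue = se, se_direct = se_direct))
```

If the raw per-sample values are available, as they are in this simulation, computing the SE directly is generally preferable. The p-value route is a fallback for when only summary statistics are at hand.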