pachterlab / sleuth

Differential analysis of RNA-Seq
http://pachterlab.github.io/sleuth
GNU General Public License v3.0
305 stars 95 forks source link

volcano plot with sleuth results #199

Closed suyeonwy closed 6 years ago

suyeonwy commented 6 years ago

hi! I'm doing rna-seq analysis with kallisto&sleuth and trying to make volcano plot with ggplot2

I heard that LRT(likelihood ratio test) is considered a better test than the WT(wald test), but to get beta values for volcano plot I did WT also and just merged beta values with LRT results. I took the R script from following link (https://gist.github.com/jaquol/03f41f57dc6b0eacef101e9920f24d78)

Do you guys think that it is okay to use q-values from LRT and beta values from WT when making volcano plot? I'm not sure whether drawing plot with values from different tests is appropriate.

If this method is not appropriate, could you tell me how to make volcano plots with sleuth results without using sleuth_live()?

Here is my Rscript. ` library(sleuth); library(biomaRt); library(ggplot2);

sample_name <- "liver"

mart <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL", dataset = "sscrofa_gene_ensembl", host = 'ensembl.org') t2g <- biomaRt::getBM(attributes = c("ensembl_transcript_id", "ensembl_gene_id", "external_gene_name"), mart = mart) t2g <- dplyr::rename(t2g, target_id = ensembl_transcript_id, ens_gene = ensembl_gene_id, ext_gene = external_gene_name)

metadata <- read.table("dir_path", header = TRUE) metadata$path <- as.character(metadata$path);

so <- sleuth_prep(metadata, extra_bootstrap_summary = TRUE, num_cores = 1, target_mapping = t2g, ~condition) so <- sleuth_fit(so) so <- sleuth_fit(so, ~1, 'reduced') so <- sleuth_lrt(so, 'reduced', 'full') so <- sleuth_wt(so, which_beta = 'conditionwild')

res_lrt <- sleuth_results(so, 'reduced:full', test_type = 'lrt') res_wt <- sleuth_results(so, 'conditionwild') res <- merge(res_lrt, res_wt[, c('target_id', 'b','se_b','mean_obs')], on='target_id', sort = FALSE) res_sig <- dplyr::filter(res, qval <= 0.01); sig <- ifelse(res$qval < 0.01 & abs(res$b) > 5, "Significant", "Not Significant")

pdf(paste0(sample_name,".pdf")) plot <- ggplot(res, aes(b, -log10(qval))) + geom_point(aes(col=sig), na.rm = TRUE) +

scale_color_manual(values = c("light grey","red")) +

ggtitle(paste0("volcano plot ",sample_name)) + theme_bw() + scale_color_brewer(palette = "Set2") + geom_hline(yintercept = -log10(0.01), color = "#990000", linetype = "dashed", size = 0.1) + geom_vline(xintercept = 5, color = "#990000", linetype = "dashed", size = 0.1) + geom_vline(xintercept = -5, color = "#990000", linetype = "dashed", size = 0.1) + xlab("Beta") + ylab("-log10(qvalue)"); plot dev.off() `

warrenmcg commented 6 years ago

Hi @suyeonwy,

I don't necessarily see a problem with mixing the p-values from LRT with the beta-values from the Wald test, as long as you clearly label what you're doing either in the axis labels or in the figure legend.

With regard to doing the volcano plot yourself, if you want a default volcano plot, then I would recommend you use sleuth::plot_volcano. Check out ?sleuth::plot_volcano for details on how to use it.

If you want to use the mixed results, then what you have in your Rscript seems to be sufficient, except I personally would change the x-axis and y-axis labels to specify Wald and LRT, respectively.

Since this doesn't discuss an issue, I'm going to close it. If people want to have a conversation about the utility of the volcano plot with mixed results (LRT and Wald), I would invite them to hop over to the google group. @suyeonwy, if you are having difficulty generating the right kind of plot or getting an error, feel free to reopen this issue.