pachterlab / sleuth

Differential analysis of RNA-Seq
http://pachterlab.github.io/sleuth
GNU General Public License v3.0

Getting Zero DEGs (but there are definitely some) #178

Open seyfim opened 6 years ago

seyfim commented 6 years ago

Hello @pimentel @roryk @ttriche @jdidion @vals, I used Kallisto and am now running Sleuth. Unfortunately, I am getting zero differentially expressed genes (DEGs). This output cannot be correct, as this exact analysis was previously done (by an outside company) and they found 2,235 DEGs. The company used the exact same pipeline that I have implemented. Here is my code:

    s2c <- data.frame(sample, group)
    s2c <- dplyr::mutate(s2c, path = kal_dirs)
    so <- sleuth_prep(s2c, ~group, target_mapping = t2g,
                      extra_bootstrap_summary = TRUE, num_cores = 20)
    so <- sleuth_fit(so)
    so <- sleuth_fit(so, ~1, 'reduced')
    so <- sleuth_lrt(so, 'reduced', 'full')
    sleuth_table <- sleuth_results(so, 'reduced:full', 'lrt', show_all = FALSE)
    sleuth_significant <- dplyr::filter(sleuth_table, qval <= 0.05)
    head(sleuth_significant)

Here is the output from the last command:

    head(sleuth_significant)
    [1] target_id       pval            qval            test_stat
    [5] rss             degrees_free    mean_obs        var_obs
    [9] tech_var        sigma_sq        smooth_sigma_sq final_sigma_sq
    <0 rows> (or 0-length row.names)

I am running R 3.5.0. Here is my session info:

    sessionInfo()
    R version 3.5.0 (2018-04-23)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: CentOS release 6.5 (Final)

    Matrix products: default
    BLAS/LAPACK: /cm/shared/apps/OpenBLAS/current/lib/libopenblas_sandybridgep-r0.2.14.so

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
     [1] bindrcpp_0.2.2     biomaRt_2.36.1     sleuth_0.29.0      httr_1.3.1
     [5] gtools_3.5.0       data.table_1.11.4  matrixStats_0.53.1 pheatmap_1.0.10
     [9] dplyr_0.7.5        RColorBrewer_1.1-2 viridis_0.5.1      viridisLite_0.3.0
    [13] ggplot2_2.2.1      reshape2_1.4.3     tidyr_0.8.1

    loaded via a namespace (and not attached):
     [1] Rcpp_0.12.17         pillar_1.2.3         compiler_3.5.0
     [4] plyr_1.8.4           bindr_0.1.1          prettyunits_1.0.2
     [7] progress_1.1.2       bitops_1.0-6         tools_3.5.0
    [10] digest_0.6.15        bit_1.1-14           memoise_1.1.0
    [13] RSQLite_2.1.1        tibble_1.4.2         gtable_0.2.0
    [16] rhdf5_2.24.0         pkgconfig_2.0.1      rlang_0.2.1
    [19] DBI_1.0.0            curl_3.2             parallel_3.5.0
    [22] gridExtra_2.3        stringr_1.3.1        IRanges_2.14.10
    [25] S4Vectors_0.18.2     bit64_0.9-7          stats4_3.5.0
    [28] grid_3.5.0           tidyselect_0.2.4     Biobase_2.40.0
    [31] glue_1.2.0           R6_2.2.2             AnnotationDbi_1.42.1
    [34] XML_3.98-1.11        blob_1.1.1           purrr_0.2.5
    [37] Rhdf5lib_1.2.1       magrittr_1.5         BiocGenerics_0.26.0
    [40] scales_0.5.0         assertthat_0.2.0     colorspace_1.3-2
    [43] stringi_1.2.2        RCurl_1.95-4.10      lazyeval_0.2.1
    [46] munsell_0.4.3

Any help would be greatly appreciated!!!!

Thanks so much,
Marilyn E. Seyfi
Bioinformatics Technologist
Genomic Medicine Institute
Cleveland Clinic-Lerner Research Institute
NE5-255
9620 Carnegie Ave
Cleveland, OH 44106
work email: seyfim@ccf.org
http://www.lerner.ccf.org/gmi/people/

warrenmcg commented 6 years ago

Hi @seyfim,

Just to clarify: did the outside company also use kallisto+sleuth, or did they use a different pipeline for the quantification and modeling? If they used kallisto+sleuth, then we need to figure out what differs between their version of the pipeline and yours. If they used something else, what did they use? Did they do any additional QC or adjusting/normalizing that you did not include? Are there other factors in your experiment that could confound the results and weren't included in the model?
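As a rough illustration of that last question, here is a minimal base-R sketch of how one might check whether an extra factor could be added to the model alongside `group`. The `batch` column and every value in this sample sheet are hypothetical (nothing in this thread says such a factor exists); only the `sample` and `group` column names come from the code posted above.

```r
# Hypothetical sample sheet: `batch` and all values here are made up for illustration.
s2c_example <- data.frame(
  sample = paste0("s", 1:6),
  group  = rep(c("control", "treated"), each = 3),
  batch  = c("A", "A", "B", "A", "B", "B"),
  stringsAsFactors = TRUE
)

# Cross-tabulate the two factors. If each batch contained only one group,
# the batch and group effects could not be separated.
print(table(s2c_example$group, s2c_example$batch))

# Design matrices for a full model (batch + group) and a reduced model (batch only).
full_design    <- model.matrix(~ batch + group, data = s2c_example)
reduced_design <- model.matrix(~ batch, data = s2c_example)

# The full model is estimable when its design matrix has full column rank.
full_is_estimable <- qr(full_design)$rank == ncol(full_design)
print(full_is_estimable)
```

If the design is estimable, the same formulas could in principle replace `~group` and `~1` in the code above: `~batch + group` as the full model in `sleuth_prep`, and `~batch` as the `'reduced'` model in `sleuth_fit`, so that the likelihood ratio test assesses `group` after accounting for `batch`.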

warrenmcg commented 5 years ago

pinging @seyfim, can you clarify whether you figured out what was different between your analysis and the outside company's analysis?