LieberInstitute / SPEAQeasy

SPEAQeasy: portable LIBD RNA-seq pipeline using Nextflow. See http://research.libd.org/SPEAQeasy-example/ for an example of how to use this pipeline and analyze the resulting output files.
http://lieberinstitute.github.io/SPEAQeasy
MIT License

SPEAQeasy: a Scalable Pipeline for Expression Analysis and Quantification that is easy to install and share

Summary

SPEAQeasy is a Scalable RNA-seq Pipeline for Expression Analysis and Quantification based on the RNAseq-pipeline. Built on Nextflow, and able to use Docker containers and common resource managers (e.g. SLURM), this port of the RNAseq-pipeline can run in different computing environments. It is described in the manuscript listed under "Cite SPEAQeasy" (Eagles et al., 2021, BMC Bioinformatics).

The main function of this pipeline is to produce files comparable to those used in recount2, a resource that provides gene, exon, exon-exon junction, and base-pair level data.

This pipeline allows researchers to contribute data to the recount2 project even from outside the JHPCE.

Workflow overview

[Figure: General Workflow]

SPEAQeasy takes raw RNA-seq reads and produces analysis-ready R objects, providing a "bridge to the Bioconductor universe", where researchers can utilize the powerful existing set of tools to quickly perform desired analyses.

Beginning with a set of FASTQ files (optionally gzipped), SPEAQeasy ultimately produces RangedSummarizedExperiment objects to store gene, exon, and exon-exon junction counts for an experiment. Optionally, expressed regions data is generated, enabling easy computation of differentially expressed regions (DERs).
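As a rough illustration of the starting point described above, the following Python sketch groups a set of FASTQ files (optionally gzipped) by sample before they are handed to a pipeline. The file naming convention (`<sample>_1` / `<sample>_2` for paired-end mates) and the helper name `group_fastq` are assumptions made for this example only; they are not SPEAQeasy's own input format, which is described in the documentation website.

```python
import re
from collections import defaultdict
from pathlib import PurePath

# Assumed (hypothetical) naming convention, not taken from SPEAQeasy:
#   paired-end: <sample>_1.fastq[.gz] and <sample>_2.fastq[.gz]
#   single-end: <sample>.fastq[.gz]   (".fq" is accepted as well)
FASTQ_PATTERN = re.compile(
    r"^(?P<sample>.+?)(?:_(?P<mate>[12]))?\.(?:fastq|fq)(?:\.gz)?$"
)


def group_fastq(paths):
    """Group FASTQ paths by sample ID, with mate 1 listed before mate 2."""
    samples = defaultdict(list)
    for path in paths:
        match = FASTQ_PATTERN.match(PurePath(path).name)
        if match is None:
            continue  # skip files that do not follow the assumed convention
        mate = match.group("mate") or ""
        samples[match.group("sample")].append((mate, str(path)))
    # Sort samples by ID and files by mate number for a stable result.
    return {
        sample: [file_path for _, file_path in sorted(files)]
        for sample, files in sorted(samples.items())
    }


if __name__ == "__main__":
    example = ["data/s1_2.fastq.gz", "data/s1_1.fastq.gz", "data/s2.fq"]
    for sample_id, files in group_fastq(example).items():
        kind = "paired-end" if len(files) == 2 else "single-end"
        print(sample_id, kind, files)
```

Working on file names alone keeps the sketch independent of any particular file system layout; a real run would still need the sample description file that the pipeline expects.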

Our vignette demonstrates how genotype calls by SPEAQeasy can be coupled with user-provided genotype and phenotype data to easily resolve identity issues that arise during sequencing. We then walk through an example differential expression analysis and explore data visualization options.

Pipeline features

Getting started

The SPEAQeasy documentation website describes the pipeline in full detail. For a brief introduction, see the quick start guide.

Because SPEAQeasy is based on the Nextflow workflow manager, it supports execution on computing clusters managed by SLURM or SGE without any configuration (local execution is also possible). Users with access to Docker can use Docker containers to manage SPEAQeasy software dependencies. For users without Docker, or even without root privileges, we provide a script that installs the dependencies.

Authors

Original Pipeline

Emily Burke, Leonardo Collado-Torres, Andrew Jaffe, BaDoi Phan

Nextflow Port

Nick Eagles, Brianna Barry, Jacob Leonard, Israel Aguilar, Violeta Larios, Everardo Gutierrez

Cite SPEAQeasy

We hope that SPEAQeasy will be useful for your research. Please use the following BibTeX entry to cite the software and overall approach. Thank you!

@article {Eagles2021,
    author = {Eagles, Nicholas J. and Burke, Emily E. and Leonard, Jacob and Barry, Brianna K. and Stolz, Joshua M. and Huuki, Louise and Phan, BaDoi N. and Larios Serrato, Violeta and Guti{\'e}rrez-Mill{\'a}n, Everardo and Aguilar-Ordo{\~n}ez, Israel and Jaffe, Andrew E. and Collado-Torres, Leonardo},
    title = {SPEAQeasy: a scalable pipeline for expression analysis and quantification for R/bioconductor-powered RNA-seq analyses},
    year = {2021},
    doi = {10.1186/s12859-021-04142-3},
    publisher = {Springer Science and Business Media LLC},
    URL = {https://doi.org/10.1186/s12859-021-04142-3},
    journal = {BMC Bioinformatics}
}

Contact