XPRESSyourself / XPRESSpipe

An alignment and analysis pipeline for Ribosome Profiling and RNA-seq data
https://xpresspipe.readthedocs.io/en/latest/
GNU General Public License v3.0

XPRESSpipe

An alignment and analysis pipeline for RNAseq data



Please refer to the documentation for more in-depth details.

Citation:

Berg JA, et al. (2020). XPRESSyourself: Enhancing, standardizing, and
automating ribosome profiling computational analyses yields improved insight
into data. PLoS Comp Biol. doi: https://doi.org/10.1371/journal.pcbi.1007625

Installation:

Installing from source

The following is a short tutorial showing how to install XPRESSpipe:
[asciicast recording]

NOTE: Previous versions were installed with the pip install . command. Users of v0.6.3 or later should instead run bash install.sh

QuickStart:

Important Notes:

Basic Starting Input

Naming Conventions

To get ordered output after alignment (except when generating a raw counts table), follow the recommended file naming conventions.

  1. Download your raw sequence data and place it in a folder. This folder should contain all of the sequence data and nothing else.
  2. Make sure the files follow a consistent naming pattern. For example, if you had 3 genetic backgrounds of ribosome profiling data, the naming scheme would be as follows:
    ExperimentName_BackgroundA_FP.fastq(.gz)
    ExperimentName_BackgroundA_RNA.fastq(.gz)
    ExperimentName_BackgroundB_FP.fastq(.gz)
    ExperimentName_BackgroundB_RNA.fastq(.gz)
    ExperimentName_BackgroundC_FP.fastq(.gz)
    ExperimentName_BackgroundC_RNA.fastq(.gz)
  3. If samples are replicates, indicate the replicate number in the file name.
  4. If you want the final count table to be in a particular order and that order is not alphabetical, prefix the sample name with a letter to force this ordering.
    ExperimentName_a_WT.fastq(.gz)
    ExperimentName_a_WT.fastq(.gz)
    ExperimentName_b_exType.fastq(.gz)
    ExperimentName_b_exType.fastq(.gz)
  5. If you have replicates:
    ExperimentName_a_WT_1.fastq(.gz)
    ExperimentName_a_WT_1.fastq(.gz)
    ExperimentName_a_WT_2.fastq(.gz)
    ExperimentName_a_WT_2.fastq(.gz)
    ExperimentName_b_exType_1.fastq(.gz)
    ExperimentName_b_exType_1.fastq(.gz)
    ExperimentName_b_exType_2.fastq(.gz)
    ExperimentName_b_exType_2.fastq(.gz)
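
Before running the pipeline, it can help to preview how a folder of files would sort under the conventions above. The following is a minimal sketch and is not part of XPRESSpipe. The helper names (sample_name, preview_order, check_folder) are hypothetical, and the sketch assumes that only .fastq and .fastq.gz count as sequence files. It uses Python's default string sort as a stand-in for alphabetical order, which may differ from the ordering XPRESSpipe applies internally.

```python
from pathlib import Path

# Assumption for this sketch: only these suffixes count as sequence files.
SEQUENCE_SUFFIXES = (".fastq", ".fastq.gz")


def sample_name(file_name):
    """Return the name without its .fastq or .fastq.gz suffix, or None if neither matches."""
    for suffix in SEQUENCE_SUFFIXES:
        if file_name.endswith(suffix):
            return file_name[: -len(suffix)]
    return None


def preview_order(file_names):
    """Return (sorted sample names, sorted names of files that are not sequence data)."""
    samples, others = [], []
    for name in file_names:
        stem = sample_name(name)
        if stem is None:
            others.append(name)  # violates "all the sequence data and nothing else"
        else:
            samples.append(stem)
    return sorted(samples), sorted(others)


def check_folder(folder):
    """Apply preview_order to the regular files inside a folder."""
    return preview_order(p.name for p in Path(folder).iterdir() if p.is_file())


if __name__ == "__main__":
    names = [
        "ExperimentName_b_exType_2.fastq.gz",
        "ExperimentName_a_WT_2.fastq.gz",
        "ExperimentName_b_exType_1.fastq.gz",
        "ExperimentName_a_WT_1.fastq.gz",
        "notes.txt",
    ]
    ordered, unexpected = preview_order(names)
    print("Sample order:", ordered)
    print("Unexpected files:", unexpected)
```

With the letter prefixes in place, the WT samples sort ahead of the exType samples, and any stray file such as notes.txt is reported so it can be moved out of the input folder.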

Running a test dataset:

Updates

Information on updates to the software can be found here.