veg / flea-pipeline

A pipeline for long-read sequencing data.

Overview

FLEA is a bioinformatics pipeline for analyzing longitudinal sequencing data from the Pacific Biosciences RS-II or Sequel. It currently supports full-length HIV env sequences.

The pipeline takes a set of FASTQ files, one per time point, containing circular consensus sequence (CCS) reads, which can be obtained using the "Reads of Insert" protocol on PacBio’s SMRTportal or SMRTanalysis tools. It produces a JSON file containing the following results:

The pipeline logic is implemented in Nextflow. A full description of the pipeline has been submitted for publication. A link to the journal article will be added here when it is available.

See also:

Setup

Dependencies

Install Python scripts

FLEA comes with a flea Python package containing scripts used throughout the pipeline. To install requirements and the scripts themselves (virtualenv recommended):

pip install -r requirements.txt
pip install -r requirements2.txt
python setup.py install

To test:

python setup.py nosetests

Configuration

The default config file is nextflow.config. We recommend creating a separate config file that overrides only the options you need to customize. For more information on Nextflow-specific configuration, see the Nextflow documentation.

At the very least, params.reference_dir and the parameters that depend on it need to point to the various reference files used by the pipeline:

Usage

Write a control file listing each FASTQ file with its visit code and date, separated by spaces, one entry per line:

<file> <visit code> <date>
<file> <visit code> <date>
....

Dates must be in 'YYYYMMDD' format.
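
Before launching a long run, it can help to check the control file for malformed lines. The following is a minimal sketch, not part of FLEA itself: the names Visit and parse_control_file are hypothetical, and it assumes that blank lines may be skipped and that file paths contain no spaces (since fields are space-separated).

```python
from datetime import datetime
from typing import List, NamedTuple


class Visit(NamedTuple):
    """One control-file entry (hypothetical helper type, not from FLEA)."""
    fastq: str
    visit_code: str
    date: str  # kept as the original 'YYYYMMDD' string


def parse_control_file(text: str) -> List[Visit]:
    """Parse control-file text and validate every non-blank line."""
    visits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(
                f"line {lineno}: expected 3 fields, got {len(fields)}"
            )
        fastq, visit_code, date = fields
        # Require exactly eight digits first, because strptime alone
        # would also accept shorter strings such as '2015011'.
        if len(date) != 8 or not date.isdigit():
            raise ValueError(f"line {lineno}: date must be YYYYMMDD: {date!r}")
        try:
            # Rejects impossible calendar dates such as 20150230.
            datetime.strptime(date, "%Y%m%d")
        except ValueError:
            raise ValueError(f"line {lineno}: invalid date: {date!r}") from None
        visits.append(Visit(fastq, visit_code, date))
    return visits
```

For example, calling parse_control_file(open("path/to/metadata").read()) returns one Visit per entry, or raises ValueError naming the first offending line.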

Run the pipeline with Nextflow:

nextflow run path/to/flea.nf -c path/to/custom/config/file \
  --infile path/to/metadata \
  --results_dir path/to/results

The results directory will contain output from many pipeline steps. The two files that contain the final results are:

These files can be served with flea-server and visualized with flea-web-app.