genepi / umi-pipeline-nf

Nextflow pipeline to analyze ONT UMI Sequencing data
Mozilla Public License 2.0

Umi-pipeline-nf

Umi-pipeline-nf creates highly accurate single-molecule consensus sequences for unique molecular identifier (UMI)-tagged amplicons from nanopore sequencing data.
The pipeline can be run on the whole fastq_pass folder of your nanopore run and, by default, outputs the aligned consensus sequences of each UMI cluster in a BAM file. The optional variant calling step creates a VCF file of all variants found in the consensus sequences. Umi-pipeline-nf originates from a Snakemake-based analysis pipeline (pipeline-umi-amplicon; originally developed by Karst et al., Nat Methods 18:165–169, 2021). We migrated the pipeline to Nextflow and included several optimizations and additional functionalities.

Workflow

  1. Input FASTQ files are merged and filtered.
  2. Reads are aligned against a reference genome and filtered to keep only full-length on-target reads.
  3. The flanking UMI sequences of all reads are extracted.
  4. The extracted UMIs are used to cluster the reads.
  5. Per cluster, highly accurate consensus sequences are created.
  6. The consensus sequences are aligned against the reference sequence.
  7. An optional variant calling step can be performed.
  8. UMI extraction, clustering, consensus sequence creation, and mapping are repeated.
  9. An optional variant calling step can be performed.
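
The core idea behind steps 3 to 5 (extract UMIs, cluster reads by UMI, build one consensus per cluster) can be illustrated with a toy Python sketch. This is not the pipeline's implementation: the fixed UMI length, the exact-match clustering, the per-position majority vote, and all function names (`extract_umi`, `cluster_by_umi`, `consensus`, `call_consensus`) are simplifying assumptions made for illustration. Real nanopore reads contain indels and UMI errors, so an actual workflow needs error-tolerant clustering and alignment-based consensus polishing.

```python
from collections import Counter, defaultdict

UMI_LEN = 4  # hypothetical UMI length, chosen only to keep the example short


def extract_umi(read, umi_len=UMI_LEN):
    """Split a read laid out as UMI + insert + UMI into (combined UMI, insert)."""
    umi = read[:umi_len] + read[-umi_len:]
    insert = read[umi_len:-umi_len]
    return umi, insert


def cluster_by_umi(reads):
    """Group inserts by exact match of the combined flanking UMI (toy assumption)."""
    clusters = defaultdict(list)
    for read in reads:
        umi, insert = extract_umi(read)
        clusters[umi].append(insert)
    return dict(clusters)


def consensus(inserts):
    """Majority vote per position; assumes all inserts have equal length (no indels)."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*inserts))


def call_consensus(reads, min_cluster_size=3):
    """Return one consensus per UMI cluster that has enough supporting reads."""
    return {
        umi: consensus(inserts)
        for umi, inserts in cluster_by_umi(reads).items()
        if len(inserts) >= min_cluster_size
    }


if __name__ == "__main__":
    reads = [
        "AAAA" + "ACGTACGT" + "CCCC",
        "AAAA" + "ACGTACGA" + "CCCC",  # one sequencing error in the last base
        "AAAA" + "ACGTACGT" + "CCCC",
        "GGGG" + "TTTTACGT" + "TTTT",  # singleton cluster, dropped by the size filter
    ]
    print(call_consensus(reads))
```

In this sketch the single-base error in the second read is outvoted by the other two reads of the same cluster, which is the reason per-molecule consensus sequences are more accurate than individual reads. The `min_cluster_size` threshold is likewise an illustrative value, not a pipeline default.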

See the output documentation for a detailed overview of the pipeline and its output files.

Main Adaptations

See the usage documentation for all of the available parameters of the pipeline.

Quick Start

  1. Install Nextflow.

  2. Download the pipeline and test it on a minimal dataset with a single command.

nextflow run genepi/umi-pipeline-nf -r v0.2.1 -profile test,docker
  3. Start running your own analysis!
    3.1 Download and adapt the config/custom.config with paths to your data (relative and absolute paths possible).
nextflow run genepi/umi-pipeline-nf -r v0.2.1 -c <custom.config> -profile custom,<docker,singularity> 

Citation

If you use the pipeline, please cite our paper:

Amstler, S., Streiter, G., Pfurtscheller, C. et al. Nanopore sequencing with unique molecular identifiers enables accurate mutation analysis and haplotyping in the complex lipoprotein(a) KIV-2 VNTR. Genome Med 16, 117 (2024). https://doi.org/10.1186/s13073-024-01391-8

Credits

The pipeline was written by @StephanAmstler.
Nextflow template pipeline: EcSeq.
Snakemake-based ONT pipeline for UMI nanopore sequencing analysis: nanoporetech/pipeline-umi-amplicon.
UMI-corrected nanopore sequencing analysis first shown by: SorenKarst/longread_umi.