utia-gc / rnaseq

A Nextflow pipeline for RNA-seq analysis based on utia-gc/ngs.
https://utia-gc.github.io/rnaseq/
MIT License
utia-gc/rnaseq

nf-test | Lifecycle: Experimental | run with singularity

Full documentation on GitHub Pages: https://utia-gc.github.io/rnaseq/

Introduction

utia-gc/rnaseq is a Nextflow pipeline for RNA-seq analysis, built on utia-gc/ngs for base NGS analysis. Although utia-gc/rnaseq can be run on any platform supported by Nextflow, it is developed for use in HPC environments, specifically ISAAC Next Generation at the University of Tennessee, Knoxville.

Pipeline overview

flowchart LR
    %% list all the input files
    samplesheet>"Samplesheet"]
    adapter_fasta>"
        Adapter
        FASTA
    "]
    genome_fasta>"
        Genome
        FASTA
    "]
    annotations_gtf>"
        Annotations
        GTF
    "]

    %% list all the internal Nextflow channels
    raw_reads[("
        Raw
        reads
    ")]
    prealign_reads[("
        Prealign
        reads
    ")]
    trim_log[("
        Trim
        log
    ")]
    individual_alignments[("
        Individual
        alignments
    ")]
    merged_alignments[("
        Merged
        alignments
    ")]

    %% list all the Nextflow processes
    fastp{"fastp"}
    cutadapt{"cutadapt"}
    fastqc{"FastQC"}
    seq_depth{"
        Sequencing
        Depth
    "}
    bwa_mem2{"bwa-mem2"}
    STAR{"STAR"}
    samtools_sort{"
        samtools
        sort
        index
    "}
    gatk_MergeSamFiles{"
        gatk
        MergeSamFiles
    "}
    gatk_MarkDuplicates{"
        gatk
        MarkDuplicates
    "}
    samtools_idxstats{"
        samtools
        idxstats
    "}
    samtools_flagstat{"
        samtools
        flagstat
    "}
    samtools_stats{"
        samtools
        stats
    "}

    %% list all subgraphs for Nextflow subworkflows/workflows with options
    subgraph inputs["Input Files"]
    samplesheet
    adapter_fasta
    genome_fasta
    annotations_gtf
    end
    subgraph trim_reads["Trim Reads"]
    fastp
    cutadapt
    end
    subgraph map_reads["Map Reads"]
    bwa_mem2
    STAR
    end
    subgraph publish_reports["Publish Reports"]
    reads_mqc
    alignments_mqc
    full_mqc
    end
    subgraph publish_data["Publish Data"]
    alignments
    end

    %% list all the published reports files
    reads_mqc((("
        Reads
        MultiQC
    ")))
    alignments_mqc((("
        Alignments
        MultiQC
    ")))
    full_mqc((("
        Full MultiQC
    ")))

    %% list all the published data files
    alignments[["
        Alignments
    "]]

    %% reads processing workflow
    samplesheet --> raw_reads
    adapter_fasta --- fastp
    raw_reads --- trim_reads --> prealign_reads

    %% reads QC workflow
    raw_reads --- fastqc --x reads_mqc
    prealign_reads --- fastqc --x reads_mqc
    trim_reads --> trim_log --x reads_mqc
    raw_reads --- seq_depth --x reads_mqc
    prealign_reads --- seq_depth --x reads_mqc

    %% reads mapping workflow
    genome_fasta --- map_reads
    annotations_gtf --- map_reads
    prealign_reads --- map_reads

    %% alignments processing workflow
    map_reads --- samtools_sort --> individual_alignments
    individual_alignments --- gatk_MergeSamFiles --- gatk_MarkDuplicates --> merged_alignments
    merged_alignments --x alignments

    %% alignments QC workflow
    individual_alignments --- samtools_idxstats --x alignments_mqc
    individual_alignments --- samtools_flagstat --x alignments_mqc
    merged_alignments --- samtools_stats --x alignments_mqc

    %% Full MultiQC
    reads_mqc --x full_mqc
    alignments_mqc --x full_mqc

Quick start

Prerequisites

  1. Any POSIX-compatible system (e.g. Linux or macOS) with internet access

  2. Nextflow version >= 21.04

  3. Singularity
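The prerequisites above can be checked from a terminal before pulling the pipeline. The following is a minimal sketch, not part of utia-gc/rnaseq: the helper names `version_ge` and `check_prereqs` are hypothetical, it assumes a `sort` that supports `-V` (GNU coreutils and recent macOS), and it assumes `nextflow -v` prints a line containing a dotted version number.

```shell
#!/bin/sh
# Minimum Nextflow version, taken from the prerequisites list.
MIN_NEXTFLOW="21.04"

# version_ge A B: succeed when version A is greater than or equal to version B.
# Assumption: sort(1) supports version sorting with -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_prereqs: report whether nextflow and singularity are usable.
check_prereqs() {
    status=0
    if command -v nextflow >/dev/null 2>&1; then
        # Assumption: the output contains a version such as 23.10.1.
        found=$(nextflow -v | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
        if version_ge "$found" "$MIN_NEXTFLOW"; then
            echo "nextflow $found: OK"
        else
            echo "nextflow $found: older than $MIN_NEXTFLOW"
            status=1
        fi
    else
        echo "nextflow: not found on PATH"
        status=1
    fi
    if command -v singularity >/dev/null 2>&1; then
        echo "singularity: $(singularity --version)"
    else
        echo "singularity: not found on PATH"
        status=1
    fi
    return "$status"
}

check_prereqs || echo "One or more prerequisites are missing."
```

On an HPC system, Nextflow and Singularity may need to be loaded through the site's module system before this check will find them.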

Get or update utia-gc/rnaseq

  1. Download or update utia-gc/rnaseq:

    nextflow pull utia-gc/rnaseq

  2. Show project info:

    nextflow info utia-gc/rnaseq
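To see whether `nextflow pull` will download the pipeline for the first time or update an existing copy, one can look in the local Nextflow assets directory. This is a rough sketch under the assumption that Nextflow's default layout is in use: pulled projects live under `$NXF_ASSETS`, which defaults to `$NXF_HOME/assets`, which in turn defaults to `$HOME/.nextflow/assets`. The function name `pipeline_cached` is hypothetical.

```shell
#!/bin/sh
# Directory where Nextflow stores pulled pipelines.
# Assumption: default locations unless NXF_ASSETS or NXF_HOME is set.
ASSETS_DIR="${NXF_ASSETS:-${NXF_HOME:-$HOME/.nextflow}/assets}"

# pipeline_cached ORG/REPO: succeed when a local copy of the pipeline exists.
pipeline_cached() {
    [ -d "$ASSETS_DIR/$1" ]
}

if pipeline_cached utia-gc/rnaseq; then
    echo "utia-gc/rnaseq is cached locally; nextflow pull will update it."
else
    echo "utia-gc/rnaseq is not cached; nextflow pull will download it."
fi
```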

Test utia-gc/rnaseq

  1. Check that utia-gc/rnaseq works on your system:

    • -profile nf_test uses preconfigured test parameters to run utia-gc/rnaseq in full on a small test dataset stored in a remote GitHub repository.
    • Because the test files are stored in a remote repository, internet access is required to run the test.
    • For more information, see the profiles section of the Nextflow config file.

    nextflow run utia-gc/rnaseq \
      -revision v0.3.2 \
      -profile nf_test

Important: In accordance with best practices for reproducible analysis, always use the -revision option in nextflow run to specify a tagged and/or released version of the pipeline.
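One way to make the pinned revision hard to forget is to keep it in a small launch script that is versioned alongside the analysis. The sketch below is illustrative only: `run_pinned` and the `DRY_RUN` variable are hypothetical names, and the default tag `v0.3.2` is simply the one used in the test command above. By default it prints the command instead of running it.

```shell
#!/bin/sh
# Pipeline and revision to run; override REVISION in the environment to change the tag.
PIPELINE="utia-gc/rnaseq"
REVISION="${REVISION:-v0.3.2}"

# run_pinned [ARGS...]: build a nextflow run command that always includes -revision.
# With DRY_RUN=1 (the default here) the command is printed; with DRY_RUN=0 it is executed.
run_pinned() {
    set -- nextflow run "$PIPELINE" -revision "$REVISION" "$@"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_pinned -profile nf_test
# prints: nextflow run utia-gc/rnaseq -revision v0.3.2 -profile nf_test
```

Setting `DRY_RUN=0` in the environment runs the command for real, so the same script serves as both a record of the revision used and the launcher.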

Run utia-gc/rnaseq

TODO