leylabmpi / CoreGenomePrimers

Design clade-specific primers targeting a pan-genome core gene
MIT License

CoreGenomePrimers (CGP)


Summary

Pangenome-guided design of clade-specific primers

Overview of the method:

Pipeline overview

Rule graph:

DAG

Workflow

Install

Snakemake

The pipeline runs on Snakemake. We recommend installing it via conda or mamba.

Pipeline

git clone --recurse-submodules git@github.com:leylabmpi/CoreGenomePrimers.git

Databases

BLAST (nucl & prot)

Download the databases with update_blastdb.pl (shipped with BLAST+) or by any other method

BLAST (rRNA)

See the NCBI rRNA databases

You can also download the SSU and LSU databases from ftp:/ftp/ebio/projects/CoreGenomePrimers/

Gene names

You can download the gene names pkl file from ftp:/ftp/ebio/projects/CoreGenomePrimers/

Taxonomy

See taxonkit

You can also download the taxonomy files from ftp:/ftp/ebio/projects/CoreGenomePrimers/

Notes

Make sure to update the database file paths in config.yaml
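One way to catch a stale database path before launching the pipeline is to scan the config for absolute paths and check each one on disk. The sketch below is only an illustration and is not part of the pipeline: it uses a simple line-based scan rather than a YAML parser, it only looks at values that start with `/`, and the helper name `find_missing_paths` is hypothetical. A path counts as present if it exists or if it is a prefix of existing files, because BLAST databases are referenced by prefix rather than by a single file.

```python
import glob
import re

# Matches "key: /absolute/path" lines; quotes around the value are optional.
# Commented-out lines and relative paths are deliberately not matched.
PATH_LINE = re.compile(r"""^\s*([\w.-]+)\s*:\s*["']?(/[^"'#\s]+)""")


def find_missing_paths(config_text):
    """Return (key, path) pairs whose path matches nothing on disk.

    A path is treated as present if it exists or is a prefix of existing
    files (BLAST databases are referenced by prefix, not by one file).
    """
    missing = []
    for line in config_text.splitlines():
        match = PATH_LINE.match(line)
        if match is None:
            continue
        key, path = match.groups()
        prefix = path.rstrip("/") or "/"
        # glob.escape keeps characters such as "[" in the path literal.
        if not glob.glob(glob.escape(prefix) + "*"):
            missing.append((key, path))
    return missing
```

Reading config.yaml into a string and passing it to `find_missing_paths` returns an empty list when every absolute path in the file resolves to something on disk.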

Usage instructions

For general instructions on using Snakemake, see the Snakemake docs.

Input

Samples file

Set in the config.yaml file via the samples_file: parameter

See Pectobacterium.tsv for an example

A tab-delimited file with the following columns:
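Since the samples file is plain tab-delimited text, its header can be checked before starting a run. The following is a minimal sketch, not code from the pipeline: `read_samples` is a hypothetical helper, and the column names to check are supplied by the caller (none are assumed here; compare against Pectobacterium.tsv).

```python
import csv


def read_samples(path, required=()):
    """Read a tab-delimited samples file into a list of row dicts.

    `required` is an optional collection of column names the caller
    expects; a ValueError is raised if any are absent from the header.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        header = reader.fieldnames or []
        absent = [name for name in required if name not in header]
        if absent:
            raise ValueError(f"missing columns: {', '.join(absent)}")
        return list(reader)
```

Each returned row is a dict keyed by the header names, so a quick look at the first row confirms that the file was split on tabs rather than spaces.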

Key config params

Example config file: config.yaml

The default parameters in config.yaml are set for qPCR primer design without an internal oligo.

The following are the parameters you will most likely want to change for your own needs.

Output

Notes on terminology

Description

Column descriptions

$OUTDIR/core_clusters_info.tsv

$OUTDIR/primers_final_info.tsv

$OUTDIR/nontarget/*_blast*.tsv

$OUTDIR/amplicons.tsv
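To see at a glance which of these tables a run produced, the file patterns listed above can be globbed under the output directory. This is a rough sketch under stated assumptions: `count_lines` is a hypothetical helper, and it reports raw line counts because it is not assumed here that every table has a header line.

```python
import glob
import os

# File patterns taken from the list above, relative to the output directory.
OUTPUT_PATTERNS = [
    "core_clusters_info.tsv",
    "primers_final_info.tsv",
    os.path.join("nontarget", "*_blast*.tsv"),
    "amplicons.tsv",
]


def count_lines(outdir):
    """Map each output table found under `outdir` to its number of lines.

    Lines are counted as-is, so a header line (if present) is included.
    Tables that were not produced are simply absent from the result.
    """
    counts = {}
    for pattern in OUTPUT_PATTERNS:
        full_pattern = os.path.join(glob.escape(outdir), pattern)
        for path in sorted(glob.glob(full_pattern)):
            with open(path) as handle:
                counts[os.path.relpath(path, outdir)] = sum(1 for _ in handle)
    return counts
```

A table missing from the returned dict usually points to a rule that did not finish or that filtered out all candidates.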

Misc

Summarizing primer design/filtering logs

See ./utils/primer_log_summary.py