Krabbenhoft Lab genome annotation pipeline using BRAKER and GeMoMa

NOTE: WORK IN PROGRESS. This pipeline was built specifically for the Krabbenhoft Lab's servers and the University at Buffalo HPC cluster. We are revising it to work on any Linux system, and we plan to distribute it with a Docker image containing all dependencies. Please stay tuned for updates.

Authors: Dan MacGuigan*, Nate Backenstose, Christopher Osborne

*dmacguig@buffalo.edu

Annotation pipeline flowchart

[Figure: annotation pipeline flowchart]

Dependencies

Usage

First, clone this repository.

git clone https://github.com/KrabbenhoftLab/genome_annotation_pipeline.git

Next, rename the cloned repository from genome_annotation_pipeline to something informative. For example:

mv genome_annotation_pipeline MY_SPECIES_genome_annotation

This renamed directory is the ANNOTATION_DIR in your config file and will contain all of your data and results.
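The clone and rename steps above can also be combined, because git clone accepts an optional target directory. A minimal sketch: the helper name clone_pipeline is hypothetical and not part of the pipeline, and MY_SPECIES_genome_annotation is the same placeholder name used above.

```shell
# Repository URL taken from the clone command above.
REPO_URL="https://github.com/KrabbenhoftLab/genome_annotation_pipeline.git"

# clone_pipeline is a hypothetical helper, not part of the pipeline.
# Usage: clone_pipeline TARGET_DIR
clone_pipeline() {
    target_dir="$1"
    if [ -z "$target_dir" ]; then
        echo "usage: clone_pipeline TARGET_DIR" >&2
        return 1
    fi
    if [ -e "$target_dir" ]; then
        echo "error: $target_dir already exists" >&2
        return 1
    fi
    # "git clone <url> <dir>" clones straight into <dir>,
    # so no separate mv is needed afterwards.
    git clone "$REPO_URL" "$target_dir"
}
```

For example, clone_pipeline MY_SPECIES_genome_annotation creates the directory that you would then use as ANNOTATION_DIR.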

To see help options, run ./genome-annotation -h.

Before running the pipeline, be sure to set all of the variables in the config.txt file.

When starting a new genome annotation, your directory structure should look like this:

To run a step of the pipeline, pass its number with -s, for example ./genome-annotation -s 1 -c config.txt. Run the steps in order. The one exception is steps 5 and 6, which can run at the same time.
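The ordering rule can be sketched as a small wrapper. This is an illustration under several assumptions, not part of the pipeline: the function names run_step and run_pipeline are hypothetical, and the last step number is passed in as an argument. It also assumes each invocation blocks until its step finishes and returns a non-zero exit status on failure. That may not hold on an HPC cluster where a step only submits a job.

```shell
# Only ./genome-annotation, -s and -c come from the README;
# everything else here is a hypothetical wrapper.
PIPELINE="${PIPELINE:-./genome-annotation}"
CONFIG="${CONFIG:-config.txt}"

# Run one pipeline step and report which one is starting.
run_step() {
    echo "starting step $1"
    "$PIPELINE" -s "$1" -c "$CONFIG"
}

# Usage: run_pipeline LAST_STEP
# Runs steps 1..LAST_STEP in order, stopping at the first failure.
# Steps 5 and 6 are started together and both must finish before step 7.
run_pipeline() {
    last_step="$1"
    step=1
    while [ "$step" -le "$last_step" ]; do
        if [ "$step" -eq 5 ] && [ "$last_step" -ge 6 ]; then
            run_step 5 &
            pid5=$!
            run_step 6 &
            pid6=$!
            status=0
            wait "$pid5" || status=1
            wait "$pid6" || status=1
            [ "$status" -eq 0 ] || return 1
            step=7
        else
            run_step "$step" || return 1
            step=$((step + 1))
        fi
    done
}
```

Calling run_pipeline 4 would run steps 1 through 4 one after another. Running steps by hand, as described above, remains the safer choice while you are still checking the output of each step.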

Want to rerun part (or all) of the pipeline with different data or settings? Simply copy the ANNOTATION_DIR, rename it, delete old results, and edit the config.txt file (making sure to update the ANNOTATION_DIR variables). Then rerun the pipeline within the new directory. This is the best way to avoid accidentally overwriting your previous annotation files.
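The copy step can be sketched as follows. The helper name copy_annotation_dir is hypothetical. The sketch deliberately does not delete results or edit config.txt, because this README does not list which files are results or show the config format. It only refuses to overwrite an existing directory and prints the config.txt lines that still mention the old directory name, so you can update them by hand.

```shell
# copy_annotation_dir is a hypothetical helper, not part of the pipeline.
# Usage: copy_annotation_dir OLD_DIR NEW_DIR
copy_annotation_dir() {
    old_dir="${1%/}"   # strip a trailing slash, if any
    new_dir="${2%/}"
    if [ ! -d "$old_dir" ]; then
        echo "error: $old_dir is not a directory" >&2
        return 1
    fi
    if [ -e "$new_dir" ]; then
        echo "error: $new_dir already exists, refusing to overwrite" >&2
        return 1
    fi
    # Copy the whole annotation directory under its new name.
    cp -R "$old_dir" "$new_dir" || return 1
    # Show config lines that still refer to the old directory name.
    old_name="$(basename "$old_dir")"
    if [ -f "$new_dir/config.txt" ]; then
        echo "Lines in $new_dir/config.txt that still mention $old_name:"
        grep -n -F -- "$old_name" "$new_dir/config.txt" || echo "(none found)"
    fi
}
```

After the copy, delete the old results from the new directory and update the listed lines before rerunning any step.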