awesome-pangenomes
A list of software for pangenomics, aimed mainly at eukaryotic genomes. A section for microbial genomes has also been added; these tools may not scale to large genomes.
:rocket: indicates a popular repository
Important blog posts
- Untangling-graphical-pangenomics Excellent blog by Erik Garrison explaining the differences between rGFA and GFA formats and approaches - important and frequently overlooked
Toolkits
- gaftools Toolkit for GAF (Graph Alignment Format) sorting and manipulation.
- gfakluge Toolkit and C++ API for GFA manipulation
- gfatools Toolkit for GFA parsing and conversion
- gretl Statistics and analysis for GFA files, written in Rust
- odgi Fast toolkit based on odgi format :rocket:
- pgr-tk A PanGenomic Research Tool Kit. Note that its output is not a GFA file.
- vg Full-featured construction, mapping and SNP calling toolkit based on multiple formats. :rocket:
Pangenome construction
- Minigraph Fast method by Heng Li, produces reference GFA (rGFA) format (not GFA or odgi) :rocket:
- minigraph_cactus and docs Pangenome builder which prioritizes downstream compatibility. Produces GFA and odgi. :rocket:
- PGGB Pangenome Graph Builder, calculates SNPs as part of the pipeline. Produces GFA and odgi. :rocket:
- pangene Pangene constructs a pangenome gene graph from one protein set and many genomes and includes simple but effective visualization :rocket:
- Pantools v3+ Fully featured construction of pangenome graphs
- PSVCP Adds PAVs (presence/absence variants) to a linear genome to construct a pangenome.
- PHG Practical Haplotype Graph
- PATO R package for pangenome construction
- Chrom_mini_graph Generate and map reads onto a coloured minimizer pangenome graph
- GET_PANGENES Perl scripts used by the Ensembl Plants team for pangenomics
- impg Create an implicit pangenome graph for a homologous target region, then use output bed files to extract sequences for PGGB etc.
- MGRgraph An algorithm to Build a Multi-genome Reference (warning - last updated 2018)
- MEMO Constructs a pangenome and an index, and allows k-mer based conservation analyses and visualization
- poasta Fast, gap-affine sequence-to-graph and partial order aligner and MSA construction
Pangenome pipelines
- nf-core pangenome Paper A scalable Nextflow approach to building pangenomes with PGGB with visualization by odgi. :rocket:
- pangepop A Snakemake pipeline to create a pangenome with minigraph-cactus and align reads against it with vg giraffe
Annotating pangenomes
Short read alignment to a pangenome graph
- vg giraffe Faster and more modern alternative to vg map :rocket:
- vg map Original vg mapper (superseded by vg giraffe)
- Hisat2
- Minigraph Construct graphs or align short or long reads to graphs
- Chrom_mini_graph Generate and map reads onto a coloured minimizer pangenome graph
Long read alignment to a pangenome graph
- GraphAligner Fast long read graph aligner :rocket:
- Minigraph Construct graphs or align short or long reads to graphs
- GraphChainer Built on codebase of GraphAligner
- Spades Pathracer Align long reads to genomic graphs
- Minichain Align long reads to pangenomes in GFA or rGFA format
- PanAligner Align long reads to pangenomes
- poasta Fast, gap-affine sequence-to-graph and partial order aligner and MSA construction
SNP callers and genotypers
- vg call SNP caller for pangenomes, with gam or GAF output :rocket:
- vg surject Surject alignments onto a linear reference, then use a linear SNP caller such as Freebayes or DeepVariant :rocket:
- Paragraph A suite of graph-based genotyping tools for short read data
- Pangenie k-mer based SV genotyping using short reads. Intended for human genomes only (as of 2023).
- DeepVariant Case study of DeepVariant SNP calling on vg giraffe aligned BAM files
Structural Variation (SV) callers and genotypers
- vg call Call and genotype structural variants on a graph using long and short reads. :rocket:
- GraphTyper A graph SV genotyper (does not call SVs)
- Pangenie k-mer based SV genotyping using short reads. Intended for human genomes only (as of 2023).
- SVarp Use long reads to detect structural variants in a GFA format pangenome.
- bubblegun A tool for detecting bubbles and superbubbles
- PHI Pangenome-based Haplotype Inference preprint A genotyper using low-coverage short or long reads for haploid pangenomes. Requires a Gurobi license.
Pangenome viewers - interactive
- Bandage Visualize GFA files in an interactive standalone app :rocket:
- SeqTubemap Elegant path visualization for smaller regions of a pangenome from the vg team :rocket:
- MoMI-G Genome graph browser for SV visualization. Users can filter and visualize annotations and inspect SVs with read alignments over the genome graph. :rocket:
- pangene Pangene can visualize one protein set mapped to x genomes to check synteny and presence/absence of genes. :rocket:
- Panagram Plots k-mer conservation
- VAG Visualization of short sequence alignments in a pangenome
- Panache View linearized pangenomes
- Waragraph
- PanGraphViewer Desktop and web versions. Based on cytoscape.js. Can get to chromosome coordinates, allows VCF input.
- Wally View GFA (Work in progress 2023)
- VRPG View rGFA or GFA, written in Python and HTML
- Pantograph A commercial pangenome graph viewer
- PGV A web-based viewer similar to SeqTubeMap
- Pancat Scripts to filter and visualize GFA files
- gfaestus GFA visualizer, GPU-accelerated using Vulkan
- gfaviz Graphical interactive tool for the visualization of sequence graphs in GFA format
- AGB Interactive assembly graph browser
- graphgenomeviewer Web-based viewer for small to medium GFA files
- JBrowse 2 Web-based genome browser with synteny views and plugins for multiple alignments that can be extracted from Cactus graphs (https://github.com/cmdcolin/jbrowse-plugin-mafviewer)
- strangepg A modern GFA viewer and alternative to the Bandage tool
Pangenome viewers - static
- vg view - generates static images
- odgi - generates static images :rocket:
- plotsr - generates static images
Graph validation tools
Pangenome comparison
- junctions Pangenome comparison using elastic-degenerate strings.
- rs-pancat-compare Pairwise pangenome graph comparison by the computation of a segmentation edit distance.
Pangenome tools for microbes
- anvi'o Microbial pangenomics - Annotation, Construction, Visualization and Manipulation (also works for eukaryotes, except for annotation)
- Roary A well-documented and feature-rich tool which works on Prokka gff files and has an entertaining FAQ.
File formats
Miscellaneous tools
- gfainject Maps short alignments in BAM format onto a GFA and writes GAF output (it appears to be a conversion tool rather than a true aligner).
- GRAFIMO GRAph-based Finding of Individual Motif Occurrences using vg
- rs-gfa A GFA parser in Rust.
- ropebwt3 Can construct and align sequences against huge TB-scale references and retrieve haplotypes.
k-mer based approaches
Libraries to explore pangenomes
- gfapy Implements GFA1 and GFA2 parsing and scalable exploration of graphs in Python
- gfagraphs Implements rGFA and GFA1 parsing and editing of graphs in Python
- graphanalyzer A Python package to read and analyze PAF and GFA files for graphs.
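For orientation, the core GFA1 records that these libraries parse (S segments, L links, P paths) can be read with a few lines of standard-library Python. This is only a minimal sketch and not the API of gfapy, gfagraphs or graphanalyzer: the function name `parse_gfa1` and the three-segment example graph are hypothetical, optional tags are ignored, and real pangenome graphs are better handled with the libraries above.

```python
def parse_gfa1(text):
    """Collect S (segment), L (link) and P (path) records from GFA1 text.

    Hypothetical helper for illustration; tags and other record types are skipped.
    """
    segments = {}   # segment name -> sequence ("*" means sequence not stored)
    links = []      # (from_segment, from_orient, to_segment, to_orient)
    paths = {}      # path name -> list of (segment, orientation) steps
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        record = fields[0]
        if record == "S":
            segments[fields[1]] = fields[2]
        elif record == "L":
            links.append((fields[1], fields[2], fields[3], fields[4]))
        elif record == "P":
            # Each step is a segment name followed by "+" or "-".
            paths[fields[1]] = [(s[:-1], s[-1]) for s in fields[2].split(",")]
    return segments, links, paths


# Made-up example: two haplotype paths, one of which skips segment s2.
EXAMPLE = "\n".join([
    "H\tVN:Z:1.0",
    "S\ts1\tACGT",
    "S\ts2\tG",
    "S\ts3\tTTA",
    "L\ts1\t+\ts2\t+\t0M",
    "L\ts2\t+\ts3\t+\t0M",
    "L\ts1\t+\ts3\t+\t0M",
    "P\tsampleA\ts1+,s2+,s3+\t*",
    "P\tsampleB\ts1+,s3+\t*",
])

segments, links, paths = parse_gfa1(EXAMPLE)
# Total stored sequence length, ignoring segments without a stored sequence.
total_bp = sum(len(seq) for seq in segments.values() if seq != "*")
print(len(segments), "segments,", len(links), "links,", total_bp, "bp")
for name, steps in paths.items():
    print(name, "->", [seg for seg, _ in steps])
```

Note that rGFA adds tags such as `SN`, `SO` and `SR` to segment lines; this sketch would read those files but discard the tags.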
Other lists of pangenome tools
Contributions
Is something missing? Contributions are welcome; please open a PR against main or open an issue with a link.