simroux / ClusterGenomes

Archive for ClusterGenomes scripts

ClusterGenomes

This repo archives some "ClusterGenomes" scripts that can be used to cluster viral/phage genomes based on ANI (Average Nucleotide Identity) and AF (Alignment Fraction).
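To make the two metrics concrete, here is a minimal Python sketch of how ANI and AF can be derived from pairwise alignment blocks, followed by a simple greedy clustering. It is not the code archived in this repo: the function names (`ani_and_af`, `greedy_cluster`), the input layout, and the default cutoffs (95% ANI, 85% AF) are assumptions chosen for illustration, and the alignment blocks are assumed not to overlap on the query.

```python
from typing import Dict, List, Tuple

# One alignment block on the query genome: (start, end, percent_identity),
# with 1-based inclusive coordinates. Blocks are assumed not to overlap.
Block = Tuple[int, int, float]


def ani_and_af(blocks: List[Block], query_length: int) -> Tuple[float, float]:
    """Return (ANI in percent, AF as a fraction) for one query vs. one reference."""
    if not blocks or query_length <= 0:
        return 0.0, 0.0
    aligned = sum(end - start + 1 for start, end, _ in blocks)
    # ANI: identity averaged over blocks, weighted by block length.
    weighted = sum((end - start + 1) * ident for start, end, ident in blocks)
    ani = weighted / aligned
    # AF: share of the query genome covered by alignments.
    af = min(aligned / query_length, 1.0)
    return ani, af


def greedy_cluster(
    lengths: Dict[str, int],
    pairs: Dict[Tuple[str, str], Tuple[float, float]],
    min_ani: float = 95.0,  # illustrative cutoff, not taken from this repo
    min_af: float = 0.85,   # illustrative cutoff, not taken from this repo
) -> Dict[str, List[str]]:
    """Greedy clustering: longest genomes become seeds; others join the first
    seed they match. `pairs[(query, seed)]` holds (ANI, AF) of query vs. seed."""
    clusters: Dict[str, List[str]] = {}
    for genome in sorted(lengths, key=lambda g: (-lengths[g], g)):
        for seed in clusters:
            ani, af = pairs.get((genome, seed), (0.0, 0.0))
            if ani >= min_ani and af >= min_af:
                clusters[seed].append(genome)
                break
        else:
            # No existing seed matched, so this genome starts a new cluster.
            clusters[genome] = [genome]
    return clusters


if __name__ == "__main__":
    print(ani_and_af([(1, 100, 90.0), (201, 300, 100.0)], 400))
    lengths = {"A": 1000, "B": 900, "C": 500}
    pairs = {("B", "A"): (97.0, 0.9), ("C", "A"): (80.0, 0.2)}
    print(greedy_cluster(lengths, pairs))
```

In practice the alignment blocks would come from a MUMmer run, and overlapping blocks would need to be merged before computing AF; both steps are left out of this sketch.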

Requirements

MUMmer 4

Notes

For large-scale clustering (thousands of sequences or more), we now recommend anicalc and aniclust, distributed as part of the CheckV package: https://bitbucket.org/berkeleylab/checkv/src/master/. Instructions for this clustering are available at the bottom of the CheckV readme ("Supporting code: Rapid genome clustering based on pairwise ANI").