Average Nucleotide Identity (ANI) analysis

Asad Prodhan PhD

https://asadprodhan.github.io/


Average Nucleotide Identity (ANI) analysis calculates the percentage of nucleotide identity between each pair of the supplied nucleotide sequences. The result is a square matrix in which each cell holds the value for one pair of sequences, so any two sequences can be compared directly to judge how similar they are.



ANI methods

There are several methods to calculate the ANI:

  • ANIb (based on the BLAST algorithm)

  • ANIm (based on the MUMmer algorithm)

  • TETRA (based on tetranucleotide signature frequencies)


ANI tools

Several tools are available for ANI analysis (Figueras et al., 2014). This guide uses pyani.


How to run pyani

conda install -c bioconda pyani


Figure 1. Classes


Figure 2. Labels

Note that the first column contains the names of the nucleotide sequences and the second column contains their labels.
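
A two-column file of this shape can be drafted from the genome file names rather than typed by hand. The following is a minimal sketch and not part of pyani: the function name `make_two_column_file` is hypothetical, and it assumes that the FASTA files end in `.fasta`, that the columns are separated by a tab, and that the sequence name in the first column is the file name without its extension. The second column is filled with the same name as a placeholder to be edited afterwards.

```shell
# Draft a two-column, tab-separated file: sequence name, then label (or class).
# Assumption: the sequence name is the FASTA file name without ".fasta".
make_two_column_file() {
    genome_dir=$1   # directory holding the FASTA files, e.g. genomes
    out_file=$2     # file to write, e.g. genomes/labels.txt
    : > "$out_file"                       # start from an empty file
    for fasta in "$genome_dir"/*.fasta; do
        [ -e "$fasta" ] || continue       # skip if the glob matched nothing
        stem=$(basename "$fasta" .fasta)
        # Placeholder: the label equals the name until it is edited by hand.
        printf '%s\t%s\n' "$stem" "$stem" >> "$out_file"
    done
}
```

For example, `make_two_column_file genomes genomes/labels.txt` would write one line per genome; the same call with `genomes/classes.txt` gives a starting point for the classes file.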


file *.txt
dos2unix *.txt
average_nucleotide_identity.py -i genomes -o output_ANI --labels genomes/labels.txt --classes genomes/classes.txt -g --gmethod seaborn --gformat pdf,png -v -l ba_ANI.log
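
The `file` check and the line-ending conversion above matter because text files saved on Windows end each line with a carriage return, which then becomes part of the last column of every line in the labels and classes files. As a sketch of the same check without relying on the output of `file`, the hypothetical helper below (the name `has_crlf` is an assumption, not a pyani or system command) reports whether a file contains any carriage return.

```shell
# Succeed (exit status 0) if the file contains at least one carriage return,
# which indicates Windows (CRLF) line endings.
has_crlf() {
    cr=$(printf '\r')          # a literal carriage return character
    grep -q "$cr" "$1"
}
```

A call such as `has_crlf genomes/labels.txt && echo "convert line endings first"` could be run before starting the analysis.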


Results

The final output of the ANI analysis looks like this (Fig. 3):


Figure 3. Results
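
Besides the heatmap, the identity matrix itself can be filtered for closely related pairs. The sketch below is an illustration under stated assumptions: the function name `pairs_above` is hypothetical, the matrix is assumed to be a tab-separated file such as `output_ANI/ANIm_percentage_identity.tab` with sequence names in the header row and in the first column, rows and columns are assumed to be in the same order, and identities are assumed to be written as fractions between 0 and 1. The threshold is a free parameter; values around 0.95 to 0.96 are often used as a species boundary (Richter and Rosselló-Móra, 2009).

```shell
# List sequence pairs whose identity is at or above a threshold.
# Assumptions: tab-separated square matrix, names in the header row and the
# first column (same order), identities given as fractions between 0 and 1.
pairs_above() {
    matrix=$1      # e.g. output_ANI/ANIm_percentage_identity.tab
    threshold=$2   # e.g. 0.95
    awk -F'\t' -v t="$threshold" '
        # Header row: remember the name that belongs to each column.
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; next }
        {
            # For data row NR the diagonal cell is column NR, so starting
            # at NR + 1 reads only the upper triangle (each pair once).
            for (i = NR + 1; i <= NF; i++)
                if ($i + 0 >= t + 0)
                    printf "%s\t%s\t%s\n", $1, name[i], $i
        }
    ' "$matrix"
}
```

For example, `pairs_above output_ANI/ANIm_percentage_identity.tab 0.95` would print one line per qualifying pair. Only the upper triangle is read, so if the two directions of a comparison differ, the lower-triangle value is not reported.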


References

Figueras, M.J., Beaz-Hidalgo, R., Hossain, M.J., Liles, M.R., 2014. Taxonomic Affiliation of New Genomes Should Be Verified Using Average Nucleotide Identity and Multilocus Phylogenetic Analysis. Genome Announc 2, e00927-14. https://doi.org/10.1128/genomeA.00927-14

Richter, M., Rosselló-Móra, R., 2009. Shifting the genomic gold standard for the prokaryotic species definition. PNAS 106, 19126–19131. https://doi.org/10.1073/pnas.0906412106