ParBLiSS / FastANI

Fast Whole-Genome Similarity (ANI) Estimation
Apache License 2.0

[Question] Recommended ANI cutoffs for grouping metagenomic assembled genomes from multiple assemblies? #84

Open jolespin opened 3 years ago

jolespin commented 3 years ago

I have 88 metagenomes that I am binning. I want to run FastANI to group highly similar bins together. Are there any cutoffs you recommend for this type of task?

I found the following excerpts in your publication that provide some hints but thought it would be better to just ask in case anyone else can benefit:

"Our results reveal clear genetic discontinuity, with 99.8% of the total 8 billion genome pairs analyzed conforming to >95% intra-species and <83% inter-species ANI values. This discontinuity is manifested with or without the most frequently sequenced species, and is robust to historic additions in the genome databases." - Abstract

"In recent years, the whole-genome average nucleotide identity (ANI) has emerged as a robust method for this task, with organisms belonging to the same species typically showing ≥95% ANI among themselves" - Introduction

"We leveraged the computational efficiency offered by FastANI to evaluate the distribution of ANI values in a set of over 90,000 genomes, and demonstrate that genetic relatedness discontinuity can be consistently identified among these genomes around 95% ANI." - Discussion

Would you suggest grouping MAGs using a ≥95% ANI cutoff?
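
To make the question concrete, here is a minimal sketch of the grouping I have in mind. It assumes the tab-delimited FastANI output (query genome, reference genome, ANI, bidirectional fragment mappings, total query fragments) from an all-vs-all run over the bins. The function name `cluster_by_ani`, the bin file names, and the single-linkage (connected components) strategy are my own assumptions for illustration, not anything FastANI provides, and the 95% default is just the value from the excerpts above.

```python
from collections import defaultdict


def cluster_by_ani(lines, ani_cutoff=95.0):
    """Group genomes into single-linkage clusters from FastANI output lines.

    Two genomes end up in the same cluster if they are connected by a chain
    of pairs whose reported ANI is >= ani_cutoff (an assumed strategy).
    """
    parent = {}  # union-find forest: genome -> parent genome

    def find(x):
        # Register unseen genomes as their own root, then walk to the root
        # with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        query, ref, ani = fields[0], fields[1], float(fields[2])
        # Make sure both genomes are known, even if they stay singletons.
        find(query)
        find(ref)
        if query != ref and ani >= ani_cutoff:
            union(query, ref)

    clusters = defaultdict(set)
    for genome in parent:
        clusters[find(genome)].add(genome)

    # Largest clusters first, ties broken alphabetically for stable output.
    return sorted((sorted(m) for m in clusters.values()),
                  key=lambda members: (-len(members), members))


# Hypothetical FastANI output rows (made-up bin names and values).
example = [
    "binA.fa\tbinB.fa\t98.7\t900\t1000",
    "binB.fa\tbinA.fa\t98.5\t890\t990",
    "binA.fa\tbinC.fa\t82.1\t300\t1000",
    "binC.fa\tbinD.fa\t96.2\t700\t800",
    "binE.fa\tbinE.fa\t100\t500\t500",
]

for members in cluster_by_ani(example):
    print(members)
```

One caveat with this sketch: single linkage can chain bins together (A~B and B~C at ≥95% puts A and C in one group even if A vs C is below the cutoff), so a stricter scheme may be preferable. Bins that appear in no output row at all would also be missed and need to be added as singletons separately.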