Open lingyi-owl opened 3 days ago
Hi,
The tANI measure is equivalent to the intergenomic similarity used by VIRIDIC. Unlike ANI, which is calculated with respect to the alignment length, tANI takes into account the full lengths of the genomes being compared. In other words, tANI reflects the nucleotide identity between two genome sequences as if both genomes had 100% coverage, so unaligned regions lower the value. It is therefore only appropriate to use tANI when working with complete genomes.
The formula for tANI is as follows:
tANI = (idAB + idBA) / (lenA + lenB) × 100
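where:
idAB is the number of identical nucleotides in the alignments of genome A against genome B,
idBA is the number of identical nucleotides in the alignments of genome B against genome A,
lenA and lenB are the full lengths of genomes A and B.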
When clustering with --tani 0.95, Vclust will connect genome pairs that have a tANI value ≥ 95%.
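To make the formula concrete, here is a minimal Python sketch of the arithmetic. It is not Vclust's code: the function names `tani` and `passes_tani` and all the numbers are made up for illustration, and it assumes idAB and idBA are counts of identical nucleotides and lenA and lenB are full genome lengths.

```python
def tani(id_ab: int, id_ba: int, len_a: int, len_b: int) -> float:
    """Return tANI in percent: (idAB + idBA) / (lenA + lenB) x 100.

    Hypothetical helper for illustration only, not part of Vclust.
    """
    if len_a <= 0 or len_b <= 0:
        raise ValueError("genome lengths must be positive")
    return 100 * (id_ab + id_ba) / (len_a + len_b)


def passes_tani(tani_percent: float, threshold: float = 0.95) -> bool:
    """Check a tANI percentage against a fractional threshold such as 0.95."""
    return tani_percent / 100 >= threshold


# Two 40,000 nt genomes aligned over almost their whole length.
near_full = tani(38_500, 38_300, 40_000, 40_000)

# Same genome lengths, but only 30,000 nt aligned in each direction,
# with every aligned position identical (ANI over the alignment = 100%).
partial = tani(30_000, 30_000, 40_000, 40_000)

print(near_full, passes_tani(near_full))
print(partial, passes_tani(partial))
```

In the second pair the aligned regions are identical, yet tANI is well below 95% because the unaligned 10,000 nt in each genome still count in the denominator. That is how genome coverage is reflected in the tANI value itself.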
Best, Andrzej
Hi, I use the vclust script to classify viruses into species and genera following ICTV standards.
The command for assigning viruses to putative species (tANI ≥ 95%) is:
vclust cluster -i ani.tsv -o species.tsv --ids ani.ids.tsv --algorithm complete --metric tani --tani 0.95
What is the minimum query coverage used by this command?
Thanks in advance, Lingyi