ParBLiSS / FastANI

Fast Whole-Genome Similarity (ANI) Estimation
Apache License 2.0

How does fastANI handle Ns? #122

Open krausfeldtle opened 1 year ago

krausfeldtle commented 1 year ago

Hi, I am working with scaffolds that contain strings of Ns, and I'm using fastANI values to cluster sequences together. I ran a small test to see whether and how Ns in scaffolds affect the ANI. When comparing two sequences that are 100% identical except for a few strings of Ns, fastANI reports an ANI of 99.84%. If I counted the Ns as mismatches, the nucleotide identity would be 95.6%. So while the ANI from fastANI is very close to 100%, the Ns clearly have some influence. Any insight or help on how this works would be greatly appreciated!

Lauren
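
For reference, the "count Ns as mismatches" figure in the post above can be reproduced with a naive position-by-position comparison. This is only a sketch under stated assumptions: the function name `percent_identity` and the toy sequences are hypothetical, the two sequences are assumed to be already aligned and of equal length, and this is not how fastANI computes ANI. The toy data was built with 44 Ns in 1,000 bp so that the naive identity comes out at 95.6%; it is not the data from the test described above.

```python
def percent_identity(seq_a, seq_b, count_n_as_mismatch=True):
    """Naive per-position identity (%) between two pre-aligned sequences.

    Hypothetical helper for illustration only; not part of fastANI.
    If count_n_as_mismatch is False, positions where either sequence
    has an N are left out of both numerator and denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        has_n = a == "N" or b == "N"
        if has_n and not count_n_as_mismatch:
            continue  # skip ambiguous positions entirely
        compared += 1
        if a == b and not has_n:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy data (assumption): a 1,000 bp reference and a copy with a 44 bp run of Ns.
reference = "ACGT" * 250
scaffold = reference[:400] + "N" * 44 + reference[444:]

as_mismatch = percent_identity(reference, scaffold, count_n_as_mismatch=True)
ignoring_n = percent_identity(reference, scaffold, count_n_as_mismatch=False)
print(f"Ns as mismatches: {as_mismatch:.2f}%")
print(f"Ns ignored:       {ignoring_n:.2f}%")
```

The two numbers bracket the naive possibilities: treating every N as a mismatch versus dropping N positions altogether. A reported ANI between them, such as the 99.84% above, suggests the Ns have a partial effect, but the sketch does not explain the mechanism.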