Hi, I am working with scaffolds that contain strings of Ns, and I'm using fastANI values to cluster sequences together.
I ran a small test to see whether and how Ns in scaffolds affect the ANI. When comparing two sequences that are 100% identical except for a few strings of Ns, fastANI reports an ANI of 99.84%. If I counted the Ns as mismatches, the nucleotide identity would be 95.6%. So while the ANI from fastANI is very close to 100%, the Ns clearly have some influence. Any insight or help on how this works would be greatly appreciated!
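To make the two ways of counting concrete, here is a minimal Python sketch. It is not how fastANI works internally; it assumes two equal-length sequences that are already aligned position by position, and the function name `percent_identity` and the toy sequences are made up purely for illustration.

```python
def percent_identity(a: str, b: str, count_n_as_mismatch: bool) -> float:
    """Percent identity between two equal-length, pre-aligned sequences.

    If count_n_as_mismatch is True, any column containing an N counts as a
    mismatch. If False, columns containing an N are skipped entirely.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")

    matches = 0
    compared = 0
    for x, y in zip(a.upper(), b.upper()):
        has_n = x == "N" or y == "N"
        if has_n and not count_n_as_mismatch:
            continue  # ignore this column altogether
        compared += 1
        if x == y and not has_n:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy example (hypothetical data): 20 bases, 4 of them replaced by Ns.
reference = "ACGTACGTACGTACGTACGT"
scaffold = "ACGTACGTNNNNACGTACGT"

print(percent_identity(reference, scaffold, count_n_as_mismatch=True))   # 80.0
print(percent_identity(reference, scaffold, count_n_as_mismatch=False))  # 100.0
```

The fastANI value I get sits between these two extremes, which is why I'm wondering how the Ns are actually handled.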
Lauren