Open saras224 opened 3 months ago
hi @wwood can you help me with this?
thanks
Hello, Please see my notes below:
Which ANI value should I consider for the MAGs against the references: fastani_ani or closest_placement_ani?
fastani_ani should return the closest representative based on ANI alone (the comparison is run against all representatives); closest_placement_ani is run against one genome only (the closest genome in the pplacer tree).
Why does GTDB-Tk not assign ANI values to all the MAGs and give NA instead? Is there a cut-off below which GTDB-Tk does not report an ANI value for a MAG and gives NA?
Some genomes are too novel to have informative ANI values against existing GTDB representatives. This is the case for a novel order, class, or even family. GTDB-Tk does not return ANI if the values are too low (<80%) or if the user genomes are placed above genus rank in the reference tree.
This one is a general question: if the ANI match with the reference is ~70%, which classification level does that correspond to (family or order)?
We do not recommend using ANI for anything other than species clustering. You can use AAI (or POCP) for genus delineation.
If a MAG is novel and has to be called a new phylum, what should its ANI match be?
If a MAG is flagged as a new phylum, ANI should not be taken into account. It requires further investigation using different methods (a de novo tree, for example).
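To make the notes above concrete, here is a minimal Python sketch of how one might read the two ANI columns from a GTDB-Tk summary table and treat missing values explicitly. It is only an illustration: the sample rows are made up, the `user_genome` column name and the `N/A` / `NA` missing-value markers are assumptions about the file layout, and preferring `fastani_ani` with a fallback to `closest_placement_ani` is one possible reading of the explanation above, not an official GTDB-Tk recommendation.

```python
import csv
import io

# Hypothetical excerpt of a GTDB-Tk summary table (tab-separated).
# Genome names and values are invented for illustration only.
SAMPLE = (
    "user_genome\tfastani_ani\tclosest_placement_ani\n"
    "mag_001\t97.5\t97.5\n"
    "mag_002\tN/A\t82.1\n"
    "mag_003\tN/A\tN/A\n"
)

# Assumed markers for "no ANI reported"; adjust to match your file.
MISSING = {"", "NA", "N/A"}


def parse_ani(value):
    """Return the ANI as a float, or None when no value was reported."""
    value = (value or "").strip()
    return None if value in MISSING else float(value)


def summarize_ani(handle):
    """Map each genome to (source_column, ani), or (None, None) if both are missing."""
    results = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        fastani = parse_ani(row.get("fastani_ani"))
        placement = parse_ani(row.get("closest_placement_ani"))
        if fastani is not None:
            # Compared against all representatives, so used first here.
            results[row["user_genome"]] = ("fastani_ani", fastani)
        elif placement is not None:
            # Compared only against the closest genome in the tree.
            results[row["user_genome"]] = ("closest_placement_ani", placement)
        else:
            # No ANI reported: do not infer a rank from the missing value.
            results[row["user_genome"]] = (None, None)
    return results


ani_by_genome = summarize_ani(io.StringIO(SAMPLE))
for genome, (source, ani) in ani_by_genome.items():
    print(genome, source, ani)
```

With a real run, replace `io.StringIO(SAMPLE)` with an open handle to your own summary file. Genomes that come back as `(None, None)` are the ones to investigate with other methods (AAI, POCP, or a de novo tree) rather than with ANI.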
Hope that helps.
Regards,
Pierre
Hi @pchaumeil, I am confused about the ANI values that GTDB-Tk reports when classifying MAGs.
I hope my questions are clear and that you can clarify them.
Thanks in advance,
Saraswati Awasthi