GDKO / AvP

Automatic evaluation of HGTs
GNU General Public License v3.0

Evaluating intrakingdom HGT and adding lower taxonomic ranks to classification.txt #14

aldendirks closed 9 months ago

aldendirks commented 10 months ago

Hello,

Thanks again for your great software. I am wondering whether it is recommended for evaluating intrakingdom HGT events. For example, can I determine whether there has been HGT between distantly related fungi? Is it sufficient to specify my fungal order or class as the ingroup, rather than all of Fungi? I did try this and it seems to work OK; I'm just wondering if you have any tips on this approach. Would it help to modify classification.txt?

Thanks, Alden

aldendirks commented 10 months ago

Related to classification.txt, is it possible to specify lower taxonomic ranks so that tips are annotated with, say, fungal class?

GDKO commented 10 months ago

Hi @aldendirks,

For your first question, you are correct. By specifying your fungal class, order, or phylum as the ingroup, you allow HGTs to be identified from fungi outside the specified rank. During classification, these will be identified as HGTs of fungal origin.

If you want to allow further classification, there are two things you need to modify. First, add your new taxonomy ranks to the taxonomy.yaml file (or create a new one and use the --cfg option in avp prepare). Second, update classification.txt accordingly.

As an example, suppose we want to explicitly name two fungal phyla, Basidiomycota and Microsporidia, while using Ascomycota as the ingroup and Ascobolaceae as the excluded group (the EGP entry). We then need to update the following files.

groups.yaml

---
Ingroup:
 4890: Ascomycota
EGP:
 5189: Ascobolaceae

taxonomy.yaml

---
Kingdom:
 2: Bacteria
 2157: Archaea
 2759: Eukaryota
 10239: Viruses
 12884: Viroids
 28384: Other
 12908: Unclassified
Other:
 33208: Metazoa
 4751: Fungi
 33090: Viridiplantae
 4762: Oomycota
 5204: Basidiomycota
 6029: Microsporidia

classification.txt

#rank members
Eukaryota   Eukaryota;Viridiplantae;Fungi;Oomycota;Metazoa
Fungi   Basidiomycota;Microsporidia
Prokaryota  Bacteria;Archaea
Viriods Viroids
Viruses Viruses

Hope that helps!

aldendirks commented 9 months ago

Thanks for the help! To be clear, can I add any number of taxa at any rank under "Other" in the taxonomy.yaml file? Does every taxon in taxonomy.yaml need to be in classification.txt? Can I include any fungal taxa at any rank in the classification.txt file?

For lower ranks, does classification.txt need to look like this:

#rank members
Eukaryota   Eukaryota;Viridiplantae;Fungi;Oomycota;Metazoa
Fungi   Basidiomycota;Microsporidia;Agaricomycotina;Tremellomycetes
Prokaryota  Bacteria;Archaea
Viriods Viroids
Viruses Viruses

Or something like this:

#rank members
Eukaryota   Eukaryota;Viridiplantae;Fungi;Oomycota;Metazoa
Fungi   Basidiomycota;Microsporidia
Basidiomycota   Agaricomycotina
Agaricomycotina Tremellomycetes
Prokaryota  Bacteria;Archaea
Viriods Viroids
Viruses Viruses

GDKO commented 9 months ago

Hi @aldendirks,

> Thanks for the help! To be clear, can I add any number of taxa at any rank under "Other" in the taxonomy.yaml file?

Correct

> Does every taxon in taxonomy.yaml need to be in classification.txt?

If you plan to run avp classify, yes.
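
For anyone editing both files by hand, a quick cross-check can help catch taxa that were added to taxonomy.yaml but not to classification.txt. The following is a minimal sketch and not part of AvP: the function names `parse_taxonomy` and `parse_classification` are hypothetical, the parsing assumes the simple `taxid: name` and `rank members` layouts shown earlier in this thread, and it assumes rank names contain no spaces.

```python
# Example file contents copied from the thread above.
TAXONOMY = """---
Kingdom:
 2: Bacteria
 2157: Archaea
 2759: Eukaryota
 10239: Viruses
 12884: Viroids
 28384: Other
 12908: Unclassified
Other:
 33208: Metazoa
 4751: Fungi
 33090: Viridiplantae
 4762: Oomycota
 5204: Basidiomycota
 6029: Microsporidia
"""

CLASSIFICATION = """#rank members
Eukaryota   Eukaryota;Viridiplantae;Fungi;Oomycota;Metazoa
Fungi   Basidiomycota;Microsporidia
Prokaryota  Bacteria;Archaea
Viriods Viroids
Viruses Viruses
"""


def parse_taxonomy(text):
    """Collect taxon names from 'taxid: name' lines (hypothetical helper)."""
    names = set()
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Section headers ('Kingdom:') have a non-numeric key; '---' has no colon.
        if sep and key.strip().isdigit() and value.strip():
            names.add(value.strip())
    return names


def parse_classification(text):
    """Collect every member listed after a rank name (hypothetical helper)."""
    members = set()
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        # Skip blank lines, the '#rank members' header, and malformed lines.
        if len(parts) != 2 or parts[0].startswith("#"):
            continue
        members.update(m.strip() for m in parts[1].split(";") if m.strip())
    return members


missing = parse_taxonomy(TAXONOMY) - parse_classification(CLASSIFICATION)
print(sorted(missing))  # ['Other', 'Unclassified']
```

On the example files from this thread, the check reports Other and Unclassified, which are also absent from the example classification.txt above. Treat the output as a list of entries to review, not as a definitive error report.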

> Can I include any fungal taxa at any rank in the classification.txt file?

You can group the ranks as you see fit for your analyses. For your first example, if nearby proteins belong to a combination of Agaricomycotina and Tremellomycetes, the classification will be Fungi_complex.
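
To compare the two candidate layouts from the question, it can help to list which group each taxon ends up under. The sketch below only inverts the `rank members` table into a member-to-group lookup. It is not AvP's classification logic, `groups_by_member` is a hypothetical helper, and the thread does not say how avp classify treats the nested layout.

```python
# The Fungi-related lines of the two layouts proposed in the question above.
FLAT = """#rank members
Fungi   Basidiomycota;Microsporidia;Agaricomycotina;Tremellomycetes
"""

NESTED = """#rank members
Fungi   Basidiomycota;Microsporidia
Basidiomycota   Agaricomycotina
Agaricomycotina Tremellomycetes
"""


def groups_by_member(text):
    """Map each member to the rank name(s) it is listed under (hypothetical helper)."""
    lookup = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        # Skip blank lines, the '#rank members' header, and malformed lines.
        if len(parts) != 2 or parts[0].startswith("#"):
            continue
        rank, member_field = parts
        for member in member_field.split(";"):
            lookup.setdefault(member.strip(), []).append(rank)
    return lookup


flat = groups_by_member(FLAT)
nested = groups_by_member(NESTED)

for taxon in ("Agaricomycotina", "Tremellomycetes"):
    print(taxon, "flat:", flat[taxon], "nested:", nested[taxon])
```

In the flat layout, both taxa sit under the same Fungi group, which matches the Fungi_complex outcome described above. In the nested layout, they sit under different groups, so it is worth running avp classify with each grouping and comparing the results.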

Since avp classify is a fast algorithm, try different groupings and see what works for you!