Open leech1225 opened 2 weeks ago
My config:
Ingroup:
  2157: Archaea
EGP:
  1590: Lactiplantibacillus plantarum
I want to know the HGTs from Archaea, so I set it this way (Lactiplantibacillus plantarum is a species of Bacteria), but after AI I found that the donors are Bacteria too (not Archaea).
Thanks for your wonderful toolkit. I want to find any HGT from non-Bacteria and from lower ranks of Bacteria. How should I set the group.yaml?
Hi @leech1225,
Ingroup is for finding donors outside of this rank, EGP is to exclude this rank in HGT calculations.
Check the following comment for more information.
If you need any further help reply with your species name and I can give more specific examples.
Cheers, Georgios
Thanks for your help, captain @GDKO. What still puzzles me is that "Ingroup" and "EGP" both seem to mean "exclude", so what is the difference between them? For example, what is the difference between the following two configs?
1:
ingroup:
  2: Bacteria
EGP:
  1590: Lactiplantibacillus plantarum
2:
ingroup:
  1590: Lactiplantibacillus plantarum
EGP:
  2: Bacteria
Glad to talk with you through the comment 🙏
Dear @GDKO,
I'm still confused after reading this thread and the one you mentioned, #15. If "EGP" is to exclude this rank in HGTs, then why:
If you set exclude to Saccharomyces and ingroup to Saccharomycetaceae you are searching for HGTs present in S. cerevisiae and maybe in other species from the genus Saccharomyces but absent in the other genera of Saccharomycetaceae.
Or in your example:
In the following example we have proteins from the nematode Meloidogyne incognita and we want to find HGTs from Non Metazoa species to our species. For that we set Ingroup to Metazoa and EGP to the suborder Tylenchida which our species belongs to, to allow for HGTs that may be present also in other Tylenchida species.
It seems to me that HGTs are searched for in the EGP group and the Ingroup is excluded.
From what I tried with several groups.yaml files, AvP seems to detect HGTs that exclude: outside "ingroup" ranks and under "EGP" ranks.
Hi @leech1225 and @lagphase,
Assuming that your proteins belong to Lactiplantibacillus plantarum:
1:
ingroup:
  2: Bacteria
EGP:
  1590: Lactiplantibacillus plantarum
This will search for HGTs with donors outside of Bacteria that are present in Lactiplantibacillus plantarum but not in other Bacteria.
2:
ingroup:
  1590: Lactiplantibacillus plantarum
EGP:
  2: Bacteria
This will search for HGTs in your species with donors outside of Bacteria that can also be present in other Bacteria. Here, you will not be able to distinguish between vertical and horizontal transmission.
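For the other half of the original question (HGTs from lower ranks of Bacteria), the same logic suggests moving Ingroup down to a rank below Bacteria that still contains the species. A hedged sketch, assuming the family Lactobacillaceae is the rank of interest and that 33958 is its NCBI taxon ID. Both the choice of rank and the ID are assumptions that do not come from this thread, so please check them against NCBI Taxonomy before use:

```yaml
# Hypothetical groups.yaml: donors are anything outside Lactobacillaceae
Ingroup:
  33958: Lactobacillaceae               # assumed taxon ID, verify before use
EGP:
  1590: Lactiplantibacillus plantarum   # ID as given earlier in the thread
```

With a setup like this, donors should be anything outside the family, which includes other Bacteria as well as non-Bacteria, so the two kinds of donor would need to be separated afterwards from the donor taxonomy in the output.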
Think about it in this way. We want to distinguish between vertical and horizontal transfers.
Let's take the following as an example. We have sequenced S. cerevisiae, which has the following taxonomy:
phylum:Ascomycota;class:Saccharomycetes;order:Saccharomycetales;family:Saccharomycetaceae;genus:Saccharomyces
We assume that if a protein in our species is more similar to proteins in other Ascomycota than to proteins outside of Ascomycota, this is an indication of vertical transmission. In that case we set Ingroup to Ascomycota. Now, since S. cerevisiae is in the phylum Ascomycota, we can set EGP to different ranks depending on our question.
Let's assume a protein was transferred from a bacterial species to S. cerevisiae.
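For this first scenario, the rule stated in this comment (the species is equal to or inside the EGP rank, and the EGP rank is inside the Ingroup rank) suggests putting EGP at the species itself. A minimal sketch, assuming the NCBI taxon IDs 4890 (Ascomycota) and 4932 (Saccharomyces cerevisiae), which are not given in the thread and should be checked:

```yaml
# Sketch: transfer into S. cerevisiae only
Ingroup:
  4890: Ascomycota                 # assumed taxon ID
EGP:
  4932: Saccharomyces cerevisiae   # assumed taxon ID
```

Here a candidate would be expected to look closer to non-Ascomycota proteins than to any other Ascomycota, including other Saccharomyces species.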
Let's assume a protein was transferred from a bacterial species to the last common ancestor of Saccharomyces.
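For this second scenario, EGP would presumably move up to the genus, so that hits from other Saccharomyces species are left out of the calculation and a transfer shared across the genus is not mistaken for vertical inheritance. A minimal sketch, assuming the NCBI taxon IDs 4890 (Ascomycota) and 4930 (Saccharomyces), which are not given in the thread and should be checked:

```yaml
# Sketch: transfer into the last common ancestor of Saccharomyces
Ingroup:
  4890: Ascomycota      # assumed taxon ID
EGP:
  4930: Saccharomyces   # assumed taxon ID
```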
In all cases our species should be equal to or inside the EGP rank and the EGP rank should be inside the Ingroup rank, otherwise we will not be able to distinguish between vertical and horizontal transfers. Depending on the question, the user needs to specify Ingroup and EGP.
Hey @GDKO, thanks for your devotion! I've figured it out!
As you said, you want to find HGTs from "Non Metazoa" species to your species. Your example:
Ingroup:
  33208: Metazoa
EGP:
  6300: Tylenchida
I checked that "Metazoa" has the rank "kingdom", while "Tylenchida" has a rank lower than "kingdom". But "Ingroup" is described as the target of the HGT, and "EGP" as the taxonomic groups to exclude from calculations. My question is whether the "Ingroup" rank is the one under investigation or not. My output shows that "Ingroup" means the rank that is not under investigation, which contradicts your explanation.