DyogenIBENS / Agora

Algorithm For Gene Order Reconstruction in Ancestors
Other
69 stars 15 forks source link

Orthology groups correspond to HOGs? #28

Open alexvasilikop opened 11 months ago

alexvasilikop commented 11 months ago

Hello,

Thanks for developing such useful tools. I want to use AGORA with providing the orthologous groups files and not the trees; I have a question regarding these orthology groups. I used Orthofinder for inferring these groups which outputs hierarchical orthogroups (HOGs). For a deep node in the tree the HOGS might only contain a few or even a couple of species. Basically a HOG for a shallow node (e.g. consisting of 2 species) is also a HOG for a deep node (as this shallow node is a child of the deeper node). Should all these groups be provided to all the levels to which they are relevant? Or how do you define the orthologous groups for each node in the species tree, given that some HOGs do not contain all children species of that node?

Thanks Alex

alouis72 commented 10 months ago

Hi Alex, sorry for the late answer. I'm not sure I understand your question. All HOGs should be provide according to the species tree. If you used OrthoFinder to define HOGs, you could be able to run agora with the species tree provided in the Orthofinder results:

Results_XXXX/Phylogenetic_Hierarchical_Orthogroups/ You will have to use the species tree produced by orthofinder for AGORA that should be : Results_XXXX/Species_Tree/SpeciesTree_rooted_node_labels.txt

After that, to simplify treatment you can use a script available here: https://github.com/DyogenIBENS/Agora/tree/dev/src/import/orthofinder_hogs/convert_hogs_sp.py that will format HOGs for AGORA

ex: python ./convert_hogs_sp.py -of_hogs Results_XXXX/Phylogenetic_Hierarchical_Orthogroups -outdir Agora_data/orthoGroups/

Hope this helps, Alexandra

sunriseTM commented 4 months ago

I have the same confusion. If a orthogroup only contain 2 genes from a single species, I noticed Agora will construct a ancesral gene corresponding to the two genes, but the ancestor contain many descedents and only this species contain this 2 genes. As I expected, the 2 genes in the orthogroup is species-specific and tend to be new genes, why Agroa construct a ancestral gene based on them?