Closed U201412486 closed 4 years ago
Thank you. Merry Christmas to you too.
Yes, the groups.txt file contains both the orthologs and (recent) paralogs identified by OrthoMCL. The file looks something like:
group_1: SpeciesA|geneA1 SpeciesB|geneA1 SpeciesB|geneA2
Here, SpeciesB|geneA1 and SpeciesB|geneA2 are paralogs (they are both very similar genes in SpeciesB and so likely arise from a duplication event). Likewise, SpeciesA|geneA1 is likely orthologous to SpeciesB|geneA1 and SpeciesB|geneA2.
You can find an illustration of a single OrthoMCL group in Figure 3 from the OrthoMCL paper (https://genome.cshlp.org/content/13/9/2178/F3.expansion.html).
So, to separate orthologs from recent paralogs you can look for entries in the groups file where the species portion of the name is the same (representing duplicate genes in the same species).
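As a rough illustration of that idea, here is a minimal Python sketch. It assumes every member of a group is written as species|gene, as in the example line above. The function names parse_groups_line and split_pairs are made up for this example and are not part of OrthoMCL. Same-species pairs are only candidate recent paralogs and cross-species pairs are only candidate orthologs; the script does not attempt any finer classification.

```python
from itertools import combinations


def parse_groups_line(line):
    """Split 'group_1: SpeciesA|geneA1 ...' into (group_id, [(species, gene), ...]).

    Hypothetical helper: assumes each member has the form 'species|gene'.
    """
    group_id, members = line.split(":", 1)
    parsed = []
    for member in members.split():
        species, gene = member.split("|", 1)
        parsed.append((species, gene))
    return group_id.strip(), parsed


def split_pairs(members):
    """Separate all gene pairs in one group by whether the species matches.

    Returns (same_species_pairs, cross_species_pairs). Same-species pairs are
    candidate recent paralogs; cross-species pairs are candidate orthologs.
    """
    same_species, cross_species = [], []
    for (sp1, g1), (sp2, g2) in combinations(members, 2):
        pair = (f"{sp1}|{g1}", f"{sp2}|{g2}")
        if sp1 == sp2:
            same_species.append(pair)
        else:
            cross_species.append(pair)
    return same_species, cross_species


# Example using the line shown earlier in this thread.
line = "group_1: SpeciesA|geneA1 SpeciesB|geneA1 SpeciesB|geneA2"
group_id, members = parse_groups_line(line)
paralogs, orthologs = split_pairs(members)
print(group_id, "candidate paralog pairs:", paralogs)
print(group_id, "candidate ortholog pairs:", orthologs)
```

To process a whole file, you could loop over the lines of groups.txt and call the two functions on each non-empty line.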
As for making more specific classifications (inparalogs, etc.), I don't know if this is possible from the OrthoMCL results. I suspect you would need to integrate the OrthoMCL results with phylogenetic information about the species you are examining. Unfortunately, this is at the limit of my knowledge, so I don't think I can give you a better answer.
I hope this helps.
Thank you for your answer. It helps me.
Hi, Merry Christmas! The user guide at https://orthomcl.org/common/downloads/software/v2.0/UserGuide.txt says that the groups.txt file contains the groups created by clustering the pairs with the MCL program. I think this means that groups.txt includes co-orthologs, inparalogs, and orthologs. So how can I separate co-orthologs, inparalogs, and orthologs in the groups.txt file? Best, sun