hahnlab / CAFE5

Version 5 of the CAFE phylogenetics software

Identifying specific expanded genes #136

Closed romseg closed 1 year ago

romseg commented 1 year ago

Dear author,

I am wondering whether it is possible to identify the specific genes that were expanded, and not only the orthogroup (gene family). I ran cafe5 to identify gene family expansions and contractions with the following command, using inputs from OrthoFinder:

cafe5 -t SpeciesTree_rooted.ultrametric.tre -i cafe.input.tsv -k 3 -p -o gammak3out

cafe.input.tsv

Desc    Orthogroup  Ahypochondriacus    Athaliana   Bvulgarisssp_vulgaris   Cquinoa Pamilis Soleracea   Stuberosum  Utuberii    Zmays
(null)  OG0000017   15  1   3   25  27  6   97  24  12
(null)  OG0000020   5   4   16  37  59  27  32  7   2
(null)  OG0000021   0   0   0   47  11  34  2   93  0
(null)  OG0000023   4   6   15  43  54  13  11  1   27

After parsing the Gamma_change.tab results file for families expanded in Utuberii only, I get:

FamilyID    Utuberii<3>
OG0000017   2
OG0000039   2
OG0000050   3
OG0000053   2
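The filtering step described above can be sketched in Python roughly as follows. This is a minimal sketch, not part of CAFE5: it assumes the change file is tab-separated, has a `FamilyID` column, and labels each species column as the species name followed by a node id in angle brackets (as in `Utuberii<3>` above). The function name `expanded_families`, the `Zmays<1>` column, and the sample rows are hypothetical and only illustrate the layout.

```python
import csv
import io

def expanded_families(handle, species):
    """Return {family_id: change} for families with a positive change
    in the column belonging to `species` (e.g. "Utuberii<3>")."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Assumption: columns are named "<species><node id>"; match on the species part.
    column = next(
        name for name in reader.fieldnames
        if name.split("<")[0] == species
    )
    return {
        row["FamilyID"]: int(row[column])  # int() also accepts values like "+2"
        for row in reader
        if int(row[column]) > 0
    }

# In-memory stand-in for Gamma_change.tab; the values are illustrative only.
sample = (
    "FamilyID\tUtuberii<3>\tZmays<1>\n"
    "OG0000017\t2\t-1\n"
    "OG0000020\t-4\t0\n"
    "OG0000050\t3\t5\n"
)
result = expanded_families(io.StringIO(sample), "Utuberii")
print(result)
```

With a real file, the same function could be called on `open("Gamma_change.tab")` instead of the in-memory sample, assuming that path and layout.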

My understanding is that, for example, orthogroup OG0000017 has 24 genes belonging to Utuberii (cafe.input.tsv), and that only 2 of these genes were expanded in Utuberii, so cafe5 reports OG0000017 as an expanding family. How can I identify those two expanded genes?

I am trying to determine which Gene Ontology terms are enriched in the set of expanded gene families, but it seems to me that it would be more accurate to do this with only the two genes that were expanded, rather than with all 24 from orthogroup OG0000017. If this is not the right approach, I would be glad to know. Thank you.
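For the family-level variant of this analysis, collecting the Utuberii gene IDs of each expanded orthogroup could look roughly like the sketch below. It assumes OrthoFinder's `Orthogroups.tsv` layout (tab-separated, an `Orthogroup` column, one column per species, gene IDs separated by commas). The function name `genes_for_families` and the gene IDs in the sample are hypothetical. It returns every Utuberii member of a family and does not indicate which copies are the additional ones.

```python
import csv
import io

def genes_for_families(handle, species, family_ids):
    """Return {orthogroup: [gene ids]} for `species`, restricted to `family_ids`."""
    reader = csv.DictReader(handle, delimiter="\t")
    wanted = set(family_ids)
    genes = {}
    for row in reader:
        if row["Orthogroup"] in wanted:
            # An empty or missing cell means the species has no gene in this family.
            cell = row.get(species) or ""
            genes[row["Orthogroup"]] = [
                gene.strip() for gene in cell.split(",") if gene.strip()
            ]
    return genes

# In-memory stand-in for Orthogroups.tsv; the gene IDs are made up for illustration.
sample = (
    "Orthogroup\tUtuberii\tZmays\n"
    "OG0000017\tUt_g1, Ut_g2, Ut_g3\tZm_g1\n"
    "OG0000020\tUt_g4\t\n"
)
gene_lists = genes_for_families(io.StringIO(sample), "Utuberii", ["OG0000017"])
print(gene_lists)
```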

Best regards, Rom