Closed by yimingweng 1 year ago
Thank you for attempting to use CSUBST. It appears that the trees were not attached as expected. Could you please share them with us?
Oh, my mistake. Here is the species tree (please note that species "Mleu" belongs to a different family from "Msex", "Bmor", and "Aips", but they share the same phenotype).
And here is the gene tree. The four species with the same phenotype cluster together in this gene, which has a known function related to that phenotype.
The Mleu and Bmor/Msex/Aips lineages were not detected because they are immediate sister groups in the gene tree, even though these genes did not actually share an evolutionary history. Their apparent grouping might be an artifact of branch attraction, which is a consequence of sequence similarity induced by convergence. Molecular convergence cannot be detected between immediate sisters, so the gene tree topology has to be fixed for a proper analysis. Could you consider using a tree where Mleu and Bmor/Msex/Aips are positioned distantly from each other as input for CSUBST, similar to the species tree?
Also, it is appropriate to include an outgroup, because one of your target lineages (Bmor/Msex/Aips) has its stem branch in the sub-root position, for which ancestral sequences are difficult to estimate.
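The sister-group condition described above can be checked before running an analysis. The following is a minimal sketch, not part of CSUBST: the helper names (parse_newick, leaves, are_sisters) and both example topologies are hypothetical illustrations that only borrow taxon names from this thread, and the parser assumes a rooted Newick string without quoted labels.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested tuples of tip names.

    Branch lengths and internal node labels are discarded.
    Assumes no quoted labels (a simplification for this sketch).
    """
    text = "".join(text.split()).rstrip(";")
    pos = 0

    def read_label():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos].split(":")[0]

    def read_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [read_node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(read_node())
            pos += 1  # consume ")"
            read_label()  # drop internal label, support value, branch length
            return tuple(children)
        return read_label()

    return read_node()


def leaves(node):
    """Return the set of tip names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def are_sisters(node, group_a, group_b):
    """True if some internal node has group_a and group_b as two of its child clades."""
    if isinstance(node, str):
        return False
    child_sets = [leaves(child) for child in node]
    if frozenset(group_a) in child_sets and frozenset(group_b) in child_sets:
        return True
    return any(are_sisters(child, group_a, group_b) for child in node)


# Hypothetical topologies for illustration only (not the trees attached to this thread).
gene_tree = "((Mleu:0.1,(Aips:0.1,(Bmor:0.1,Msex:0.1))),(Dple:0.2,Hmel:0.2));"
species_tree = "((Mleu:0.1,(Dple:0.2,Hmel:0.2)),(Aips:0.1,(Bmor:0.1,Msex:0.1)));"

focal_a = {"Mleu"}
focal_b = {"Aips", "Bmor", "Msex"}
print(are_sisters(parse_newick(gene_tree), focal_a, focal_b))     # True
print(are_sisters(parse_newick(species_tree), focal_a, focal_b))  # False
```

If the check returns True for the tree you intend to use as input, the two focal lineages are immediate sisters in that tree, which is the situation described in the reply above.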
Dear Fukushima-san,
Thank you so much for the quick and very helpful explanation. I thought the input tree for CSUBST had to be the gene tree of the focal gene. Can I instead use the species tree file as input to run CSUBST (with an outgroup defined)? I also wonder whether I can still run this analysis if the shared phenotype among Mleu/Bmor/Msex/Aips is actually an ancestral state. Regarding long-branch attraction, I am interested in running csubst site to see if convergent sites can be detected.
Thank you.
YiMing
Gene trees are ideal as input, but in your particular case, that apparently does not work. You can perhaps use a species tree, or improve the gene tree topology by phylogeny reconciliation with the species tree using, e.g., GeneRax. If the focal phenotype is ancestral to the entire tree, running CSUBST may not be very meaningful.
Dear Fukushima-san,
Thank you very much for the suggestions. I just added two outgroup species and used the rooted species tree to reconcile the gene tree using GeneRax. However, the topology around the focal species did not change: Mleu, the species of interest, is still the sister group of Aips, and the entire clade shares the same phenotype even though the species are not closely related phylogenetically, as shown in the species tree. Of course, in that case I can't use CSUBST to test this gene for convergence between Mleu and Aips, but I just had another thought. (I hope this question is relevant to other users so that it could be at least somewhat helpful.)
gene tree here
species tree here
I was wondering whether Mleu and the other species with the same phenotype in the same gene cluster group together because they have retained the ancestral state (i.e., evolutionary conservatism, since they are closer to the root and have shorter branch lengths), while the species in the other clade (e.g., Ccro, Pmal, Tsyl, Dple, Hmel, Lcor, Lphl, Cnem, Cvir, Lcor, Lphl) have accumulated more derived mutations, possibly due to positive selection since their common ancestor. If this is the case, is there any way to test this idea, for example by testing for positive selection on the clade with more derived mutations using ωC?
If you aim to test for positive selection without specifically focusing on convergence, both HyPhy and PAML are suitable tools. If you're looking to test for convergence between two attracted lineages, using the species tree as an input for CSUBST remains a viable last resort.
Dear Fukushima-san,
Thank you for the great tool. While I was learning this analysis with my data, I wasn't sure how to interpret the result correctly.
Input files: tree file: IQ-TREE, midpoint rooted, from an orthologous protein sequence alignment of 15 species. In-frame codon alignment: PAL2NAL output, with DNA aligned according to the protein sequence alignment (the one I used to make the tree file).
Question: I got a gene tree with a clear pattern of parallel/convergent evolution, where one species not belonging to the group (from a different family) has this gene clustered with the distantly related group from a different family (species tree: in the correct family; gene tree: clustered with a different family). The gene sequences are of good quality, but I got omegaCany2spe and OCNany2spe = 1.2 and 1.7, respectively. If this means there is no convergence, why does the gene tree cluster that species with a different family?
Any help will be greatly appreciated! Thank you. YiMing
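For inputs like those described in this post (a rooted Newick tree plus an in-frame codon alignment), a quick consistency check can catch common problems before an analysis is run. This is a minimal sketch under stated assumptions, not a CSUBST feature: the function names and the toy data are hypothetical, stop codons follow the standard genetic code, and tip labels are assumed to be unquoted.

```python
import re

# Assumption: standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(text):
    """Read FASTA text into a dict of {name: uppercase sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def tip_labels(newick):
    """Collect tip labels: names that directly follow "(" or "," in the Newick string."""
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))


def check_inputs(fasta_text, newick):
    """Return a list of human-readable problems; an empty list means none were found."""
    seqs = read_fasta(fasta_text)
    problems = []
    if len({len(s) for s in seqs.values()}) > 1:
        problems.append("sequences differ in length")
    for name, seq in seqs.items():
        if len(seq) % 3 != 0:
            problems.append(f"{name}: length {len(seq)} is not a multiple of 3")
        # Skip the final codon so that a terminal stop codon is not reported.
        codons = [seq[i:i + 3] for i in range(0, len(seq) - 3, 3)]
        if any(codon in STOP_CODONS for codon in codons):
            problems.append(f"{name}: internal stop codon")
    unmatched = set(seqs) ^ tip_labels(newick)
    if unmatched:
        problems.append(
            "names not shared by alignment and tree: " + ", ".join(sorted(unmatched))
        )
    return problems


# Toy data for illustration only.
good_fasta = ">Mleu\nATGGCCAAA\n>Bmor\nATGGCTAAA\n"
good_tree = "(Mleu:0.1,Bmor:0.2);"
print(check_inputs(good_fasta, good_tree))  # []
```

An empty list only means that none of these specific problems were found; it says nothing about alignment quality or tree accuracy.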