veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Nodal equivalency across gene phylogenies against the backdrop of a species phylogeny? #1533

Closed. VectorFrankenstein closed this issue 1 year ago

VectorFrankenstein commented 1 year ago

Hello devs,

Not an issue but something I am curious about. And most likely a complete shot in the dark.

I am doing research on the evolution of similar traits in disparate lineages, specifically the evolution of heterosporous reproduction out of homosporous ancestral lineages. See the attached image.

(In the above image) Any lineage with one of the far-sitting purple stars is a lineage in which heterosporous reproduction evolved out of ancestral lineages with a different form of reproduction.

I have a list of species (current count 111) that fit the phylogenetic outline. Homology inference produced a few thousand families with all species present. So, would it be okay to use aBSREL on each gene family to draw inferences from nodes in the gene trees that are analogous to nodes in the species tree?

So, for all three of my nodes of interest (from the attached image), would it be philosophically sound to run aBSREL across the few thousand homologous families and tabulate data from across families at nodes of interest? This is working under the hypothesis that lineages leading to the evolution of similar traits at disparate positions on the phylogeny might share similar profiles of selection as compared to lineages leading elsewhere.

I apologize if the question is completely nonsensical. Not sure if the kind of operation I am thinking of is technically sound.

spond commented 1 year ago

Dear @RijanDhakal1010,

That's a great question that comes up quite a lot. So much so that we actually developed a specialized analysis for it: BUSTED-PH. If you want to determine whether a different evolutionary selective pressure is associated with your focal lineages (in this case, all sharing heterosporous reproduction), then you might want to give this a spin.

Let me know if that makes sense.

Best, Sergei

VectorFrankenstein commented 1 year ago

Dear @spond

Thank you for responding so fast!

I am only just starting to learn the theory behind phylogeny models and admittedly feel a bit nervous thinking of what operations are permitted across outputs. So any help is greatly appreciated!

I will get started on BUSTED-PH ASAP.

For now, would you advise against tabulating aBSREL data across families?

Sincerely, Rijan

spond commented 1 year ago

Dear @RijanDhakal1010,

You can definitely tabulate aBSREL estimates, but you should view it as an _exploratory_ analysis. In other words, if you see some patterns there, they are suggestive, but not what you would call statistically rigorous. I bet the results will be quite noisy, and patterns might be hard to spot.
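A minimal sketch of such an exploratory tabulation, in Python. It assumes each family's aBSREL JSON output has already been loaded into a dict (for example with `json.load`) and that per-branch results sit under `"branch attributes"` → `"0"` → branch name, with a `"Corrected P-value"` entry; check those key names against your own output files. The function name `tabulate_absrel`, the `alpha` threshold, and the node names are hypothetical.

```python
from collections import defaultdict


def tabulate_absrel(results_by_family, nodes_of_interest, alpha=0.05):
    """Count, per node, how many families tested it and how many flagged it.

    results_by_family: {family_id: parsed aBSREL JSON as a dict}
    Assumed layout (verify against your files):
        result["branch attributes"]["0"][branch_name]["Corrected P-value"]
    """
    summary = defaultdict(lambda: {"tested": 0, "significant": 0})
    for family_id, result in results_by_family.items():
        branches = result.get("branch attributes", {}).get("0", {})
        for node in nodes_of_interest:
            p_value = branches.get(node, {}).get("Corrected P-value")
            if p_value is None:
                # Node name not present in this gene tree, or branch not tested.
                continue
            summary[node]["tested"] += 1
            if p_value <= alpha:
                summary[node]["significant"] += 1
    return dict(summary)
```

The counts are descriptive only: they are pooled across separately fitted models, so they show where to look rather than providing a test.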

Best, Sergei

VectorFrankenstein commented 1 year ago

Dear @spond

Yes, I think I am looking at some potentially chaotic string-based filtration, given that node names across gene phylogenies are going to be assigned arbitrarily.

I will be mindful of the low statistical rigor associated with this approach with aBSREL. An exploratory analysis feels like a decent starting point.

Will let you know how my experience running https://github.com/veg/hyphy-analyses/tree/master/BUSTED-PH goes.

Thank you!

Sincerely, Rijan

spond commented 1 year ago

Dear @RijanDhakal1010,

If you have the same tree across many genes, you can label internal nodes for consistency, like so

((A,B)AB, (C,D)CD) 
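One way to produce such labels automatically is to name each internal node after the leaves beneath it. The Python sketch below does that for a Newick topology without branch lengths. The helper names `parse_newick` and `label_internal` are hypothetical, and the parser is deliberately minimal (no quoted names, comments, or branch lengths).

```python
def parse_newick(text):
    """Parse a Newick topology (no branch lengths) into (name, children) tuples."""
    text = "".join(text.split()).rstrip(";")  # drop whitespace and trailing ';'
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1                          # consume "("
            children.append(node())
            while text[pos] == ",":
                pos += 1                      # consume ","
                children.append(node())
            pos += 1                          # consume ")"
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1                          # read the (possibly empty) node name
        return (text[start:pos], children)

    return node()


def label_internal(tree, is_root=True):
    """Return (newick_string, leaf_names); unnamed non-root internal nodes
    are named by concatenating their sorted leaf names."""
    name, children = tree
    if not children:
        return name, [name]
    parts, leaves = [], []
    for child in children:
        child_text, child_leaves = label_internal(child, is_root=False)
        parts.append(child_text)
        leaves.extend(child_leaves)
    if not name and not is_root:
        name = "".join(sorted(leaves))        # same clade -> same label
    return "(" + ",".join(parts) + ")" + name, leaves


labeled, _ = label_internal(parse_newick("((A,B),(C,D));"))
print(labeled)  # ((A,B)AB,(C,D)CD)
```

Because the leaf names are sorted before joining, the label does not depend on the order in which children are written. With many species, concatenated labels get long, so a separator or a short lookup key may be more practical.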

HyPhy does have a somewhat consistent node naming scheme: the tree is traversed in-order, and each unnamed internal node is auto-assigned a name like NodeN, where N is its position in the traversal order (0-based).

Topology T = ((1,2),(3,4),5);
fprintf (stdout, Format (T,1,0));

yields

((1,2)Node1,(3,4)Node4,5)

In-order traversal order here is

  1. 1
  2. parent of (1,2) -- index 1, so it gets named Node1
  3. 2
  4. 3
  5. parent of (3,4) -- index 4, so it gets named Node4
  6. 4
  7. 5
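The numbering above can be mimicked with a short Python sketch. This is not HyPhy's implementation: it is written only to reproduce the example in this thread, and it assumes that every leaf and every non-root internal node consumes one index, that an internal node is visited right after its first child, and that the root takes no index. Whether it matches HyPhy on other trees is not verified here. The function name `name_internal_nodes` and the nested-tuple input format are hypothetical.

```python
def name_internal_nodes(tree):
    """tree: nested tuples of leaf-name strings, e.g. (("1", "2"), ("3", "4"), "5")."""
    counter = 0

    def visit(node, is_root=False):
        nonlocal counter
        if isinstance(node, str):       # leaf: consumes one index, keeps its name
            counter += 1
            return node
        first = visit(node[0])          # first child is visited before the node
        label = ""
        if not is_root:                 # assumption: the root takes no index
            label = f"Node{counter}"
            counter += 1
        rest = [visit(child) for child in node[1:]]
        return "(" + ",".join([first] + rest) + ")" + label

    return visit(tree, is_root=True)


print(name_internal_nodes((("1", "2"), ("3", "4"), "5")))
# ((1,2)Node1,(3,4)Node4,5)
```

Note that names produced this way depend on the position of a node in the tree, so the same clade can receive different NodeN names in gene trees with different topologies or leaf orders.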

Best, Sergei

VectorFrankenstein commented 1 year ago

Dear @spond

Thank you for letting me know. This helps.

Sincerely, Rijan