simonhmartin / twisst

Topology weighting by iterative sampling of sub-trees
GNU General Public License v3.0

Recommendations for reducing tip number #45

Open imaa9 opened 4 months ago

imaa9 commented 4 months ago

Hi Simon,

I'm interested in using TWISST to test a few hypotheses about support for different lineage relationships. My samples can be assigned to 12 lineages (including an outgroup) but not all lineages are relevant for all hypotheses. I have more than 4 samples per lineage, except for just one sample in the outgroup.

To reduce the number of tips, is it better to [a] remove taxa that are not relevant to the hypothesis being considered from the dataset, or [b] lump lineages that descend from a common ancestor but are not directly relevant to the hypothesis under consideration into one tip? I've summarized this in the image below; hopefully it helps clarify my question. I would very much appreciate your thoughts on this. Thanks, Inbar

[Image: twisst_tip_specification]
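For context on why reducing the tip number matters: the number of distinct unrooted, fully bifurcating topologies for n labelled tips is (2n - 5)!!, which grows very quickly. The sketch below just evaluates that formula. The function name is made up for illustration, and it is an assumption here that the set of topologies twisst has to consider scales the same way with the number of groups.

```python
def n_unrooted_topologies(n_tips: int) -> int:
    """Count unrooted, fully bifurcating topologies for n_tips labelled tips.

    Uses the standard double-factorial formula (2n - 5)!!.
    Hypothetical helper for illustration; not part of twisst.
    """
    if n_tips < 3:
        raise ValueError("need at least 3 tips")
    count = 1
    # (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5)
    for k in range(3, 2 * n_tips - 4, 2):
        count *= k
    return count


# Compare a few tip numbers, up to the 12 lineages mentioned above.
for n in (4, 5, 6, 12):
    print(n, n_unrooted_topologies(n))
```

With 4 tips there are 3 topologies and with 5 there are 15, but 12 tips already give several hundred million, which is why trimming each hypothesis test down to a handful of tips is attractive.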

simonhmartin commented 1 month ago

Hi Inbar, I'm sorry for missing your question in April. I was on leave and the alert got buried in other emails.

Thanks for the excellent illustrations. I think I prefer option A, because you can never be certain about what is happening in other parts of the tree that could affect the signal.

imaa9 commented 1 month ago

Hi Simon,

Thanks for your helpful reply. I'll go with option A and prune the tree to keep just those taxa relevant to the hypothesis being considered. Is it preferable to prune the tree by inferring new gene trees for each hypothesis test (I am using unlinked RAD loci as my "genes" to make gene trees) or by excluding irrelevant samples from the group assignment file?

Thanks a heap, Inbar
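For reference, the second route (excluding irrelevant samples from the group assignment file) could look roughly like the sketch below. It assumes the group assignment file is tab-delimited with the sample name in the first column and the group name in the second. The function name, sample names, and group names are all hypothetical. Whether twisst then ignores tips that are not listed in the file is something to confirm against its documentation.

```python
def subset_groups(lines, keep):
    """Return 'sample<TAB>group' lines whose group is in `keep`.

    Assumes a two-column, tab-delimited layout (sample, group).
    Hypothetical helper for illustration; not part of twisst.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        sample, group = line.split("\t")[:2]
        if group in keep:
            kept.append(f"{sample}\t{group}")
    return kept


# Made-up assignment for five samples in four groups.
example = ["s1\tA", "s2\tA", "s3\tB", "s4\tC", "out1\tOutgroup"]

# Keep only the groups relevant to one hypothesis (option A).
subset = subset_groups(example, keep={"A", "B", "Outgroup"})
print("\n".join(subset))
```

One filtered file per hypothesis keeps the original gene trees untouched, so the same tree set can be reused across tests.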