Closed by Suirotras 1 year ago
Dear @Suirotras,
If you are interested in a single branch, I would say you should keep that branch and its local neighborhood (nearest taxa) intact, and prune more distant clades.
It depends on taxonomic sampling, but one of the strongest predictors of aBSREL performance is branch length. If you look at the salient figure in the original manuscript, you will notice that the power to detect selection climbs as the branch length increases, up to a certain "saturation" point (~1 sub/site).
So when you prune the tree with a single branch in focus, be careful not to "merge" that branch with the ones you delete.
For example, suppose the focal branch of the "complete" tree looks like this (the data are from tests/data/yokoyama.rh1.cds.mod.1-990.nex in the HyPhy distribution).
Then a good way to subsample would be like this (a "bubble" here means: replace the clade with one representative sequence).
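As a rough illustration of this subsampling idea, here is a minimal Python sketch. It is not part of HyPhy; the nested-dict tree, the helper names (`node`, `leaves`, `path_to`, `taxa_to_keep`), the `radius` parameter, and the rule of taking the first leaf of a distant clade as its representative are all assumptions made for the example.

```python
# Hypothetical sketch: choose which taxa to keep around a focal terminal branch.
# A tree node is a plain dict; leaves have a name and no children.

def node(name=None, children=()):
    return {"name": name, "children": list(children)}

def leaves(t):
    """Return the leaf names under node t, left to right."""
    if not t["children"]:
        return [t["name"]]
    return [n for c in t["children"] for n in leaves(c)]

def path_to(t, name):
    """Return the list of nodes from t down to the leaf called name, or None."""
    if not t["children"]:
        return [t] if t["name"] == name else None
    for c in t["children"]:
        sub = path_to(c, name)
        if sub is not None:
            return [t] + sub
    return None

def taxa_to_keep(tree, focal, radius=2):
    """Keep the whole neighborhood `radius` nodes above the focal leaf,
    plus one representative for every more distant clade."""
    path = path_to(tree, focal)
    if path is None:
        raise ValueError(f"leaf {focal!r} not found")
    # Index of the ancestor that roots the local neighborhood (clamped at the root).
    idx = max(0, len(path) - 1 - radius)
    keep = set(leaves(path[idx]))  # neighborhood stays intact
    # Every clade hanging off the path above the neighborhood becomes a "bubble".
    for depth in range(idx):
        on_path = path[depth + 1]
        for child in path[depth]["children"]:
            if child is not on_path:
                keep.add(leaves(child)[0])  # arbitrary representative
    return keep

# Toy tree: (((focal,sister),cousin),(d1,d2)),(e1,(e2,e3))
TOY_TREE = node(children=[
    node(children=[
        node(children=[
            node(children=[node("focal"), node("sister")]),
            node("cousin"),
        ]),
        node(children=[node("d1"), node("d2")]),
    ]),
    node(children=[node("e1"), node(children=[node("e2"), node("e3")])]),
])

print(sorted(taxa_to_keep(TOY_TREE, "focal", radius=2)))
```

In practice the representative would probably be chosen more carefully (for example by sequence quality or branch length), and the resulting name set would be passed to whatever tree and alignment pruning tool you already use.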
You do NOT want to delete branches around the focal branch (like in the following picture), because you might subsume their evolutionary content into the "combined" branch, and thus conflate selection that might be happening up or down the tree.
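The merging effect can be seen on a toy example. The sketch below is only an illustration under stated assumptions: the tree, its branch lengths, and the helpers `node`, `prune`, and `leaf_length` are made up for this purpose, and `prune` collapses single-child nodes by summing branch lengths, which is how tree-pruning tools commonly behave.

```python
# Hypothetical sketch: pruning the focal branch's sister lengthens the focal branch.

def node(name=None, length=0.0, children=()):
    return {"name": name, "length": length, "children": list(children)}

def prune(tree, keep):
    """Return a pruned copy of tree containing only leaves named in keep.
    Internal nodes left with one child are collapsed and the lengths summed."""
    if not tree["children"]:
        return node(tree["name"], tree["length"]) if tree["name"] in keep else None
    kids = [k for k in (prune(c, keep) for c in tree["children"]) if k is not None]
    if not kids:
        return None
    if len(kids) == 1:
        only = kids[0]
        only["length"] += tree["length"]  # parent branch is merged into the child
        return only
    return node(tree["name"], tree["length"], kids)

def leaf_length(tree, name):
    """Return the branch length leading to the leaf called name, or None."""
    if not tree["children"]:
        return tree["length"] if tree["name"] == name else None
    for c in tree["children"]:
        found = leaf_length(c, name)
        if found is not None:
            return found
    return None

# Toy tree: (((focal:0.10,sister:0.05):0.20,cousin:0.15):0.30,(far1:0.40,far2:0.50):0.10)
TOY_TREE = node(children=[
    node(length=0.30, children=[
        node(length=0.20, children=[node("focal", 0.10), node("sister", 0.05)]),
        node("cousin", 0.15),
    ]),
    node(length=0.10, children=[node("far1", 0.40), node("far2", 0.50)]),
])

# Pruning a distant taxon leaves the focal branch untouched.
good = prune(TOY_TREE, {"focal", "sister", "cousin", "far1"})
# Pruning the sister merges the focal branch with its parent branch.
bad = prune(TOY_TREE, {"focal", "cousin", "far1", "far2"})

print("focal length, distant taxon pruned:", round(leaf_length(good, "focal"), 2))
print("focal length, sister pruned:", round(leaf_length(bad, "focal"), 2))
```

In the second case the "focal" branch in the pruned tree covers both the original focal branch and the internal branch above it, so any signal of selection on either one is attributed to the combined branch.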
All of the above applies (largely unchanged) to internal test branches, except that for those you want to prune the tree in a way that maintains the clade which radiates from the internal node.
Best, Sergei
Hi @spond,
This is very useful. Thanks for the detailed answer!
Best, Jari
Hi,
I have a question about the effect of decreasing tree size on the detection of positive selection on a single branch (aBSREL).
I have a large tree (and alignment) representing 241 mammalian species. Personally, I am only interested in (episodic) diversifying selection in a single branch of this tree. Because of the large computational load of running aBSREL with this huge tree, I was planning on decreasing the size of this tree and alignment to just 100 mammalian species. I am planning to primarily remove the shortest branches, while keeping the diversity of the tree as large as possible.
My question is, would this change have a large effect on the power to detect diversifying selection in my branch of interest? Or is this dependent on many different aspects of the sequences in the alignment?
Here is what I plan to do: The large unpruned tree (with 241 species):
The pruned tree (with 100 species):
Many thanks for your help and your work on HyPhy!
Sincerely, Jari