veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

How to select positive signals of the foreground branch from the results of the aBSREL model #1714

Open aaannaw opened 1 month ago

aaannaw commented 1 month ago

Hello, Professor,

I am running

hyphy aBSREL --alignment 00.input/evm.model.Chr10.103.cds.fa --tree 00.root.tree --branches FG --output 01.aBSREL/evm.model.Chr10.103.json > 01.aBSREL/evm.model.Chr10.103.log 2>&1

for orthologous genes of six species. This gives me one *.json output file per orthologous gene. I then screen for genes positively selected on the foreground branch using the Corrected P-value field under branch attributes, as described in https://hyphy.org/resources/json-fields.pdf. Also, positively selected genes should be required to have an omega value on the foreground branch that is larger than on the background branch. In addition, I am confused about the difference between the aBSREL model and the BUSTED model. Could you give me any suggestions? Thanks very much! Looking forward to your reply. Best wishes! Na Wan

spond commented 1 month ago

Dear @aaannaw,

The method you use depends on the question you are asking. If you want to label genes as "selected" vs. "not selected", the best method we have for this purpose is BUSTED. The main difference between BUSTED and aBSREL is that BUSTED combines all of your test (FG) branches to decide whether a gene is under selection, whereas aBSREL looks at branches one at a time. So BUSTED generally has more power, but it can't tell you which branches (could be all, could be some) contribute to the selection signal. aBSREL has more resolution, but it will sometimes fail to find selection in a gene because no individual branch is found to be under selection on its own.

Also, positively selected genes should be required to have an omega value on the foreground branch that is larger than on the background branch

This is neither necessary, nor advised. Point estimates of ω should not be used to draw conclusions about selection.

Best, Sergei
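
For readers who want to screen aBSREL output by p-value alone, as advised above, here is a minimal sketch. It assumes the per-branch results sit under "branch attributes" -> "0" -> branch name -> "Corrected P-value" in the JSON file; check that layout against your own output and the json-fields document linked in the question. The function names and the 0.05 threshold are illustrative choices, not part of HyPhy.

```python
import json


def significant_branches(result, alpha=0.05):
    """Return {branch name: corrected p-value} for branches with p <= alpha.

    `result` is a parsed aBSREL JSON document. The key layout used here
    ("branch attributes" -> "0" -> branch -> "Corrected P-value") is an
    assumption about the output format; verify it on a real file.
    """
    branches = result.get("branch attributes", {}).get("0", {})
    hits = {}
    for name, attrs in branches.items():
        p = attrs.get("Corrected P-value")
        # Branches that were not tested may have no p-value (or null); skip them.
        if p is not None and p <= alpha:
            hits[name] = p
    return hits


def screen_file(path, alpha=0.05):
    """Load one aBSREL JSON file and return its significant branches."""
    with open(path) as handle:
        return significant_branches(json.load(handle), alpha)
```

Note that the screen uses only the corrected p-value; it deliberately does not compare ω point estimates between foreground and background branches.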

aaannaw commented 1 month ago

Hello, Professor,

Thanks for your reply. When I took the intersection of the aBSREL results across the foreground branches, to get the genes commonly selected on all foreground branches, I found that very few positively selected genes are shared. Following what you said, should I think of the BUSTED result as the union of the positively selected genes from every foreground branch in the aBSREL model? And since you said "BUSTED can't tell you which branches (could be all, could be some)", should I trust the results of the BUSTED model as the commonly selected genes for all foreground branches?
Thanks again for your suggestions! Best wishes! Na Wan

spond commented 1 month ago

Dear @aaannaw,

Yes, if the goal is to answer the question

On the set of N foreground branches, which of my genes show evidence of episodic diversifying selection?

then BUSTED (with --branches FG and also probably --error-sink Yes, see issue #1710 for example) would be the recommended option.
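
As a rough sketch of how that recommendation could be scripted per gene, the helper below builds a BUSTED command line from the options named in this thread (--alignment, --tree, --branches, --error-sink, --output). The function names and the example file paths are hypothetical; the paths simply mirror the layout used in the question.

```python
import subprocess


def busted_command(alignment, tree, output, branches="FG", error_sink=True):
    """Build the argument list for one BUSTED run.

    Only options mentioned in this thread are used. `branches` is the label
    of the foreground set in the tree (FG in the question).
    """
    cmd = ["hyphy", "busted",
           "--alignment", str(alignment),
           "--tree", str(tree),
           "--branches", branches]
    if error_sink:
        cmd += ["--error-sink", "Yes"]
    cmd += ["--output", str(output)]
    return cmd


def run_busted(alignment, tree, output):
    """Run BUSTED for one gene and return the captured log text.

    Requires the `hyphy` executable on PATH; raises CalledProcessError
    if the run exits with a non-zero status.
    """
    done = subprocess.run(busted_command(alignment, tree, output),
                          capture_output=True, text=True, check=True)
    return done.stdout
```

For example, `busted_command("00.input/evm.model.Chr10.103.cds.fa", "00.root.tree", "02.BUSTED/evm.model.Chr10.103.json")` yields the list to pass to `subprocess.run`; the `02.BUSTED` directory name is made up for illustration.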

If you are testing a lot of genes, it is also a good idea to take the BUSTED p-values and apply a False Discovery Rate correction to them.

Best, Sergei
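
One common way to apply a False Discovery Rate correction across genes is the Benjamini-Hochberg procedure; the thread does not say which FDR method is intended, so treat that choice as an assumption. The sketch below also assumes the gene-level p-value is stored under "test results" -> "p-value" in each BUSTED JSON file, which should be checked against real output. All function names are illustrative.

```python
def busted_pvalue(result):
    """Extract the gene-level p-value from a parsed BUSTED JSON document.

    The "test results" -> "p-value" location is an assumption about the
    output format; verify it on one of your own files.
    """
    return result["test results"]["p-value"]


def benjamini_hochberg(pvalues):
    """Return {gene: BH-adjusted p-value} for a {gene: raw p-value} dict."""
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    total = len(ranked)
    adjusted = {}
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value can be
    # capped by the one above it, which keeps the results monotone.
    for rank in range(total, 0, -1):
        gene, p = ranked[rank - 1]
        running_min = min(running_min, p * total / rank)
        adjusted[gene] = running_min
    return adjusted


def select_genes(pvalues, fdr=0.05):
    """Return the sorted gene names whose adjusted p-value is <= fdr."""
    adjusted = benjamini_hochberg(pvalues)
    return sorted(gene for gene, q in adjusted.items() if q <= fdr)
```

In use, one would collect `busted_pvalue(...)` for every gene into a dictionary keyed by gene name and pass it to `select_genes`. The correction should be computed over all genes that were tested, not only those with small raw p-values.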