veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Tree for HYPHY #767

Closed DeepakSingla closed 6 years ago

DeepakSingla commented 6 years ago

I am using HyPhy to detect sites under positive selection. The web version of MEME reported no sites under positive selection, while the command-line version found some positively selected sites on the same dataset. I assume one reason is the input tree: the server computes the tree automatically, while the command-line version asks for a tree file, for which I am supplying a global tree generated with RAxML. My questions are:

  1. Should the input tree be specific to the particular input alignment, or should it be a global tree generated from all orthologs at once?
  2. Does HyPhy take the branch lengths of the tree into account? If so, which is better: the raw tree from RAxML or a molecular-clock-based MCMC tree?

It might be a very basic question, but it is important for me in drawing conclusions from the results. Any feedback is most welcome.

Thanks, Deepak

spond commented 6 years ago

Dear @DeepakSingla,

  1. If you are asking about "gene-tree" vs "species-tree", then the generally preferred approach is to use the gene-tree. You should be able to provide a tree to Datamonkey by including it in your alignment.

  2. Branch lengths are always re-estimated by HyPhy and Datamonkey so it doesn't really matter what you input.

Best, Sergei
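Because branch lengths are re-estimated, what effectively differs between two input trees is their topology. The following is a minimal sketch, not part of HyPhy or Datamonkey, of how one might strip branch lengths from two Newick strings to check whether they describe the same topology. The helper name `strip_branch_lengths` and the example trees are hypothetical, and the sketch assumes simple Newick strings without quoted labels.

```python
import re

# Matches a Newick branch length such as ":0.0123", ":5" or ":1e-05".
_BRANCH_LENGTH = re.compile(r":[0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?")


def strip_branch_lengths(newick: str) -> str:
    """Return the Newick string with branch lengths removed (topology only)."""
    return _BRANCH_LENGTH.sub("", newick)


# Hypothetical trees: same topology, different branch lengths.
raxml_tree = "((A:0.1,B:0.2):0.05,(C:0.3,D:0.4):0.07);"
clock_tree = "((A:1.5,B:1.5):2.0,(C:2.5,D:2.5):1.0);"

same_topology = strip_branch_lengths(raxml_tree) == strip_branch_lengths(clock_tree)
print(same_topology)
```

String equality only detects identical topologies when both trees list their taxa in the same order; trees written with a different child order would need a proper tree comparison.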

DeepakSingla commented 6 years ago

Thanks for the response. I have analyzed my data with both gene trees and the species tree. I have a set of 1820 orthologous genes, on which I ran HyPhy with the BUSTED method using the species tree and the gene trees.

  1. The gene trees give 976 positively selected genes at a p-value threshold of 0.05.
  2. The species tree gives 977 positively selected genes at a p-value threshold of 0.05.

Overall, 529 genes are found to be under positive selection by BUSTED with both the gene trees and the species tree. So about 447 genes from each analysis (976 - 529 = 447 for the gene trees, 977 - 529 = 448 for the species tree) differ in significance between the two tree topologies. I would like to know how to interpret these results: which should be treated as significant and which as non-significant?
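The overlap described above can be checked with simple set arithmetic. This is a minimal sketch: the function name `compare_calls` and the toy gene IDs are hypothetical, and only the counts 976, 977 and 529 come from the thread.

```python
def compare_calls(gene_tree_hits: set, species_tree_hits: set) -> dict:
    """Count genes called significant under both trees or under only one."""
    return {
        "both": len(gene_tree_hits & species_tree_hits),
        "gene_tree_only": len(gene_tree_hits - species_tree_hits),
        "species_tree_only": len(species_tree_hits - gene_tree_hits),
    }


# Toy example with hypothetical gene IDs.
toy = compare_calls({"g1", "g2", "g3"}, {"g2", "g3", "g4", "g5"})
print(toy)

# Counts reported in the thread.
gene_tree_only = 976 - 529      # significant with the gene trees only
species_tree_only = 977 - 529   # significant with the species tree only
discordant = gene_tree_only + species_tree_only
print(gene_tree_only, species_tree_only, discordant)
```

Under these counts, the number of genes whose call changes with the tree is the sum of the two "only" groups, not just one of them.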

spond commented 6 years ago

Dear @DeepakSingla,

There are a number of ways this can be further investigated.

  1. How much does the p-value change? It's one thing if you go from 0.04 to 0.06, and another if you go from 0.001 to 1.

  2. Does the tree topology change a lot between species and gene trees on the genes where you have discordant results?

If the answer to 2 is "Yes", then the result is not too surprising.

Best, Sergei
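The first check suggested above, how far the p-value moves between the two analyses, could be triaged along these lines. This is a minimal sketch under stated assumptions: the function name `classify_shift` and the `margin` of 0.02 around the threshold are hypothetical choices for illustration, not values recommended in this thread or by HyPhy.

```python
def classify_shift(p_gene: float, p_species: float,
                   alpha: float = 0.05, margin: float = 0.02) -> str:
    """Label how a gene's p-value differs between gene-tree and species-tree runs."""
    sig_gene = p_gene <= alpha
    sig_species = p_species <= alpha
    if sig_gene == sig_species:
        # Same call under both trees.
        return "concordant"
    if abs(p_gene - alpha) <= margin and abs(p_species - alpha) <= margin:
        # Both p-values sit close to the threshold (assumed margin).
        return "borderline"
    return "large shift"


print(classify_shift(0.04, 0.06))   # the first case mentioned above
print(classify_shift(0.001, 1.0))   # the second case mentioned above
```

Genes labelled "large shift" would be the natural ones to inspect for differences between the gene-tree and species-tree topologies.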

DeepakSingla commented 6 years ago

Thanks for the important information. Yes, the situation is like no. 2. In that case, should I rely on the results from the gene trees or from the species tree?

Also, for identifying positively selected sites with MEME, is it a good idea to run it on all orthologs, or is it better to run it only on the genes found to be under positive selection?

DeepakSingla commented 6 years ago

Dear Prof. Sergei, I would be thankful if you could comment on my previous question.

Thanks