phyloacc / PhyloAcc

PhyloAcc: software to detect changes in the conservation of a genomic region
GNU General Public License v3.0

Can I make PhyloAcc work faster? #23

Closed Yangcao-fz closed 1 year ago

Yangcao-fz commented 3 years ago

Dear author, I followed the instructions and installed PhyloAcc. Before using it, I found that I have a large number of CNEs to analyse. To reduce the time spent waiting for output, I wondered whether it is possible to change some parameters to make PhyloAcc run faster, for example BURNIN, MCMC, NUM_THREAD and CHAIN. Which of these parameters has the most influence on PhyloAcc's efficiency? And if the analysis can be sped up, how does the quality of the result compare with that of a run at normal speed?

xyz111131 commented 3 years ago

Hi,

Thanks for using PhyloAcc. The number of chains can be set to 1. The shorter MCMC and BURNIN are, the faster it will run. As for NUM_THREAD, if you input multiple elements, PhyloAcc will process the elements in parallel.

If you have many CNEs, I would recommend running PhyloAcc on a computer cluster, with a few CNEs in each run of PhyloAcc.
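
As a minimal sketch of the batching idea above (not part of PhyloAcc itself), the following Python helper splits a list of element IDs into small batches, one per PhyloAcc run. The function name `split_into_batches` and the use of integer IDs are assumptions for illustration; how each batch is handed to PhyloAcc depends on your setup and is not shown.

```python
def split_into_batches(element_ids, batch_size):
    """Split element IDs into consecutive batches of at most batch_size items.

    Hypothetical helper: each returned batch would correspond to one
    separate PhyloAcc run (e.g. one cluster job).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        element_ids[start:start + batch_size]
        for start in range(0, len(element_ids), batch_size)
    ]


if __name__ == "__main__":
    # Ten example element IDs split into batches of four.
    for i, batch in enumerate(split_into_batches(list(range(10)), 4)):
        print(f"batch {i}: {batch}")
```

Smaller batches give more, shorter jobs; the right size depends on how many jobs your cluster scheduler handles comfortably.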

Yangcao-fz commented 3 years ago

Thanks. I have decided to follow your suggestion: I will separate my CNEs into several parts and set lower MCMC and BURNIN values. I have another question: could you comment on the relationship between the speed of the analysis and the quality of the result?

yuzhenpeng commented 3 years ago

By the way, I wonder whether lower MCMC and BURNIN values will affect the result.

Thank you.

xyz111131 commented 3 years ago

It depends on the problem. In general, PhyloAcc doesn't need a long MCMC to infer the acceleration pattern, Z; it needs to run longer to get the marginal likelihood (or Bayes factors). To test whether the MCMC has converged, you could set "VERBOSE: 1" for a few elements, and PhyloAcc will output the trace of the MCMC.
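
One crude way to look at such a trace, sketched here in Python, is to compare the mean of the first half of the post-burn-in samples with the mean of the second half. This is a simple split-half heuristic, not a formal convergence diagnostic and not something PhyloAcc provides. The function name `split_half_shift` is hypothetical, and the trace is assumed to have already been read into a list of numbers; the layout of PhyloAcc's trace output is not assumed here.

```python
from statistics import mean, stdev


def split_half_shift(trace, burnin=0):
    """Return |mean(first half) - mean(second half)| / overall standard deviation.

    Values near 0 suggest the two halves agree; large values suggest the
    chain was still drifting. MCMC samples are autocorrelated, so treat
    this only as a rough warning sign.
    """
    kept = trace[burnin:]
    if len(kept) < 4:
        raise ValueError("need at least 4 samples after burn-in")
    half = len(kept) // 2
    first, second = kept[:half], kept[half:]
    spread = stdev(kept)
    if spread == 0:
        # A constant trace has no drift to measure.
        return 0.0
    return abs(mean(first) - mean(second)) / spread


if __name__ == "__main__":
    drifting = [float(i) for i in range(100)]  # steadily increasing values
    stable = [0.0, 1.0] * 50                   # oscillates around 0.5
    print(split_half_shift(drifting))
    print(split_half_shift(stable))
```

A large value for a parameter of interest would be a reason to lengthen BURNIN or MCMC for that element before trusting the Bayes factors.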

Yangcao-fz commented 3 years ago

OK, thanks. Another problem: I am using part of the data to test the R scripts. After loading everything that plotZPost needs into the R environment and running "plotZPost(Z, treeData, target_species=targets, tit=tit, offset=5,cex.score = 2)", only part of the image came out, with just the title "logBF1:27 logBF2:16 r1=0.24 r2=3.8". After running "plotAlign(k, align, bed, treeData, target_species=targets)", it showed the alignment without the species names on the left. "plotZPost_all(treeData, topZ, targets)" showed only the icon that is supposed to appear in the top right corner, without the phylogenetic tree. Could you give me some suggestions? What are the possible reasons for this?

xyz111131 commented 3 years ago

Sorry about that. What are your inputs to "prepare_data"? Probably your file for "common_name" doesn't match the tip labels. If that is the case, you can set "common_name = NULL".
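
A quick way to test for this kind of mismatch, sketched in Python under the assumption that the species names from the "common_name" file and the tree's tip labels have already been loaded into two lists (file parsing is not shown, and the function name `find_label_mismatches` is hypothetical):

```python
def find_label_mismatches(name_file_species, tip_labels):
    """Compare species names from a name file against the tree's tip labels.

    Returns two sorted lists: names present only in the name file, and
    tip labels that have no entry in the name file.
    """
    names = set(name_file_species)
    tips = set(tip_labels)
    only_in_names = sorted(names - tips)
    only_in_tips = sorted(tips - names)
    return only_in_names, only_in_tips


if __name__ == "__main__":
    # Made-up labels for illustration only.
    names = ["hg38", "mm10", "canFam3 "]  # note the trailing space
    tips = ["hg38", "mm10", "canFam3"]
    extra, missing = find_label_mismatches(names, tips)
    print("only in name file:", extra)
    print("only in tree:", missing)
```

Stray whitespace and differences in letter case are common causes of such mismatches, which is why the comparison here is exact.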

Yangcao-fz commented 3 years ago

Thank you. I followed your suggestion and it worked. Because I am focusing on mammals, I would like to see the param*.txt files you used when running PhyloAcc. Could you please upload these files?

yuzhenpeng commented 3 years ago

Hi Zhirui,

I got a lot of accelerated CNEEs in my data, but I have a question. In my results, I found that some elements are accelerated not only in the target species but also in background species, and I don't know why. Do you know how to filter these out? I just want to obtain elements with accelerated evolution in the target species.

Thank you.

xyz111131 commented 3 years ago

Hi, I uploaded a parameter file and the phylogenetic tree file for mammals. You can find these files at https://github.com/xyz111131/PhyloAcc/tree/master/mammal_result. I hope it's helpful!

xyz111131 commented 3 years ago

Hi yuzhen,

logBF2 is designed to exclude elements that are accelerated in background species; for such elements, logBF2 is small. One could increase the threshold on logBF2, e.g. filter the elements by logBF2 > 5. Filtering elements directly by the posterior of Z is also an option.
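
The threshold filter can be sketched in Python as follows. The record layout (dictionaries with "id" and "logBF2" keys) and the function name `filter_by_logbf2` are assumptions for illustration; the actual column names in your PhyloAcc output may differ, and the numbers below are made up.

```python
def filter_by_logbf2(records, threshold=5.0):
    """Keep only elements whose logBF2 is strictly greater than the threshold.

    Each record is assumed to be a dict with a numeric "logBF2" value.
    """
    return [rec for rec in records if rec["logBF2"] > threshold]


if __name__ == "__main__":
    # Made-up scores for three hypothetical elements.
    scores = [
        {"id": "elem1", "logBF2": 16.0},
        {"id": "elem2", "logBF2": 1.2},
        {"id": "elem3", "logBF2": 5.0},
    ]
    kept = filter_by_logbf2(scores, threshold=5.0)
    print([rec["id"] for rec in kept])
```

The comparison is strict, so an element with logBF2 exactly equal to the threshold is dropped; raise the threshold to be more conservative about background acceleration.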

Thanks!

yuzhenpeng commented 3 years ago

Thank you, zhirui. You are right.

gwct commented 1 year ago

PhyloAcc now facilitates batching of loci and parallel job submission to clusters via Snakemake. See the new README for more information.