Closed schraderL closed 7 years ago
Dear @schraderL,
The easiest way to obtain these values is by running a post-processing script on .fit
files output by aBSREL (or BUSTED); I can supply those once you give me the definition of what dS and dN mean:)
Best, Sergei
Hi Sergei,
thanks for the reply! The definition from the handbook sounds good, though I wasn't really aware that there are different definitions for calculating dN and dS. I have written a post-processing script to filter the values from the .mglocal.fit files, but if I should rather use the .fit file, I'd be happy to know. In general, I would like to use the dN and dS values which are used to calculate the mean dN/dS for each branch in absREL.
Re: question 2: Thanks for that as well! It's good to know that I don't have to run model selections.
Best, Lukas
Dear @schraderL,
The mean dN/dS can be estimated in two different ways
In neither case does aBSREL actually estimate dS and dN separately. Those may be derived from the MLE values of corresponding rate matrices. If you indicate which values you are interested in, I'll post a simple HBL script to post-process the appropriate file. Also: do you want the output as Newick with embedded values (one for dS and one for dN), or as a CSV file with three columns: node, dS, dN
.
Best, Sergei
I would be curious to try both ways of estimating dN/dS, just so I can try to understand the differences. Apart from that, a CSV with node, dS and dN would be best.
Could you try to explain, why using the dN and dS values from the .mcglocal.fit files isn't appropriate?
After all, these values appear to yield the same dN/dS. For example, in a run of absREL, the dN and dS estimates in .mcglocal.fit are
dS=0.09418547299874075
dN=0.02019853991223301
based on that, the dN/dS for that branch is
dN/dSa=0.2144549
and the mean dN/dS reported in the final output file for the same branch is
dN/dS=0.2144549394841714
So this looks pretty good. I haven't checked all the branches for all the genes, so I don't know if there are instances where the results aren't consitent.
Also, some few branches in some genes show dS values of up to 9999.999 (same for some dN rates). My guess is these values are a product of and necessity for the modelling, because based on my understanding of dS it doesn't make sense to have values larger than 1.
Thanks, Lukas
Dear @schraderL,
Sorry, I was incorrect in my previous post. The MG local model does estimate the synonymous rate, α, and the non-synonymous rate, β separately for each branch, so their ratio will indeed give you dN/dS.
The HBL script for post-processing aBSREL.fit
files is attached.
Best, Sergei
Hi Sergei,
thanks for the script!
best, lukas
Dear @schraderL,
Sorry, I was incorrect in my previous post. The MG local model does estimate the synonymous rate, α, and the non-synonymous rate, β separately for each branch, so their ratio will indeed give you dN/dS.
The HBL script for post-processing
aBSREL.fit
files is attached.Best, Sergei
Dear Dr. Pond, This is Xiaojuan Li,I have calculate dn and ds using FitMG94, which reported trees scaled on expected number of synonymous and non synonymous substitutions per codon. But I also need trees scaled on ds and dn, and dn/ds,thus could you please help me how to extract dS, dN and dN/dS. Thank you for your consideration. Sincerely Xiaojuan Li
Hi and thanks for the great support! I am using Hyphy to test for signatures of positive selection across ~10 eukaryots in several thousand protein-coding genes. My pipeline is pretty straight-forward. I infer orthologs by a RBH-approach, align the sequences, clean the alignments and run absREL using the species tree. In a second run, I would like to run BUSTED and select a single branch as foreground. I now have 2 questions, one specific and one general:
1) I would like to look at dN and dS rates independently, but absREL only reports dN/dS . I am sure reporting dN and dS values is possible by editing the BranchSiteREL.bf using HBL. Unfortunately, I don't know how to do this and am hoping for some advice. I would eventually want to do the same in BUSTED. 2) Should I still run a nucleotide model selection step for each gene and test for recombination before running absREL? In particular the recombination is giving me some headache, because I am using the species tree and not the gene tree for the absREL tests.