Open ekopania opened 2 years ago
Dear Emily,
Before delving into this issue further, did you happen to come across this issue thread yet?
https://github.com/veg/hyphy/issues/1273#issuecomment-767181739
With release 2.5.27 and an update to FitMG94.bf, you will now see dS and dN estimates in the output JSON.
Best, Steven
Dear Emily,
You can get these quantities from the output of SLAC. The resulting JSON will contain something like what follows, so you can pull out `ES` and `S` (columns 0 and 2) from the `json["MLE"]["content"]["0"]["by-branch"]["AVERAGED"][xx]` table, where `xx` is the index for the branch (use the `NAMES` entry to map indices to names).
Best, Sergei
"MLE":{
"headers": [
["ES", "Expected synonymous sites"],
["EN", "Expected non-synonymous sites"],
["S", "Inferred synonymous substitutions"],
["N", "Inferred non-synonymous substitutions"],
["P[S]", "Expected proportion of synonymous sites"],
["dS", "Inferred synonymous susbsitution rate"],
["dN", "Inferred non-synonymous susbsitution rate"],
["dN-dS", "Scaled by the length of the tested branches"],
["P [dN/dS > 1]", "Binomial probability that S is no greater than the observed value, with P<sub>s</sub> probability of success"],
["P [dN/dS < 1]", "Binomial probability that S is no less than the observed value, with P<sub>s</sub> probability of success"],
["Total branch length", "The total length of branches contributing to inference at this site, and used to scale dN-dS"]
],
"content":{
"0":{
"by-branch":{
"AVERAGED": [
[131.3168044428377, 387.3226301460821, 18.83333333333334, 62.16666666666666, 0.2531947933093847, 0.1434190651626122, 0.1605035746122553, 0.00992078191838351, 0.3899963120436746, 0.7047685726152259, 0.1902540471972952],
[129.561236545261, 390.2822794214606, 20.16666666666666, 84.83333333333334, 0.2492312254858537, 0.1556535519759541, 0.2173640408657216, 0.03583458478317615, 0.1058743790554714, 0.9314784716730345, 0.2522218225267418],
[133.5384417395946, 389.1083669125233, 9, 31, 0.2555041751502972, 0.06739632335646366, 0.07966932257452385, 0.007126792202360227, 0.4081337126529876, 0.7271272709575386, 0.1004391926327223],
[136.2754101961045, 388.2927197040365, 20.66666666666666, 73.33333333333334, 0.2597859123123749, 0.15165367425368, 0.1888609536362909, 0.02160584742683141, 0.2259173468355006, 0.8400143616880725, 0.212636878745021],
[133.7972276866482, 389.2600981649001, 23, 92, 0.2557984011959369, 0.1719019175334916, 0.2363458274652815, 0.03742185154836802, 0.1008314010665072, 0.9335295689162557, 0.2703513223995095],
[140.2046156142308, 396.7823526713004, 7.5, 18.5, 0.2610950058282979, 0.0534932460471633, 0.04662505747912039, -0.00398827962598081, 0.7092369108736087, 0.4501563363877576, 0.0672975934705809],
[134.751454631459, 392.5139932178286, 0, 3, 0.2555666319139805, 0, 0.007643039616004536, 0.004438226888951172, 0.4125508577792068, 1, 0.003838196760252208],
[136.0881867826784, 397.163888281234, 1, 0, 0.2552042329443682, 0.007348176382105211, 0, -0.004267003135182485, 1, 0.2552042329443682, 0.001708013303307051],
[136.5164955204506, 396.8393878577912, 6, 8, 0.2559576068717268, 0.04395073267245702, 0.02015928923584276, -0.01381542282814194, 0.9568582632702024, 0.1224548551448086, 0.02610813928399086],
[0, 0, 0, 0, null, null, null, null, 1, 1, 0],
[136.570310137287, 396.6299533706336, 0, 1, 0.2561332382673481, 0, 0.002521241755701549, 0.001464056647079198, 0.743866761732652, 1, 0.001848757843820416],
[136.9113838166697, 396.4145310157983, 4.5, 4.5, 0.2567124154461485, 0.03286797543457522, 0.01135175339932395, -0.01249422742563463, 0.9670961780216861, 0.1164171049315259, 0.01814302496171654],
[138.8302359271815, 397.7942542778725, 9, 39, 0.2587102125624791, 0.0648273766870254, 0.09804063176024963, 0.01928656256431083, 0.1684519685777126, 0.9055793519965954, 0.107969146861863],
[143.7819090576958, 396.4599645760192, 42.50000000000001, 70.5, 0.2661435850771913, 0.2955865607748044, 0.1778237559885621, -0.06838353233524548, 0.9959931935958849, 0.007077414287111505, 0.2814556919165797],
[145.8516833705621, 394.8039017195687, 10, 23, 0.2697681988178253, 0.06856280139457305, 0.05825676975284054, -0.005984596318843896, 0.7403138091754105, 0.3964379183315833, 0.06779898779960684],
[145.2407394759065, 395.7881415173817, 21, 35, 0.2684528397250354, 0.1445875315409257, 0.08843114870954999, -0.03260937804725617, 0.9712055026164764, 0.05300488459460351, 0.1200222199260738]
],
"NAMES": [
["PIG"],
["COW"],
["Node3"],
["HORSE"],
["CAT"],
["Node2"],
["RHMONKEY"],
["BABOON"],
["Node9"],
["HUMAN"],
["CHIMP"],
["Node12"],
["Node8"],
["Node1"],
["RAT"],
["MOUSE"]
],
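A minimal sketch of the extraction described above, assuming the SLAC JSON has already been parsed into a Python dict and that `NAMES` sits next to `AVERAGED` under `by-branch`, as the excerpt suggests. The function name `slac_branch_counts` and the trimmed `example` dict are illustrative and not part of HyPhy. Column positions are looked up from `headers` instead of being hard-coded, so `EN` and `N` come out alongside `ES` and `S`.

```python
import json


def slac_branch_counts(slac, partition="0", table="AVERAGED"):
    """Return {branch: {"ES": ..., "EN": ..., "S": ..., "N": ...}} from a parsed SLAC JSON.

    Assumes the layout shown in the excerpt: MLE -> headers, and
    MLE -> content -> <partition> -> by-branch -> {<table>, NAMES}.
    """
    mle = slac["MLE"]
    # Map short header names ("ES", "S", ...) to their column index.
    column = {header[0]: i for i, header in enumerate(mle["headers"])}
    by_branch = mle["content"][partition]["by-branch"]

    counts = {}
    # Each NAMES entry is a one-element list such as ["PIG"].
    for name_entry, row in zip(by_branch["NAMES"], by_branch[table]):
        counts[name_entry[0]] = {
            key: row[column[key]] for key in ("ES", "EN", "S", "N")
        }
    return counts


# Illustrative input: the first two branches from the excerpt,
# trimmed to the first four columns to keep the example short.
example = {
    "MLE": {
        "headers": [
            ["ES", "Expected synonymous sites"],
            ["EN", "Expected non-synonymous sites"],
            ["S", "Inferred synonymous substitutions"],
            ["N", "Inferred non-synonymous substitutions"],
        ],
        "content": {
            "0": {
                "by-branch": {
                    "AVERAGED": [
                        [131.3168044428377, 387.3226301460821,
                         18.83333333333334, 62.16666666666666],
                        [129.561236545261, 390.2822794214606,
                         20.16666666666666, 84.83333333333334],
                    ],
                    "NAMES": [["PIG"], ["COW"]],
                }
            }
        },
    }
}

print(json.dumps(slac_branch_counts(example), indent=2))
```

For a real run, the dict would come from `json.load` on the SLAC output file for each alignment.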
Yes, I did. I am interested in getting the raw numbers that went into calculating dS and dN, that is, the total number of sites and the number of substitutions (both synonymous and nonsynonymous). Thank you!
Dear Steven and Sergei, Thank you for the quick and helpful responses! I will try running SLAC. Emily
Hello, I have a question regarding the output for the FitMG94 model with separate dN/dS calculations for each branch (--type local). Would it be possible to output the raw counts for # synonymous sites, # synonymous substitutions, # nonsynonymous sites, and # nonsynonymous substitutions for each alignment and branch?
I am trying to calculate an average dS for each branch across many gene alignments, and would like to do so by calculating (# synonymous substitutions across all alignments) / (# synonymous sites across all alignments) for each branch. Same for dN and dN/dS.
Thank you! Emily
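The pooled calculation described in this post can be sketched as follows. This is only an illustration under stated assumptions: the per-alignment input format (a dict mapping branch name to `ES`, `EN`, `S`, `N` counts), the function name `pooled_rates`, and the toy numbers are all hypothetical, and the sketch assumes a given branch name refers to the same branch in every alignment, which may not hold for internal node labels across different gene trees.

```python
from collections import defaultdict


def pooled_rates(per_alignment_counts):
    """Pool site and substitution counts per branch across alignments.

    per_alignment_counts: iterable of {branch: {"ES", "EN", "S", "N"}} dicts,
    one per alignment (hypothetical format chosen for this sketch).
    Returns {branch: {"dS", "dN", "dN/dS"}}, with None where a ratio is undefined.
    """
    totals = defaultdict(lambda: {"ES": 0.0, "EN": 0.0, "S": 0.0, "N": 0.0})
    for counts in per_alignment_counts:
        for branch, row in counts.items():
            for key in ("ES", "EN", "S", "N"):
                totals[branch][key] += row[key]

    rates = {}
    for branch, t in totals.items():
        # dS = total synonymous substitutions / total synonymous sites
        dS = t["S"] / t["ES"] if t["ES"] > 0 else None
        # dN = total nonsynonymous substitutions / total nonsynonymous sites
        dN = t["N"] / t["EN"] if t["EN"] > 0 else None
        # dN/dS is undefined when either rate is missing or dS is zero.
        if dS is not None and dN is not None and dS > 0:
            omega = dN / dS
        else:
            omega = None
        rates[branch] = {"dS": dS, "dN": dN, "dN/dS": omega}
    return rates


# Toy counts for two alignments (made-up numbers, for illustration only).
gene_a = {"PIG": {"ES": 10.0, "EN": 30.0, "S": 2.0, "N": 3.0},
          "HUMAN": {"ES": 0.0, "EN": 0.0, "S": 0.0, "N": 0.0}}
gene_b = {"PIG": {"ES": 20.0, "EN": 60.0, "S": 4.0, "N": 6.0}}

for branch, values in pooled_rates([gene_a, gene_b]).items():
    print(branch, values)
```

Summing counts before dividing weights each alignment by its number of sites, which differs from averaging the per-alignment dS values directly.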