Closed jthuff closed 6 years ago
Hi Jason,
for me it works. Here is my command line and output (first 10 lines):
build/decode -t 10 -m 0.001 -r 0.001 --nrThreads 8 input_file.txt | head
generating propagation core
generating Hidden Markov Model
#time_boundaries: 0 0.0175601 0.0371906 0.0594458 0.0851376 0.115525 0.152715 0.200662 0.26824 0.383764
1845 0.923997 0.0449414 0.0126274 0.00682254 0.00432511 0.00290508 0.00197339 0.00130416 0.000787414 0.00031665
2845 0.932081 0.0442336 0.011268 0.00546186 0.00307113 0.00180618 0.00105958 0.000597521 0.000307746 0.000112964
3845 0.938086 0.0432216 0.00994658 0.00432126 0.00216526 0.00113318 0.000596643 0.000310684 0.000156939 6.20353e-05
4845 0.942853 0.042056 0.00873138 0.0034061 0.00153475 0.000731541 0.000361938 0.000186368 9.79706e-05 4.09651e-05
5845 0.946801 0.0408116 0.00764213 0.0026866 0.00110221 0.000491958 0.000238837 0.000126386 6.9338e-05 2.98745e-05
6845 0.950168 0.0395306 0.00667954 0.00212696 0.000806742 0.00034714 0.000170744 9.38377e-05 5.30447e-05 2.32379e-05
7845 0.953102 0.0382388 0.00583623 0.00169414 0.000604505 0.000257586 0.000130593 7.42143e-05 4.27462e-05 1.89217e-05
8845 0.955702 0.0369525 0.00510158 0.0013603 0.000465196 0.000200519 0.000105294 6.135e-05 3.57562e-05 1.59407e-05
9845 0.958034 0.0356826 0.00446404 0.001103 0.00036827 0.000162855 8.83304e-05 5.2368e-05 3.07607e-05 1.37851e-05
And as you can see, the leftmost state is the one for the most recent time interval. Stephan
Hi Stephan, Thanks for clarifying the columns. The version I had leaves out the header row (?). I named my input to run with the same command. I don't recall any compilation errors, so maybe it was just an older version:
$ build/decode -t 10 -m 0.001 -r 0.001 --nrThreads 8 input_file.txt | head
generating propagation core
generating Hidden Markov Model
0.0839459 0.275551 0.224518 0.136045 0.0882558 0.0654098 0.0507695 0.0381999 0.0258882 0.0114167
0.0937437 0.336102 0.275062 0.146785 0.0711956 0.0369691 0.0206386 0.0114949 0.00590201 0.0021062
0.096862 0.357485 0.293336 0.148008 0.0607291 0.0240603 0.0104742 0.00520427 0.00273743 0.00110443
0.085065 0.33156 0.284737 0.156447 0.0745276 0.0353737 0.0173716 0.00878245 0.00443559 0.00170066
0.0411786 0.227055 0.227586 0.16078 0.114953 0.0865855 0.0630713 0.0418735 0.0250869 0.0118297
0.020529 0.117677 0.132152 0.118055 0.116024 0.121507 0.123938 0.116359 0.0928101 0.0409486
2.68557e-05 0.000656793 0.00418891 0.0143457 0.0354282 0.0709161 0.121868 0.188707 0.268028 0.295835
0.00643857 0.0435094 0.0923564 0.137901 0.170269 0.181573 0.165589 0.122427 0.0644121 0.0155252
0.000213611 0.00468051 0.0173778 0.0375937 0.0621115 0.0874507 0.112869 0.146009 0.210253 0.321441
7.91957e-06 0.000159891 0.000751299 0.00231555 0.00611549 0.0155559 0.039697 0.101496 0.25711 0.576791
...because I cloned just now from github and recompiled build/decode, and it works perfectly! Thanks
$ build/decode -t 10 -m 0.001 -r 0.001 --nrThreads 8 input_file.txt | head
generating propagation core
generating Hidden Markov Model
#time_boundaries: 0 0.105361 0.223144 0.356675 0.510826 0.693147 0.916291 1.20397 1.60944 2.30259
178 0.0839459 0.275551 0.224518 0.136045 0.0882558 0.0654098 0.0507695 0.0381999 0.0258882 0.0114167
1178 0.0937437 0.336102 0.275062 0.146785 0.0711956 0.0369691 0.0206386 0.0114949 0.00590201 0.0021062
2178 0.096862 0.357485 0.293336 0.148008 0.0607291 0.0240603 0.0104742 0.00520427 0.00273743 0.00110443
3178 0.085065 0.33156 0.284737 0.156447 0.0745276 0.0353737 0.0173716 0.00878245 0.00443559 0.00170066
4178 0.0411786 0.227055 0.227586 0.16078 0.114953 0.0865855 0.0630713 0.0418735 0.0250869 0.0118297
5178 0.020529 0.117677 0.132152 0.118055 0.116024 0.121507 0.123938 0.116359 0.0928101 0.0409486
6178 2.68557e-05 0.000656793 0.00418891 0.0143457 0.0354282 0.0709161 0.121868 0.188707 0.268028 0.295835
7178 0.00643857 0.0435094 0.0923564 0.137901 0.170269 0.181573 0.165589 0.122427 0.0644121 0.0155252
8178 0.000213611 0.00468051 0.0173778 0.0375937 0.0621115 0.0874507 0.112869 0.146009 0.210253 0.321441
And one more quick question about the output. I see now there's also row numbers, which is great, because I'd like to find the precise intervals that overlap a set of genomic elements. Do these row numbers indicate that the chromosome intervals are as follows? 1-178 179-1178 1179-2178 2179-3178 3179-4178 4179-5178 5179-6178 6179-7178 7179-8178
Thanks again
Strictly, these are not actually windows across the chromosome, but estimates of the tMRCA at a particular position, indicated by the first column and spread every 1000 bp starting with the first segregating site. In practice what you are suggesting above would be OK, however, because the tMRCA is not expected to change radically from one position to the next. I would probably take one estimate to be the same for the entire window coming before or after it, as you suggest.
OK, I understand better now I think. I'll try a couple different ways to overlap "windows" for getting the subset of tMRCA values to make sure that the answer is robust to slight differences in the procedure. As you point out, the estimated tMRCA shouldn't change radically over short chromosome distances in most cases, so a variety of procedures will hopefully give similar results.
Best, Jason
Hello,
Thanks Stephan for making such useful software.
I ran the decode program for tMRCAs of each chromosomes segment for data from an unphased diploid.
My first issue is about the time intervals: I noticed that the output matrix seems to be just the genomic intervals (rows) by time intervals (columns). The guide mentions the first row is the header with time boundaries, but that doesn't appear to be the case looking at the output. Please correct me if I'm wrong. Assuming the time boundaries are not there, do you know where I should get them? Are they the same intervals constructed in the msmc (psmc') run if use the same exact -m and -r values in decode as those reported by msmc?
The second issue is that I just want to verify which time interval is the most recent -- it's the leftmost, correct?
Thanks, Jason