gchen98 / macs

Automatically exported from code.google.com/p/macs
16 stars 6 forks source link

fix Newick number of bases under gene conversion #38

Closed jackkamm closed 6 years ago

jackkamm commented 6 years ago

This fixes a bug in the number of bases printed per Newick tree when there is gene conversion.

To illustrate the bug, do ./macs 100 100000000 -r 1e-3 -s 1515413876 -T -t 1e-3 -c 10 100 > test.txt, (this might take awhile but you can interrupt after a couple seconds). Then on line 11 of the output file you will see the following tree:

NEWICK_TREE:    [4294967294]((52:0.00793543,(3:0.00134986,...

a tree extending for 4e9 bases, longer than the simulated region itself. In actuality the segment length iSeqLength is negative here but it gets cast to a uint and hence becomes a very large integer (this is all happening in algorithm.cpp around line 1450).

The problem is that iSegLength does not take into account the call to checkPendingGeneConversions() in the previous segment which closes the gene conversion and backtracks curPos.

This PR fixes the problem by moving the Newick printing after the call to checkPendingGeneConversions() so that iSegLength will use the correct value of curPos.

Tagging @ea409 who alerted me to this bug.

gchen98 commented 6 years ago

Thanks so much Jack! I'm glad others can finally contribute to this project. It's been too long since I looked at this code!