Closed swamidass closed 6 years ago
Off the top of my head: are you sure your input file correctly reflects the heterozygosity in your samples? You seem to have very sparsely distributed segregating sites, and in particular almost all sites in between them are called homozygous reference (the third column has very large numbers).
For example, between the second and the third site there are more than 17 Mb of sequence called homozygous reference, with no heterozygosity whatsoever...
I think you may be getting a seg fault because of overflow or underflow errors caused by the extremely long homozygous blocks…
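A quick way to check an input file for this before running msmc is sketched below. It is only an illustration, not part of msmc: it assumes whitespace-separated records of chromosome, position, number of called sites since the previous record, and alleles, as in the file quoted below. The helper names and the 1 Mb cutoff are hypothetical and chosen just for this example.

```python
from typing import Iterable, List, NamedTuple


class Site(NamedTuple):
    chrom: str
    pos: int
    n_called: int  # third column: sites called since the previous record
    alleles: str


def parse_sites(lines: Iterable[str]) -> List[Site]:
    """Parse records of the form: chrom pos n_called alleles."""
    sites = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        sites.append(Site(fields[0], int(fields[1]), int(fields[2]), fields[3]))
    return sites


def long_blocks(sites: List[Site], threshold: int = 1_000_000) -> List[Site]:
    """Records whose third column exceeds `threshold` (a hypothetical cutoff)."""
    return [s for s in sites if s.n_called > threshold]


def segregating_density(sites: List[Site]) -> float:
    """Segregating sites per called site, summed over all records."""
    total = sum(s.n_called for s in sites)
    return len(sites) / total if total else 0.0


# First four records of the file from this issue.
sample = """\
1 530673 530672 AAAG
1 2621645 2090972 AAAG
1 19804316 17182671 AAAG
1 22466822 2662506 GAGA
""".splitlines()

sites = parse_sites(sample)
for s in long_blocks(sites):
    print(f"{s.chrom}:{s.pos} follows {s.n_called} called sites with no segregating site")
print(f"segregating sites per called site: {segregating_density(sites):.3g}")
```

If most records are flagged and the density is far below what you expect for your samples, the third column probably does not reflect the real callable sequence.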
Stephan
On 3 Jan 2018, at 22:25, swamidass notifications@github.com wrote:
Why am I getting a seg fault?
On this basic data file:
1 530673 530672 AAAG
1 2621645 2090972 AAAG
1 19804316 17182671 AAAG
1 22466822 2662506 GAGA
1 22915237 448415 GAGA
1 23048515 133278 GGGA
1 24445215 1396700 AAAG
1 28004741 3559526 AAAG
1 29001118 996377 GGGA
1 31071573 2070455 AAAG
1 34816438 3744865 GAGA
1 35314840 498402 AAAG
1 38538312 3223472 AAAG
1 43088121 4549809 AAAG
1 43796385 708264 GAGA
1 47664747 3868362 AGAA
1 51461120 3796373 AGAA
1 56143120 4682000 GGGA
1 63973759 7830639 GGGA
1 72298048 8324289 AAAG
1 75398992 3100944 AGAA
1 75903941 504949 GAGA
1 80413141 4509200 AAAG
1 83284762 2871621 GAGA
1 83522772 238010 AAAG
1 89044382 5521610 AGAA
1 93987208 4942826 GAGA
1 94955877 968669 AAAG
1 95645609 689732 AAAG

I get a seg fault:
msmc/build/msmc --fixedRecombination -o t msmc.data
read 29 SNPs from file msmc.data
estimating scaled mutation rate: 8.3153e-08
Version: 1.0.1
input files: ["msmc.data"]
maxIterations: 20
mutationRate: 8.3153e-08
recombinationRate: 2.07882e-08
subpopLabels: [0, 0, 0, 0]
timeSegmentPattern: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
nrThreads: 2
nrTtotSegments: 40
verbose: false
outFilePrefix: t
naiveImplementation: false
hmmStrideWidth: 1000
fixedPopSize: false
fixedRecombination: true
initialLambdaVec: []
directedEmissions: false
skipAmbiguous: false
indices: [0, 1, 2, 3]
logging information written to t.log
loop information written to t.loop.txt
final results written to t.final.txt
[1/1] estimating total branchlengthsSegmentation fault: 11
That makes sense. This was just some initial data to see if I could get it working. Sounds like I just need to change the input data.
About how long does it take to run on, say, a single chromosome of the CG data?
A single chromosome of CG data will probably run around an hour or so for four haplotypes.