stschiff / msmc

Implementation of the multiple sequential markovian coalescent
GNU General Public License v3.0

Segfault on your test file in the guide #33

Closed fishextinction closed 6 years ago

fishextinction commented 6 years ago

I ran the small 16-SNP example file from your guide and got this on OS X:

msmc -t 2 msmc_test.txt 
read 16 SNPs from file msmc_test.txt
estimating scaled mutation rate: 0.00019315
Version:             1.0.1
input files:         ["msmc_test.txt"]
maxIterations:       20
mutationRate:        0.00019315
recombinationRate:   4.82874e-05
subpopLabels:        [0, 0, 0, 0]
timeSegmentPattern:  [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
nrThreads:           2
nrTtotSegments:      40
verbose:             false
outFilePrefix:       
naiveImplementation: false
hmmStrideWidth:      1000
fixedPopSize:        false
fixedRecombination:  false
initialLambdaVec:    []
directedEmissions:   false
skipAmbiguous:       false
indices:             [0, 1, 2, 3]
logging information written to .log
loop information written to .loop.txt
final results written to .final.txt
[1/1] estimating total branchlengthsSegmentation fault: 11

stschiff commented 6 years ago

I'm not sure what causes the segfault, but it's not surprising that MSMC doesn't run on 16 SNPs. You need far more data; with so few sites I can imagine heavy overfitting and zeros in bad places that cause it to blow up.

I will nevertheless try to find out why it blows up exactly.

Stephan
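
Since the failure above is attributed to the input being far too small, a quick pre-flight check of the input file can flag this before MSMC is run. The following is a minimal sketch, not part of MSMC. It assumes the input is whitespace-separated with one segregating site per row and the number of called sites since the previous row in the third column. The function names, the sample rows, and the `min_sites` threshold are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def summarize_msmc_input(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (segregating_sites, called_sites) for MSMC-style input rows.

    Assumption: each non-empty row is one segregating site, and the third
    column holds the number of called sites since the previous row.
    """
    n_sites = 0
    n_called = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        n_sites += 1
        if len(fields) >= 3 and fields[2].isdigit():
            n_called += int(fields[2])
    return n_sites, n_called


def looks_too_small(n_sites: int, min_sites: int = 10_000) -> bool:
    """Flag inputs with fewer rows than a threshold.

    The default threshold is an arbitrary illustrative value, not a
    recommendation from the MSMC author.
    """
    return n_sites < min_sites


# Invented sample rows, for illustration only.
sample = [
    "1 58432 63 TCCC",
    "1 58448 16 GAAA",
]
sites, called = summarize_msmc_input(sample)
print(f"{sites} segregating sites, {called} called sites")
if looks_too_small(sites):
    print("warning: input is probably too small for a meaningful MSMC run")
```

To check a real file, pass an open file handle instead of the sample list, for example `summarize_msmc_input(open("msmc_test.txt"))`.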

fishextinction commented 6 years ago

Thanks. I would gently suggest that it would be great to include a working test dataset so that users can assess whether their copy of the program is working properly.
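
One way such an installation check could look, once a working test dataset exists, is a small wrapper that runs the program and reports whether it exited cleanly or was killed by a signal such as a segmentation fault. This is a hedged sketch under stated assumptions: the helper names `describe_exit` and `smoke_test` are hypothetical, the signal detection relies on POSIX behaviour (a negative return code from `subprocess`), and the only MSMC flag used is `-t`, taken from the command in the report above.

```python
import signal
import subprocess
from typing import List


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a short description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code -N means "killed by signal N".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exit code {returncode}"


def smoke_test(cmd: List[str]) -> str:
    """Run a command and describe how it terminated."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return "executable not found"
    return describe_exit(result.returncode)
```

With the command from the original report, the call would be `smoke_test(["msmc", "-t", "2", "msmc_test.txt"])`. A crash like the one shown above would be expected to surface as a "killed by" result rather than "ok".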

stschiff commented 6 years ago

I agree. Stephan

stschiff commented 6 years ago

I will close this issue. I am currently working on a new guide with actual test data, which will hopefully help users.