ANGSD / angsd

Program for analysing NGS data.

segmentation fault #141

Closed ramprasadn closed 5 years ago

ramprasadn commented 6 years ago

Hi all,

I am trying to run angsd on my dataset and I keep running into a segmentation fault no matter which version I use. So far, I have tried v0.917, 0.918, and 0.920. Please find the commands and error messages below.

Generate folded sfs as follows,

/data/programs/angsd-0.920/angsd -bam angsdout.list -doSaf 1 -trim 0 -fold 1 -setMinDepth 30 -setMaxDepth 150  -anc integ.fa -GL 1 -P 50 -doCounts 1 -out out 
/data/programs/angsd-0.920/misc/realSFS out.saf.idx -P 50 > out.sfs

And then tried to generate thetas with,

/data/programs/angsd-0.917/angsd -bam angsdout.list -out out -trim 0 -setMinDepth 30 -setMaxDepth 150 -doThetas 1 -doSaf 1 -pest out.sfs -anc integ.fa -GL 1 -P 50 -doCounts 1 -fold 1                                                                                                                                                                                                               
        -> angsd version: 0.917 (htslib: 1.7-26-g831747f) build(Mar 15 2018 13:59:33)
        -> Reading fasta: integ.fa  
        -> Parsing 29 number of samples   
        -> Allocated ~ 10 Megabytes to the nodepool
        -> Printing at chr: chr1 pos:583880 chunknumber 800 contains 407 sitesSegmentation fault

/data/programs/angsd-0.918/angsd -bam angsdout.list -out out -trim 0 -setMinDepth 30 -setMaxDepth 150 -doThetas 1 -doSaf 1 -pest out.sfs -anc integ.fa -GL 1 -P 50 -doCounts 1 -fold 1 
        -> angsd version: 0.917 (htslib: 1.7-26-g831747f) build(Mar 15 2018 14:23:49)
        -> Reading fasta: integ.fa
        -> Parsing 29 number of samples 
Segmentation fault

/data/programs/angsd/angsd -bam angsdout.list -out out -trim 0 -setMinDepth 30 -setMaxDepth 150 -doThetas 1 -doSaf 1 -pest out.sfs -anc integ.fa -GL 1 -P 50 -doCounts 1 -fold 1 
        -> angsd version: 0.920 (htslib: 1.6) build(Mar  5 2018 14:38:32)
        -> Reading fasta: integ.fa
        -> Parsing 29 number of samples 
        -> Allocated ~ 10 million nodes to the nodepool, this is not an estimate of the memory usage
        -> Allocated ~ 20 million nodes to the nodepool, this is not an estimate of the memory usage
        -> Allocated ~ 30 million nodes to the nodepool, this is not an estimate of the memory usage
        -> Allocated ~ 40 million nodes to the nodepool, this is not an estimate of the memory usage
Segmentation fault

Has anyone else run into this issue before? Any help would be appreciated.

And let me know if you need anymore details from my side.

Thanks, R

init-js commented 6 years ago

I have also gotten segfaults. But without pinpointing exactly where the segfault occurred, it's difficult to know if they had the same cause.

I was using packaged versions of libhts and angsd (as part of the anaconda package repository) when I was getting the errors. I recompiled libhts and angsd from source, and some of these segfaults went away.

I checked out the ngsTools repository (https://github.com/mfumagalli/ngsTools/ ), which pins specific revisions of htslib and angsd as git submodules. I assume those exact revisions are meant to work well together.

ANGSD commented 6 years ago

There has been an ongoing issue with the thetas sub analysis that caused segfaults but we have been unable to replicate it.

Could you try disabling threading with -P 1 and rerunning the analysis? I would be glad if it still crashes, since it would then be possible to reproduce.

I doubt it is a mismatch between versions of htslib and angsd; it is more likely to be a problem in the thetas subroutine.
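A minimal sketch of that single-threaded rerun, assuming a POSIX shell on Linux. `ANGSD_BIN` and the log file name `thetas_P1.log` are hypothetical; the angsd options are copied from the original post with `-P 50` replaced by `-P 1`.

```shell
# Sketch: rerun the thetas step single-threaded and record the exit status.
# ANGSD_BIN and thetas_P1.log are hypothetical names chosen for illustration.
ANGSD_BIN="${ANGSD_BIN:-angsd}"
THREADS=1

if command -v "$ANGSD_BIN" >/dev/null 2>&1; then
    "$ANGSD_BIN" -bam angsdout.list -out out -trim 0 \
        -setMinDepth 30 -setMaxDepth 150 -doThetas 1 -doSaf 1 \
        -pest out.sfs -anc integ.fa -GL 1 -doCounts 1 -fold 1 \
        -P "$THREADS" 2> thetas_P1.log
    status=$?
else
    status=127   # the code a shell uses for "command not found"
fi

# On Linux a shell reports death by signal N as 128 + N; SIGSEGV is 11, so 139.
if [ "$status" -eq 139 ]; then
    echo "segfault reproduced with -P $THREADS"
else
    echo "exit status: $status"
fi
```

If the crash persists with one thread, the saved stderr log and the exit status are useful to attach to the report.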

Best

clairemerot commented 6 years ago

I am also getting a "segmentation fault" with the -doThetas option (using either one thread or several threads). Please see the log below. Has the problem been solved in the meantime? What do you recommend? Thanks, Claire

command line: ../../Softwares/angsd/angsd -P 1 -nQueueSize 50 -dosaf 1 -doThetas 1 -GL 2 -doMajorMinor 5 -ref ../genome/kelpfly_genome_AA_pacbio_100kb_plus.fasta -anc ../genome/kelpfly_genome_AA_pacbio_100kb_plus.fasta -fold 1 -remove_bads 1 -minMapQ 30 -minQ 20 -minInd 5 -pest 03-Thetas/BP.sfs -b Filelist/BPbam.filelist -out 03-Thetas/BP

log:
-> Printing at chr: 000055F|arrow pos:133688 chunknumber 300 contains 3359 sites
-> Printing at chr: 000055F|arrow pos:442712 chunknumber 400 contains 3279 sites
/var/spool/slurm/slurmd/job32946/slurm_script: line 16: 9557 Segmentation fault ../../Softwares/angsd/angsd -P 1 -nQueueSize $

ANGSD commented 6 years ago

Dear Ramprasadn, I'm having problems replicating this thetas issue. Would it be possible for you to rerun it through a debugger after recompiling with the -ggdb flag? It should then report the exact line that causes the problem.
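One way to do this is sketched below, assuming gdb is installed and angsd has already been rebuilt with -ggdb added to its compiler flags (how to pass the flag depends on the Makefile and is not shown). `ANGSD_BIN` and `gdb_backtrace.txt` are hypothetical names; the angsd options are those of the original post with `-P 1`.

```shell
# Sketch: run the crashing command under gdb in batch mode and save a backtrace.
# ANGSD_BIN is a hypothetical path to a build compiled with -ggdb.
ANGSD_BIN="${ANGSD_BIN:-./angsd}"

if command -v gdb >/dev/null 2>&1 && [ -x "$ANGSD_BIN" ]; then
    # -batch exits after the commands finish; "run" starts the program and
    # "bt" prints the call stack once the segfault stops it.
    gdb -batch -ex run -ex bt --args "$ANGSD_BIN" \
        -bam angsdout.list -out out -trim 0 -setMinDepth 30 -setMaxDepth 150 \
        -doThetas 1 -doSaf 1 -pest out.sfs -anc integ.fa -GL 1 -P 1 \
        -doCounts 1 -fold 1 > gdb_backtrace.txt 2>&1
    ran_gdb=yes
else
    echo "gdb or $ANGSD_BIN not available; nothing was run"
    ran_gdb=no
fi
```

With debug symbols present, the frames in the backtrace should include source file names and line numbers.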

Thanks for taking your time to report this.

sbi9jd1 commented 6 years ago

Hello, I have the same issue with -doThetas from running the command:

angsd -bam $bamlist -out ${out}_2 -minMapQ 20 -minQ 20 -doThetas 1 -doSaf 1 -pest ${out}.sfs -anc reference/genome.fa -GL 1 -minInd $minInd -fold 1 -P 1

Any ideas how to fix this or get around it? Cheers, Josie

ANGSD commented 6 years ago

Move the -P 4 to before the redirection.

See if that solves the segfault; otherwise, write back.
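The reordering can be illustrated with a stand-in command. In this sketch `show_args` is a hypothetical function that only prints the arguments it receives; it is not part of angsd. In a POSIX shell a redirection may appear anywhere in a simple command, so both forms pass the same arguments, and placing options before the redirection is mainly easier to read.

```shell
# show_args is a hypothetical stand-in for realSFS: it prints its arguments.
show_args() { printf '%s\n' "$*"; }

# Options after the redirection, as in the quoted command.
show_args -tole 1e-12 pop1.saf.idx > after.txt -P 4

# Options before the redirection, as suggested.
show_args -tole 1e-12 pop1.saf.idx -P 4 > before.txt

# Both files hold the same argument list.
cat after.txt before.txt
```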

On Mon, Jun 11, 2018 at 2:24 PM, sbi9jd1 notifications@github.com wrote:

Hello, I have the same issue with the command:

realSFS -tole 1e-12 pop1.saf.idx > pop1.sfs -P 4

I also have it with -P 1, my saf file is quite big 4.8GB but I have tried it with 400GB memory and the same seg fault happens. Any ideas how to fix this or get around it? Cheers, Josie


sbi9jd1 commented 6 years ago

Hello, sorry, I edited that post because the seg fault was not happening with realSFS; it was happening with the -doThetas flag. Once I reduced the dataset to 20 individuals I was able to run the command with no errors, so I guess it has something to do with the memory required for large datasets, as someone else found previously. Thanks, Josie
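A reduced run like this can be set up by trimming the BAM list, assuming the list holds one BAM path per line. `subset_bamlist` and `subset20.list` are hypothetical names, and taking the first 20 entries is an arbitrary choice.

```shell
# subset_bamlist is a hypothetical helper: keep the first N paths of a BAM list.
#   $1 = input list (one BAM path per line)
#   $2 = number of individuals to keep
#   $3 = output list
subset_bamlist() {
    head -n "$2" "$1" > "$3"
}

# Usage (not executed here):
#   subset_bamlist "$bamlist" 20 subset20.list
#   then pass subset20.list to angsd via -bam in place of $bamlist
```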


-- Josie D'Urban Jackson

Ph.D student Department of Biology & Biochemistry University of Bath Claverton Down Bath BA2 7AY United Kingdom

Cardiff University School of Biosciences, The Sir Martin Evans Building Museum Avenue Cardiff CF103AX United Kingdom

clairemerot commented 6 years ago

Hi, I had the same problem. It worked with 20 samples but not with 48. Somebody suggested a change in the code. This worked for me (i.e. I don't get the segmentation fault even with a large sample size), but I am unsure whether this change also affects the results.

See issue here https://github.com/ANGSD/angsd/issues/148#issuecomment-385698492

@ANGSD , do you know whether this is a correct way to avoid this segmentation fault or does it change the thetas?

Thanks a lot, Claire

ANGSD commented 5 years ago

There was an issue with the folded thetas that should be resolved in this commit: f68cc5360fc9d5361a186afe32eb8af7ba8f81cd
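To check whether a local source checkout already contains that commit, something like the following can be run from inside the angsd git clone. This is a sketch that assumes the binary in use was built from that clone; `has_fix` is a name chosen for illustration.

```shell
# Sketch: test whether the fix commit is an ancestor of the checked-out revision.
FIX=f68cc5360fc9d5361a186afe32eb8af7ba8f81cd

if git merge-base --is-ancestor "$FIX" HEAD 2>/dev/null; then
    has_fix=yes
else
    has_fix=no   # also reached when run outside an angsd checkout
fi
echo "fix included: $has_fix"
```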

I'm assuming that the above commit solves the issue, so I'm closing it, but feel free to reopen if needed.

Best