nygenome / lancet

Microassembly based somatic variant caller for NGS data

failed with exit code 139 when run lancet #42

Closed rxw125 closed 5 years ago

rxw125 commented 5 years ago

Hello gnarzisi,

I'm very interested in the de Bruijn graph-based Lancet. I ran it on a human gold data set, one chromosome per task, but some tasks failed with exit code 139. The stderr is as follows:

……
Thread 7 is 1.0% done.
Thread 1 is 1.0% done.
Thread 8 is 1.0% done.
Thread 2 is 1.0% done.
Thread 6 is 1.0% done.
Thread 5 is 1.0% done.
Thread 3 is 1.0% done.
Thread 4 is 1.0% done.
Segmentation fault (core dumped)

I tried providing more memory, but it did not help. Could you give me some suggestions?

Thank you!

Best wishes!
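
For readers unfamiliar with the number: by common shell convention, an exit status above 128 means the process was terminated by a signal, and the signal number is the status minus 128. On Linux, 139 - 128 = 11, which is SIGSEGV, matching the "Segmentation fault" line in the stderr above. A minimal Python sketch of that decoding (the helper name `describe_exit_code` is made up for illustration and is not part of Lancet):

```python
import signal


def describe_exit_code(code):
    """Interpret a shell-style exit status (hypothetical helper, not part of Lancet)."""
    if code > 128:
        try:
            # Shells report 128 + N when a process is killed by signal N.
            return f"terminated by {signal.Signals(code - 128).name}"
        except ValueError:
            return f"exit status {code} (no matching signal)"
    return f"exit status {code}"


print(describe_exit_code(139))
```

This only confirms that the failure is a crash inside the process rather than an error Lancet reported itself; it does not say where the crash happens.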

gnarzisi commented 5 years ago

Can you please define "gold data"? What type of sequencing data? Can you provide the command line used? Hard to guess what went wrong without additional information.

rxw125 commented 5 years ago

hi, thanks for your reply.

The gold data set is in BAM format, for NA12878 with HiSeq PE148 reads. It has run smoothly with other software such as MuTect and DeepVariant.

The command for chr8 is as follows:

lancet --tumor HG002_1_tumor_realigned.bam --normal HG002_1_normal_realigned.bam --ref hs37d5.fa --num-threads 8 --reg 8 >  outdir/vcf/HG002_1.lancet.chr8.vcf

So far, only the analysis of the Y chromosome has succeeded. The others ran for between 2 and 5 days and then failed with the same error.

gnarzisi commented 5 years ago

Can you please double check that the chromosome labels are the same across the files? FASTA, BAM, and the value passed to "--reg" parameter should all agree (e.g., chr8 or 8).
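
One way to run this check is to compare the first column of the FASTA index (`.fai`) with the `@SQ` `SN:` fields of the BAM header (for example, the text printed by `samtools view -H`). The Python sketch below works on those two pieces of text; the function names and the tiny inline inputs are illustrative assumptions, not part of Lancet:

```python
def fai_contigs(fai_text):
    """Contig names from a .fai index: the first tab-separated column of each line."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def sam_header_contigs(header_text):
    """Contig names from the SN: fields of @SQ lines in a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def check_labels(fai_text, header_text, region):
    """Report label mismatches between the reference, the BAM header, and a region."""
    ref = set(fai_contigs(fai_text))
    bam = set(sam_header_contigs(header_text))
    chrom = region.split(":")[0]  # tolerate a "chrom:start-end" style region
    return {
        "only_in_fasta": sorted(ref - bam),
        "only_in_bam": sorted(bam - ref),
        "region_in_both": chrom in ref and chrom in bam,
    }


# Illustrative inputs only; real files list every contig.
example_fai = "8\t146364022\t100\t60\t61\nY\t59373566\t200\t60\t61\n"
example_header = "@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:8\tLN:146364022\n@SQ\tSN:chrY\tLN:59373566\n"
print(check_labels(example_fai, example_header, "8"))
```

The same check should be run against both the tumor and the normal BAM header, since either one can disagree with the reference.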

rxw125 commented 5 years ago

All the chromosome names in the BAM file, the reference FASTA, and the .fai index are identical, without the "chr" prefix.

I tested Lancet on another data set and hit the same error. This time, however, some chromosomes completed successfully and others did not. This data comes from the paper "Sahraeian, S. M. E., Liu, R., Lau, B., Podesta, K., Mohiyuddin, M., & Lam, H. Y. (2019). Deep convolutional neural networks for accurate somatic mutation detection. Nature communications, 10(1), 1041."

I should also mention that the first data set has 100x coverage and the second has 30x.

gnarzisi commented 5 years ago

Thanks for checking the chromosome labels. Both 100x and 30x coverage should work fine.

How was the data aligned? Which version of Lancet are you using? Can you try testing the most recent code by cloning directly from the repository?

git clone git://github.com/nygenome/lancet.git

rxw125 commented 5 years ago

The version I used was cloned directly from the repository on July 3rd, so I think it is the latest version. The alignment of the first data set is not ideal, but the alignment of the second data set, which came from the published paper, is relatively good. I do not think there is anything wrong with the input data.

I had already considered all the possible causes mentioned above, but the problem remains. That's why I turned to you for help. Thank you for your multiple rounds of replies!

gnarzisi commented 5 years ago

Ok. Thanks. It is possible that one of the aligned reads produces an edge case that triggers the segmentation fault. If you can identify and extract the local region where the software fails (and share it with me), I may be able to debug the problem.
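
One possible way to narrow down the failing region is to split a chromosome into fixed-size windows and run Lancet once per window, noting which run exits with code 139. The Python sketch below only generates the window strings and the matching command lines. It reuses the flags from the command posted earlier in this thread and assumes that `--reg` also accepts a `chrom:start-end` region, which should be confirmed against Lancet's help output. The helper names and the window size are illustrative:

```python
import shlex


def windows(chrom, length, size):
    """Yield 1-based inclusive 'chrom:start-end' windows covering the whole contig."""
    for start in range(1, length + 1, size):
        end = min(start + size - 1, length)
        yield f"{chrom}:{start}-{end}"


def lancet_command(region, tumor, normal, ref, threads=1):
    """Build a command line using the flags shown earlier in this thread.

    Assumption: --reg accepts a 'chrom:start-end' region string.
    """
    args = [
        "lancet",
        "--tumor", tumor,
        "--normal", normal,
        "--ref", ref,
        "--num-threads", str(threads),
        "--reg", region,
    ]
    return shlex.join(args)


# Small illustrative contig length; use the real length from the .fai index.
for region in windows("8", 2500, 1000):
    print(lancet_command(region,
                         "HG002_1_tumor_realigned.bam",
                         "HG002_1_normal_realigned.bam",
                         "hs37d5.fa"))
```

Once a failing window is found, it can be split again into smaller windows, and the reads overlapping the final window can be extracted from both BAM files (for example with `samtools view -b`) to produce a small test case to share.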

gnarzisi commented 5 years ago

This thread has been silent for a long time, so I am closing this ticket. Feel free to re-open it if needed.