dellytools / delly

DELLY2: Structural variant discovery by integrated paired-end and split-read analysis
BSD 3-Clause "New" or "Revised" License

delly coredump at torali::getLibraryParams #63

Closed wangyugui closed 7 years ago

wangyugui commented 7 years ago

Hi.

delly coredump at torali::getLibraryParams

delly version: 0.7.6 (built from the latest source in git)

#delly call -t DEL -x /usr/hpc-bio/delly/human.hg38.excl.tsv -o t1.bcf -g /usr/bio-ref/10X-ref/refdata-GRCh38.meta/fasta/genome.fa /biowrk/10X.datasets/longranger.wgs/HCC1954T/outs/phased_possorted_bam.bam /biowrk/10X.datasets/longranger.wgs/HCC1954N/outs/phased_possorted_bam.bam
[2016-Nov-21 23:41:53] delly call -t DEL -x /usr/hpc-bio/delly/human.hg38.excl.tsv -o t1.bcf -g /usr/bio-ref/10X-ref/refdata-GRCh38.meta/fasta/genome.fa /biowrk/10X.datasets/longranger.wgs/HCC1954T/outs/phased_possorted_bam.bam /biowrk/10X.datasets/longranger.wgs/HCC1954N/outs/phased_possorted_bam.bam
#echo $?
11

# gdb /usr/hpc-bio/delly/delly core.84798
(gdb) where
#0  0x000000000046371f in void torali::getLibraryParams<torali::Config, std::vector<boost::icl::interval_set<unsigned int, std::less, boost::icl::discrete_interval<unsigned int, std::less>, std::allocator>, std::allocator<boost::icl::interval_set<unsigned int, std::less, boost::icl::discrete_interval<unsigned int, std::less>, std::allocator> > >, std::vector<boost::unordered::unordered_map<std::string, torali::LibraryInfo, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, torali::LibraryInfo> > >, std::allocator<boost::unordered::unordered_map<std::string, torali::LibraryInfo, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, torali::LibraryInfo> > > > > >(torali::Config const&, std::vector<boost::icl::interval_set<unsigned int, std::less, boost::icl::discrete_interval<unsigned int, std::less>, std::allocator>, std::allocator<boost::icl::interval_set<unsigned int, std::less, boost::icl::discrete_interval<unsigned int, std::less>, std::allocator> > > const&, std::vector<boost::unordered::unordered_map<std::string, torali::LibraryInfo, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, torali::LibraryInfo> > >, std::allocator<boost::unordered::unordered_map<std::string, torali::LibraryInfo, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, torali::LibraryInfo> > > > >&) ()
#1  0x00000000004c6ab0 in int torali::dellyRun<torali::SVType<torali::DeletionTag> >(torali::Config const&, torali::SVType<torali::DeletionTag>) ()
#2  0x000000000040c43d in torali::delly(int, char**) ()
#3  0x00000000004023fa in main ()

# samtools view -H /biowrk/10X.datasets/longranger.wgs/HCC1954N/outs/phased_possorted_bam.bam |grep "@RG"
@RG     ID:HCC1954N:HCC1954N:1:HMTV7CCXX:1      SM:HCC1954N     LB:HCC1954N.1   PU:HCC1954N:HCC1954N:1:HMTV7CCXX:1      DT:2016-11-19T17:33:46+0800 PL:ILLUMINA
@RG     ID:HCC1954N:HCC1954N:1:HMTV7CCXX:1-4A9AEB5A     SM:HCC1954N     LB:HCC1954N.1   PU:HCC1954N:HCC1954N:1:HMTV7CCXX:1      DT:2016-11-19T17:33:46+0800 PL:ILLUMINA
@RG     ID:HCC1954N:HCC1954N:1:HMTV7CCXX:1-3AB5E1D6     SM:HCC1954N     LB:HCC1954N.1   PU:HCC1954N:HCC1954N:1:HMTV7CCXX:1      DT:2016-11-19T17:33:49+0800 PL:ILLUMINA
@RG     ID:HCC1954N:HCC1954N:1:HMTV7CCXX:1-2B84817C     SM:HCC1954N     LB:HCC1954N.1   PU:HCC1954N:HCC1954N:1:HMTV7CCXX:1      DT:2016-11-19T17:33:46+0800 PL:ILLUMINA
@RG     ID:HCC1954N:HCC1954N:1:HMTV7CCXX:1-4DBC1BF5     SM:HCC1954N     LB:HCC1954N.1   PU:HCC1954N:HCC1954N:1:HMTV7CCXX:1      DT:2016-11-19T17:33:50+0800 PL:ILLUMINA

tobiasrausch commented 7 years ago

This appears to be a 10X Genomics Linked-Read data set. Delly only supports paired-end or mate-pair data.

wangyugui commented 7 years ago

Hi. Yes, this is a 10X Genomics Linked-Read data set.
But isn't the BAM the same as a paired-end BAM, except that it contains multiple @RG lines?
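
To make the read-group question concrete, here is a minimal Python sketch that groups `@RG` header lines by sample (`SM`) and library (`LB`). It is only an illustration: the function names are hypothetical, the example header is abbreviated from the `samtools view -H` output earlier in the thread, and real SAM headers are tab-separated (the listing above shows the fields with spaces). It says nothing about how Delly itself reads the header.

```python
from collections import defaultdict


def parse_rg_line(line):
    """Parse one tab-separated '@RG' SAM header line into a tag -> value dict."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] != "@RG":
        raise ValueError("not an @RG line")
    tags = {}
    for field in fields[1:]:
        # Split on the first ':' only; values such as ID and DT contain colons.
        key, _, value = field.partition(":")
        tags[key] = value
    return tags


def read_groups_by_library(header_lines):
    """Map (sample, library) to the list of read-group IDs that share it."""
    groups = defaultdict(list)
    for line in header_lines:
        if line.startswith("@RG"):
            tags = parse_rg_line(line)
            groups[(tags.get("SM"), tags.get("LB"))].append(tags.get("ID"))
    return dict(groups)


# Abbreviated example header (two of the five @RG lines from the thread).
example_header = [
    "@RG\tID:HCC1954N:HCC1954N:1:HMTV7CCXX:1\tSM:HCC1954N\tLB:HCC1954N.1\tPL:ILLUMINA",
    "@RG\tID:HCC1954N:HCC1954N:1:HMTV7CCXX:1-4A9AEB5A\tSM:HCC1954N\tLB:HCC1954N.1\tPL:ILLUMINA",
]
groups = read_groups_by_library(example_header)
for (sample, library), ids in groups.items():
    print(sample, library, len(ids), "read group(s)")
```

In this header every read group carries the same `SM` and `LB` values, so the groups differ only in their `ID`.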

tobiasrausch commented 7 years ago

I haven't tried any Chromium data, but for the old 10X system another problem was the high soft-clipping rate (~3% compared to <0.5% for Illumina paired-end libraries). In addition, the coverage was wavy, so Delly's read-depth filters didn't work properly. In general, the advice from 10X at that time was to use 30X paired-end sequencing to call variants and 10X only for phasing. So I am rather pessimistic about calling SVs from 10X data, but when I find some time I will try to get hold of some Chromium data to see why Delly might segfault.
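
For anyone who wants to check the soft-clipping rate on their own data, here is a rough Python sketch that estimates it from SAM text records. It makes assumptions that the comment above does not state: the rate is taken to be soft-clipped bases divided by all read bases in primary mapped alignments, and the function names and toy records are invented for illustration. It is not how Delly measures anything.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume read (query) bases according to the SAM specification.
QUERY_OPS = set("MIS=X")


def softclip_counts(cigar):
    """Return (soft_clipped_bases, read_bases) for one CIGAR string."""
    clipped = total = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in QUERY_OPS:
            total += n
        if op == "S":
            clipped += n
    return clipped, total


def softclip_fraction(sam_lines):
    """Fraction of read bases that are soft-clipped (assumed definition)."""
    clipped = total = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        flag, cigar = int(fields[1]), fields[5]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x904 or cigar == "*":
            continue
        c, t = softclip_counts(cigar)
        clipped += c
        total += t
    return clipped / total if total else 0.0


# Toy records (made up): only the first six SAM columns are filled in.
toy_sam = [
    "@HD\tVN:1.5\tSO:coordinate",
    "\t".join(["r1", "99", "chr1", "100", "60", "10S90M"]),
    "\t".join(["r2", "147", "chr1", "300", "60", "100M"]),
    "\t".join(["r3", "4", "*", "0", "0", "*"]),
]
toy_fraction = softclip_fraction(toy_sam)
```

To run this on real data, the lines produced by `samtools view` for a BAM file could be passed to `softclip_fraction`, for example by reading them from standard input.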

wangyugui commented 7 years ago

Hi.

10X Chromium data can be downloaded from http://support.10xgenomics.com/genome-exome/datasets


wangyugui commented 7 years ago

Please close the ticket.

This crash was caused by my misuse of the 10X longranger software. I failed to notice the error message from the misused command.

tobiasrausch commented 7 years ago

Thanks for letting me know.