Kingsford-Group / squid

SQUID detects both fusion-gene and non-fusion-gene structural variations from RNA-seq data
BSD 3-Clause "New" or "Revised" License
void segmentGraph aborted #14

Open srajedwin opened 5 years ago

srajedwin commented 5 years ago

I am trying to identify variants in our samples using SQUID. I aligned the reads with STAR 2-pass, including chimeric BAM output.

The SQUID run was aborted after the "Building nodes, finish seeding" step with the following errors:

[Tue Oct 23 14:52:31 2018] Start reading bam file.
[Tue Oct 23 15:31:14 2018] Finish sorting Chimeric bam reads.
[Tue Oct 23 15:45:26 2018] Finish removing PCR duplicates.
[Tue Oct 23 15:49:56 2018] Building nodes.
|bamdiscordant|=331264
[Tue Oct 23 15:50:05 2018] Building nodes, finish seeding.

squid: src/SegmentGraph.cpp:662: void SegmentGraph_t::BuildNode_STAR(const std::vector&, SBamrecord_t&, std::string): Assertion `vNodes[i].Length>0 && vNodes[i].Position+vNodes[i].Length<=RefLength[vNodes[i].Chr]' failed.
/var/spool/pbs/mom_priv/jobs/7784253.wlm01.SC: line 36: 19875 Aborted squid -b PT_10_S3_L001Aligned.toTranscriptome.out.bam -c PT_10_S3_L002Aligned.sortedByCoord.out.bam -o PT_10_S3_L001_squid -G 1 -CO 1

Could you please suggest a fix?

Congm12 commented 5 years ago

This error occurs at the step of parsing the BAM file. Could you make sure that PT_10_S3_L001Aligned.toTranscriptome.out.bam is sorted by coordinate?
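One quick way to check the declared sort order without extra tools is to read the `@HD` line of the BAM header. The sketch below is only an illustration and is not part of SQUID: the function names `read_bam_header_text` and `sort_order` are hypothetical, and it relies on the fact that BGZF-compressed BAM files can be decompressed with Python's standard `gzip` module. Note that the `SO` tag is only what the file declares about itself; it does not prove the records are actually in that order.

```python
import gzip
import struct


def read_bam_header_text(path):
    """Return the plain-text SAM header stored at the start of a BAM file."""
    # BGZF is a series of gzip members, so the gzip module can decompress it.
    with gzip.open(path, "rb") as fh:
        if fh.read(4) != b"BAM\x01":
            raise ValueError(f"{path} does not look like a BAM file")
        (l_text,) = struct.unpack("<I", fh.read(4))  # length of the header text
        return fh.read(l_text).rstrip(b"\x00").decode("ascii", "replace")


def sort_order(header_text):
    """Return the SO value of the @HD line, or None if it is absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None
```

For example, `sort_order(read_bam_header_text("PT_10_S3_L001Aligned.toTranscriptome.out.bam"))` should give `"coordinate"` for a file whose header declares coordinate sorting.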

If it is sorted, there may be other special structures in the BAM file that break the BAM parsing part of the code. In this case, would it be possible to share with us the BAM files so that we can debug on our end?

srajedwin commented 5 years ago

Thanks for your response.

I sorted the transcriptome.out.bam and ran squid again. This time the error was:

/var/spool/pbs/mom_priv/jobs/7787431.wlm01.SC: line 38: 20925 Segmentation fault squid -b PT_10_S3_L001Aligned.toTranscriptome.sorted.bam -c PT_10_S3_L002Aligned.sortedByCoord.out.bam -o PT_10_S3_L001_squid -G 1 -CO 1

The BAM file can be downloaded here: https://drive.google.com/open?id=1z8JQk1NUgTl8qTsthrD4bUVYGqJfwsQ6

Congm12 commented 5 years ago

Could you also provide the chimeric BAM file? The information in the chimeric alignments is also used in processing the concordant alignments, so I need both files to reproduce the error.

srajedwin commented 5 years ago

Sorry for missing that.

I've added a link to download the chimeric BAM file. Kindly let me know the fix. Thanks.

https://drive.google.com/open?id=18Eu_OPWWPGyPNK8OG9Abrh00EK6Be9Q3

Congm12 commented 5 years ago

The concordant BAM file is aligned to the transcriptome, but the chimeric BAM file is aligned to the genome. The two references have different numbers of sequences and different sequence lengths, and that is the cause of the segmentation fault.
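A mismatch like this can be caught before running SQUID by comparing the reference lists stored in the two BAM headers. The following is a minimal sketch, not SQUID code: `read_bam_references` and `describe_mismatch` are hypothetical helper names, and the parsing assumes the binary header layout from the BAM specification (magic, header text, then one name/length entry per reference sequence).

```python
import gzip
import struct


def read_bam_references(path):
    """Return [(name, length), ...] from the binary header of a BAM file."""
    # BGZF blocks are valid gzip members, so gzip.open can decompress a BAM.
    with gzip.open(path, "rb") as fh:
        if fh.read(4) != b"BAM\x01":
            raise ValueError(f"{path} does not look like a BAM file")
        (l_text,) = struct.unpack("<I", fh.read(4))
        fh.read(l_text)  # skip the plain-text header
        (n_ref,) = struct.unpack("<I", fh.read(4))
        refs = []
        for _ in range(n_ref):
            (l_name,) = struct.unpack("<I", fh.read(4))
            name = fh.read(l_name).rstrip(b"\x00").decode("ascii")
            (l_ref,) = struct.unpack("<I", fh.read(4))
            refs.append((name, l_ref))
        return refs


def describe_mismatch(refs_a, refs_b):
    """Return a list of differences; an empty list means the references match."""
    problems = []
    if len(refs_a) != len(refs_b):
        problems.append(f"sequence count differs: {len(refs_a)} vs {len(refs_b)}")
    for index, (a, b) in enumerate(zip(refs_a, refs_b)):
        if a != b:
            problems.append(f"reference {index} differs: {a} vs {b}")
            break  # the first differing entry is enough to show the problem
    return problems
```

Calling `describe_mismatch(read_bam_references(concordant_bam), read_bam_references(chimeric_bam))` on the two inputs should return an empty list when both files were aligned to the same reference; any entries in the list point to a reference mismatch like the one described above.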

If you align the concordant reads to the genome and pass the genome-aligned BAM file as the concordant BAM, the error should be fixed.