Illumina / Isaac3

Aligner for sequencing data

ERROR: ***** Internal Program Error - assertion ([...]) failed in isaac::build::GapRealigner::GapChoice #5

Closed (sklages closed this issue 8 years ago)

sklages commented 8 years ago

Hi,

I have two PE125 fastq files (originally from bcl2fastq conversion) and wanted to map them to GRCm38 using 40 threads. After ~90 min isaac crashed (reproducibly) with:

2016-09-09 12:48:09  [7f88fc664700] Loading unsorted data
2016-09-09 12:48:09  [7f88fc664700] Reading alignment records from BinMetadata(3id ReferencePosition(0:13680624:0f)bs 6840312bl 492742718ds 0do 0se 888256rs 869166f 
/scratch/cluster/mx/test_mouse/20160909_111202.mapping.mx.cpu40.vig1/3.ext_L7255-3_SJL.grcm38/temp/bin-00000001-00000003.dat)
2016-09-09 12:48:09  [7f88f865c700] ERROR: ***** Internal Program Error - assertion (undoneAlignmentPos <= int64_t(undoPivotGap.getEndPos(false).getPosition())) failed 
in isaac::build::GapRealigner::GapChoice isaac::build::GapRealigner::findBetterGapsChoice(const isaac::build::gapRealigner::GapsRange&, const isaac::reference::ReferencePosition&, const isaac::reference::ReferencePosition&, const ContigList&, const isaac::io::FragmentAccessor&, const isaac::build::PackedFragmentBuffer::Index&, unsigned int&):/scratch/local2/build/illumina/Isaac3/src/c++/lib/build/GapRealigner.cpp(1092): 
undoPivotPos pos ReferencePosition(0:3309721:0f) overlapped by an existing deletion PackedFragmentBuffer::Index(ReferencePosition(0:3309721:0f),271963004do 271963291mdo, 22M25D102M)

/scratch/cluster/mx/test_mouse/20160909_111202.mapping.mx.cpu40.vig1/3.ext_L7255-3_SJL.grcm38/run.3.isaac.sh: line 11: 
118927 Segmentation fault      
/package/sequencer/illumina/isaac/current/bin/isaac-align 
--base-calls /scratch/cluster/mx/test_mouse/20160909_111202.mapping.mx.cpu40.vig1/3.ext_L7255-3_SJL.grcm38
--base-calls-format fastq-gz 
--default-adapters Standard 
--reference-genome /project/genomes/Mus_musculus/NCBI/GRCm38/Sequence/iSAACIndex/sorted-reference.xml 
--realign-gaps sample
--scatter-repeats 1
--single-library-samples 0 
--keep-duplicates 1 
--bam-header-tag "@RG\tID:ext_L7255-3_SJL\tLB:ext_L7255-3_SJL\tSM:ext_L7255-3_SJL\tPL:illumina\tCN:MPIMG\tDS:xxx, Mouse Whole Genome Sequencing (WGS)" 
--keep-unaligned back 
--lane-number-max 3 
--mark-duplicates 1 
--clip-overlapping 1 
--clip-semialigned 0 
--description "xxx, Mouse Whole Genome Sequencing (WGS)" --jobs 40 
--memory-limit 80 
--input-concurrent-load 40 
--temp-concurrent-load 8 
--output-concurrent-save 40 
--temp-concurrent-save 40 
--cleanup-intermediary 1 
--verbosity 3 
--output-directory /scratch/cluster/mx/test_mouse/20160909_111202.mapping.mx.cpu40.vig1/3.ext_L7255-3_SJL.grcm38 
--temp-directory /scratch/cluster/mx/test_mouse/20160909_111202.mapping.mx.cpu40.vig1/3.ext_L7255-3_SJL.grcm38/temp 
--realign-vigorously 1

Changing --realign-vigorously from 1 to 0 (which is the default) works like a charm.

This is Version: iSAAC-03.16.06.06.

best, Sven

rpetrovski commented 8 years ago

--realign-vigorously repeats gap realignment if the previous realignment made the alignment longer, so that it now overlaps more candidate gaps.

Would you be able to locate the region in the BAM file produced with --realign-vigorously 0 and post the bamlet? The region is around position 3309721 on whatever is the first contig of the reference. I will also need a link where I can download the exact same reference FASTA file that you are using.

Roman.
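
For readers decoding the assertion message: the alignment reported at ReferencePosition(0:3309721:0f) carries the CIGAR 22M25D102M, i.e. 22 matched bases, a 25-base deletion, then 102 matched bases. The sketch below is plain POSIX shell and only illustrates how that CIGAR maps onto the reference. The function names `ref_span` and `first_deletion_offset` are made up for this example and are not part of Isaac; it assumes a well-formed CIGAR and uses the position exactly as printed in the log.

```shell
# Sum the reference bases covered by a CIGAR string.
# M, D, N, = and X consume reference; I, S, H and P do not.
ref_span() {
  rest=$1
  span=0
  while [ -n "$rest" ]; do
    num=${rest%%[!0-9]*}      # leading run of digits
    rest=${rest#"$num"}       # drop the digits
    op=${rest%"${rest#?}"}    # next character is the operation
    rest=${rest#?}            # drop the operation
    case $op in
      M|D|N|"="|X) span=$(( span + num )) ;;
    esac
  done
  echo "$span"
}

# Reference offset (from the alignment start) of the first deletion.
# Returns non-zero if the CIGAR contains no deletion.
first_deletion_offset() {
  rest=$1
  offset=0
  while [ -n "$rest" ]; do
    num=${rest%%[!0-9]*}
    rest=${rest#"$num"}
    op=${rest%"${rest#?}"}
    rest=${rest#?}
    case $op in
      D) echo "$offset"; return 0 ;;
      M|N|"="|X) offset=$(( offset + num )) ;;
    esac
  done
  return 1
}

cigar=22M25D102M
aln_start=3309721   # position as printed in the log
echo "reference span: $(ref_span "$cigar")"
echo "deletion starts at: $(( aln_start + $(first_deletion_offset "$cigar") ))"
```

Under these assumptions the alignment covers 149 reference bases and its deletion begins 22 bases after the alignment start, which is the "existing deletion" the assertion text refers to.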

sklages commented 8 years ago

This is the reference, downloaded from iGenomes[1], URL removed

The index was created with:

isaac-sort-reference \
  --output-directory iSAACindex \
  --jobs 1 \
  --mask-width 0 \
  --genome-file genome.fa

What do you mean by "bamlet"? Just a screenshot of the alignment?

best, Sven

[1]=Mus_musculus/NCBI/GRCm38/Mus_musculus_NCBI_GRCm38.tar.gz

rpetrovski commented 8 years ago

By bamlet I mean the binary file produced by samtools view -b <your.bam> 10:3308651-3310651 > bamlet.bam (where <your.bam> is the sorted, indexed BAM from the --realign-vigorously 0 run).

This should allow me to debug that particular failure.

Roman.
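
A minimal sketch of how such a window can be derived from the position in the error message. It only prints the samtools command rather than running it. The contig name 10 is taken from Roman's comment; the file name sorted.bam, the 1000-base flank, and the helper names `region_around` and `bamlet_command` are assumptions for illustration. Region queries with samtools view need a coordinate-sorted BAM with an index (samtools index).

```shell
# Build a samtools region string "contig:start-end" around a position.
# $1 = contig name, $2 = position, $3 = flank in bases
region_around() {
  start=$(( $2 - $3 ))
  if [ "$start" -lt 1 ]; then start=1; fi   # samtools regions are 1-based
  echo "$1:$start-$(( $2 + $3 ))"
}

# Print (do not run) the extraction command.
# $1 = input BAM (placeholder name), $2..$4 as for region_around
bamlet_command() {
  echo "samtools view -b $1 $(region_around "$2" "$3" "$4") > bamlet.bam"
}

bamlet_command sorted.bam 10 3309721 1000
```

The -b option makes samtools view write BAM instead of SAM text, which keeps the extract small enough to attach to an issue.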

sklages commented 8 years ago

Oops .. :+1: .. here we go: (URLs removed)

best, Sven

rpetrovski commented 8 years ago

Cheers, looks like I have all I need for the moment.

rpetrovski commented 8 years ago

Sven, I've just pushed the latest iSAAC-03 along with the fixes for vigorous realignment. Please let me know if you experience further issues.

Roman.

sklages commented 8 years ago

Hi Roman, isaac crashed after ~2h (running with 80 threads):

2016-09-14 17:15:10     [7fc19a2a37c0]  Version: iSAAC-03.16.09.13
[...]
2016-09-14 19:29:43  [7fbfa8b20700] Filtering duplicates
2016-09-14 19:29:43  [7fbf76abc700] ERROR: ***** Internal Program Error 
- assertion (gaps_ <= sizeof(BitTrackingType) * BITS_IN_BYTE) failed in 
isaac::build::gapRealigner::ChooseKGapsFilter<BitTrackingType>::ChooseKGapsFilter(const isaac::build::gapRealigner::GapsRange&, unsigned int) [with BitTrackingType = 
long unsigned int]:/scratch/local2/build/illumina/Isaac3/src/c++/include/build/gapRealigner/ChooseKGapsFilter.hh(48): Too many gaps: 132

/scratch/cluster/mx/test_mouse/20160914_110313.mapping.mx.realign-vigorously/1.ext_L7255-3_SJL.grcm38/run.1.isaac.sh: 
line 11: 13526 Segmentation fault      

/package/sequencer/illumina/isaac/current/bin/isaac-align 
[...]
--realign-vigorously 1

best, Sven
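
A note on reading this second assertion: it compares the number of candidate gaps with the number of bits in BitTrackingType, which the message reports as long unsigned int. Assuming that type is 8 bytes wide (typical on 64-bit Linux, but an assumption here) and that the filter presumably uses one bit per gap, at most 64 gaps can be tracked, so 132 is rejected. The arithmetic, as a small shell sketch with illustrative function names:

```shell
BITS_IN_BYTE=8

# Number of gaps a tracking word of $1 bytes can represent, one bit per gap.
mask_capacity() {
  echo $(( $1 * BITS_IN_BYTE ))
}

# Succeeds when $1 gaps fit into a tracking word of $2 bytes.
gaps_fit() {
  [ "$1" -le "$(mask_capacity "$2")" ]
}

if gaps_fit 132 8; then
  echo "132 gaps fit in $(mask_capacity 8) bits"
else
  echo "132 gaps exceed $(mask_capacity 8) bits"
fi
```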

rpetrovski commented 8 years ago

This error gives no indication of the genomic region, and it will not be easy to find. Does the run still succeed without --realign-vigorously? Would you be able to let me download the entire BAM file that gets produced in that case?

P.S. One trick to resume after a failure during BAM generation is to rerun with --start-from Bam, provided the Temp folder is still untouched.

Roman.
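
A hedged sketch of that resume step. Only --start-from Bam, --temp-directory and --realign-vigorously come from this thread; the helper names, the ./temp path, and the idea that all other arguments stay the same as in the failed run are assumptions. The check below only confirms that the temp directory exists and is not empty; it cannot tell whether its contents are really untouched.

```shell
# Succeeds when the temp directory $1 exists and still contains files.
can_resume() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Print the original isaac-align arguments ("$@") with the resume option added.
resume_command() {
  echo "isaac-align $* --start-from Bam"
}

temp_dir=./temp   # placeholder path
if can_resume "$temp_dir"; then
  resume_command --temp-directory "$temp_dir" --realign-vigorously 0
else
  echo "temp directory missing or empty; a full rerun is needed" >&2
fi
```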

sklages commented 7 years ago

Here is the truncated BAM file: http://comanche.molgen.mpg.de/illumina/2PvJ2pPxRqrHcR/truncated.bam

I have restarted the alignment as suggested, with --realign-vigorously 0 and --start-from Bam. I'll report back.

best, Sven