keiranmraine closed 8 years ago
Hi,
it looks like the input file contains reads which
This is a bit unusual as it probably means there was at some point a mate of a read which mapped to a reference sequence but was then removed at a later stage, leaving only the unmapped mate behind.
I have updated libmaus2/biobambam2 to print a warning when encountering this but not to quit indexing. Can you please try again with biobambam2 version 2.0.30?
Best, German
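The situation described above (an unmapped read still placed on a reference whose mapped mate is gone) can be looked for before indexing. The following is only a rough sketch, not part of biobambam2 or libmaus2: it assumes the BAM has been converted to SAM text (for example with `samtools view`), and the function name `refs_with_only_unmapped` is made up for illustration. It reports reference names that carry placed reads, all of which have the unmapped flag (0x4) set.

```python
from collections import defaultdict

UNMAPPED = 0x4  # SAM FLAG bit: this segment is unmapped


def refs_with_only_unmapped(sam_lines):
    """Return sorted reference names whose placed reads are all unmapped.

    Hypothetical helper for illustration; expects SAM text lines.
    """
    mapped = defaultdict(int)
    unmapped = defaultdict(int)
    for line in sam_lines:
        # Skip blank lines and header lines (QNAME may not contain '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname = fields[2]
        if rname == "*":
            continue  # read is not placed on any reference
        if flag & UNMAPPED:
            unmapped[rname] += 1
        else:
            mapped[rname] += 1
    # References that only ever appear on unmapped records.
    return sorted(r for r in unmapped if mapped.get(r, 0) == 0)


# Possible use: pass sys.stdin when fed by `samtools view in.bam`.
```

Any reference name returned here is a candidate for the kind of record that upset the index check, although whether it is the exact trigger in this file is an assumption.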
Hi German,
This is now working. I suspect the cause is that our legacy pipeline uses rmdup=1.
$ ./bammarkduplicates2 rmdup=1 index=1 tmpfile=thingy I=4911821_sorted.bam O=fixed.bam M=fixed.met indexfilename=fixed.bam.bai level=1
[V] output compression level 1
[D] excntpairs=1 fincntpairs=441971 strcntpairs=145
[D] excntfrags=2 fincntfrags=889084 strcntfrags=0
[V] fragment and pair data computed in time 5.65365 (05:65362499)
[V] 893938 lines, 893938 als, 889086 mapped frags, 442117 mapped pairs, 190788 frags/s MemUsage(size=71.8516,rss=24.418,peak=504.395)
[V] Checking pairs...done, rate 11932
[V] Checking single fragments...done, rate 343.997
[V] number of alignments marked as duplicates: 135 time 5.68297 (05:68296299)
[V] Filtered 893937(0,1) total for marking time 07:66722199 MemUsage(size=71.8516,rss=24.6719,peak=504.395)
[W] BamIndexGenerator::checkConsisteny(): warning, bin chunks for refid 23 without corresponding linear chunks
[W] BamIndexGenerator::flush: warning, bin index and linear index look inconsistent
[V] MemUsage(size=71.8516,rss=24.6719,peak=504.395) 14.3264 (14:32641100)
$ echo $?
0
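Since the exit status is 0 even though the index warnings were printed, a pipeline that wants to notice them has to inspect the tool's output. A minimal sketch, assuming warnings are the lines starting with `[W]` as in the log above; the function name `extract_warnings` is hypothetical, and which stream the tool writes these lines to is not confirmed here:

```python
def extract_warnings(log_lines):
    """Return the log lines that carry the '[W]' warning prefix."""
    return [line.rstrip("\n") for line in log_lines if line.startswith("[W]")]
```

Called on the captured output of the run above, this would return the two `BamIndexGenerator` lines, which a wrapper could then log or act on.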
Hi Keiran,
thank you for confirming this is fixed now.
Hi,
We have a BAM file that only has a very small number of reads which only hit a handful of the reference sequences. It appears that this triggers a check failure:
The behaviour was initially noticed running bammarkduplicates2 with index=1. Samtools indexes the file fine with the following stats: