gt1 / biobambam2

Tools for early stage alignment file processing

bammarkduplicates piping to stdout problem #36

Closed steffenheyne closed 7 years ago

steffenheyne commented 7 years ago

Hi, thanks for this efficient tool suite!

I run into a problem when I omit "O=" so that bammarkduplicates2 writes to stdout and I can pipe the result into samtools again. bammarkduplicates2 itself seems to fail in this mode: it stops with an error after the first phase:


$ samtools view -h -f 2 -F 780 MY.BAM | samtools view -b -u - | ~/install/biobambam2-2.0.60/bin/bammarkduplicates2 markthreads=10 inputbuffersize=1310720 level=0 verbose=1 rewritebam=2 2>bamdup.log | samtools view -@10 -b - >MYBAM.bammarkdup.bam

...
[V] 657457153 als, 657457153 mapped frags, 326713918 mapped pairs, 202604 frags/s MemUsage(size=928.965,rss=178.41,peak=1276.28) time 2.90766 total 54:05:70748000
[V] 658505728 als, 658505728 mapped frags, 327220159 mapped pairs, 202668 frags/s MemUsage(size=928.965,rss=178.41,peak=1276.28) time 4.15273 total 54:09:86047099
[D] excntpairs=58079412 fincntpairs=169475253 strcntpairs=100066447
[D] excntfrags=138983750 fincntfrags=520351682 strcntfrags=0
[V] fragment and pair data computed in time 3253.13 (54:13:12949200)
[V] 659335432 lines, 659335432 als, 659335432 mapped frags, 327621112 mapped pairs, 202719 frags/s MemUsage(size=784.934,rss=82.3555,peak=1276.28)
[V] Checking pairs...done, rate 1.80079e+06
[V] Checking single fragments...done, rate 2.06009e+06
[V] number of alignments marked as duplicates: 412919402 time 3414.26 (56:54:26030099)
# /home/heyne/install/biobambam2-2.0.60/bin/bammarkduplicates2 markthreads=10 inputbuffersize=1310720 level=0 verbose=1 rewritebam=2

##METRICS
LIBRARY	UNPAIRED_READS_EXAMINED	READ_PAIRS_EXAMINED	UNMAPPED_READS	UNPAIRED_READ_DUPLICATES	READ_PAIR_DUPLICATES	READ_PAIR_OPTICAL_DUPLICATES	PERCENT_DUPLICATION	ESTIMATED_LIBRARY_SIZE
Unknown Library	0	327621112	0	0	206459701	0	0.630178	132273373

## HISTOGRAM
BIN	VALUE
1	1
2	1.08401
3	1.09106
4	1.09166
...
100	1.09171
[D] using incremental BAM header parser on parallel recoder.

BgzfInflateHeaderBase::readHeader(): invalid header data (unexpected bytes)
/home/heyne/install/libmaus2--2.0.281/lib/libmaus2.so.2(_ZN8libmaus24util10StackTraceC1Ev+0x54) [0x7f25c0747f94]
/home/heyne/install/biobambam2-2.0.60/bin/bammarkduplicates2(_ZN8libmaus29exception16LibMausExceptionC1Ev+0x20) [0x444020]
/home/heyne/install/biobambam2-2.0.60/bin/bammarkduplicates2(_ZN8libmaus22lz15BgzfInflateBase9readBlockISiEENS113BaseBlockInfoERT+0x50) [0x448920]
/home/heyne/install/biobambam2-2.0.60/bin/bammarkduplicates2(ZN8libmaus22lz16BgzfInflateBlock9readBlockISiEEbRT+0x41) [0x44caf1]
/home/heyne/install/biobambam2-2.0.60/bin/bammarkduplicates2(_ZN8libmaus22lz32BgzfInflateDeflateParallelThread3runEv+0xc8) [0x495928]
/home/heyne/install/biobambam2-2.0.60/bin/bammarkduplicates2(_ZN8libmaus28parallel11PosixThread8dispatchEPv+0x18) [0x442aa8]
/lib64/libpthread.so.0(+0x7dc5) [0x7f25befc2dc5]
/lib64/libc.so.6(clone+0x6d) [0x7f25becefced]


When I use "O=MYBAM.bammarkdup.bam" without piping all works fine.

$ samtools view -h -f 2 -F 780 -q 10 MY.BAM | samtools view -b -u - | ~/install/biobambam2-2.0.60/bin/bammarkduplicates2 markthreads=10 inputbuffersize=1310720 verbose=1 O=MYBAM.bammarkdup.tmp.bam

What is the right way to use piping to stdout? (I have to admit that I never tried piping with "verbose=0")

gt1 commented 7 years ago

Hello,

thank you for reporting this. Could you please retry with version 2.0.63 (and the most recent version of libmaus2, which contains the actual fix)?

German

steffenheyne commented 7 years ago

hi,

great, it works now! Thanks a lot! I tried it with 2.0.65!
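
For readers who hit the same "BgzfInflateHeaderBase::readHeader(): invalid header data" message: BAM files are BGZF-compressed, and per the SAM/BAM specification every BGZF block begins with the gzip bytes 1f 8b 08 04. The snippet below is only a rough sanity check on a file's first four bytes, not a diagnosis of the bug fixed above and not a full BGZF validation. The helper names first4_hex and looks_like_bgzf are made up for this illustration and are not part of biobambam2 or samtools.

```shell
# Print the first four bytes of a file as lowercase hex, e.g. 1f8b0804.
first4_hex() {
    od -An -tx1 -N4 "$1" | tr -d ' \n'
}

# Succeed if the file starts with the BGZF block magic:
# gzip ID bytes 1f 8b, CM=08 (deflate), FLG=04 (extra field present).
# This inspects only the first block header and nothing after it.
looks_like_bgzf() {
    [ "$(first4_hex "$1")" = "1f8b0804" ]
}
```

Something like `looks_like_bgzf MYBAM.bammarkdup.bam && echo "starts with a BGZF header"` could be run on the output file; for a fuller integrity check, `samtools quickcheck` is the more appropriate tool.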