jdidion / atropos

An NGS read trimming tool that is specific, sensitive, and speedy. (production)

bam (but not sam) reader crashes on '@CO' line in header #96

Closed plijnzaad closed 4 years ago

plijnzaad commented 4 years ago

With the latest version (see PR #94) I get an

AttributeError: 'str' object has no attribute 'items'

if the BAM file contains a @CO header line. To reproduce, run the following commands on this file: TM433_trunc.bam.pdf (this is not actually a PDF file, but GitHub doesn't allow attaching BAM files).

# rename back to a proper bam file (this is not a pdf file, but github doesn't allow bam files)
mv -f TM433_trunc.bam.pdf TM433_trunc.bam
# create bam without @CO line:
samtools view  -h TM433_trunc.bam | grep -v '^@CO' | samtools view -b - > TM433_withoutCO.bam
# works fine without @CO line:
atropos -a RA5=GATCGTCGGACTGTAGAACTCTGAAC --input-format bam -se TM433_withoutCO.bam --output-format sam -o TM433_trimmed2.sam --report-file summary8.txt 
# crashes with @CO line
atropos -a RA5=GATCGTCGGACTGTAGAACTCTGAAC --input-format bam -se TM433_trunc.bam --output-format sam -o TM433_trimmed2.sam --report-file summary8.txt 
plijnzaad commented 4 years ago

Just in case you get different behaviour, here is my stderr output of the above incantation.

(BTW, the setup this time is Python 3.6, pysam 0.15.3, Cython 0.29.14.)

plijnzaad commented 4 years ago

Quick note: @CO is the only legal header line that is almost completely free-form. Only the single tab after @CO is required, and unlike all the other header lines it has no internal TAG:VALUE fields. I guess this is what trips up SAMParser::add_header().
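To illustrate the point about @CO, here is a minimal sketch of a header-line parser that special-cases comment lines. It is not the atropos code: `parse_header_line` is a hypothetical helper written only for this example, and it assumes header lines arrive as plain text, as printed by `samtools view -H`. The last few lines show why treating a comment like a tagged record produces the error above.

```python
from typing import Dict, Tuple, Union

# A tagged record (@HD, @SQ, @RG, @PG) becomes a dict; a comment stays a string.
HeaderRecord = Union[Dict[str, str], str]


def parse_header_line(line: str) -> Tuple[str, HeaderRecord]:
    """Split one SAM header line into (record_type, record).

    Hypothetical helper for illustration only, not part of atropos.
    """
    line = line.rstrip("\n")
    if not line.startswith("@"):
        raise ValueError(f"not a header line: {line!r}")

    # Everything before the first tab is the record type, e.g. "@SQ".
    record_type, _, rest = line.partition("\t")
    record_type = record_type[1:]

    if record_type == "CO":
        # Comment lines are free text: keep them verbatim, tabs and colons included.
        return record_type, rest

    # All other header lines consist of tab-separated TAG:VALUE fields.
    fields: Dict[str, str] = {}
    for field in rest.split("\t"):
        tag, sep, value = field.partition(":")
        if not sep:
            raise ValueError(f"malformed field {field!r} in @{record_type} line")
        fields[tag] = value
    return record_type, fields


sq_type, sq_record = parse_header_line("@SQ\tSN:chr1\tLN:1000")
co_type, co_record = parse_header_line("@CO\tany free text: even with colons")

# Code that assumes every record is a dict fails on the comment string:
try:
    co_record.items()
except AttributeError as err:
    print(err)
```

A caller that walks the parsed header would then check `isinstance(record, str)` (or the record type) before iterating over fields.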

jdidion commented 4 years ago

Ah yes, I see. Nice catch. Will be fixed in the next alpha release going out today.

jdidion commented 4 years ago

The fix is in 2.0.0-alpha.5, along with the fix for #97. Please test it out. Thanks

plijnzaad commented 4 years ago

Thanks, yes, this works. Writing BAM (see issue #95) still does not work. Do you have an idea when this might become available? Cheers, Philip

plijnzaad commented 4 years ago

Side note: when using bamnostic, atropos still crashes on input bam files containing @CO header lines; see issue #95