quachtina96 / pysam

Automatically exported from code.google.com/p/pysam

Samfile.header fails with internal error on specific file #113

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Pysam fails to read the header of one of my BAM files even though the file 
works fine with the command-line samtools. I attach the file.

See here:

--8<---[Begin Python session]---

Python 2.7.3 (default, Aug  1 2012, 05:14:39) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import pysam

>>> pysam.__version__
'0.7.4'

>>> pysam.Samfile("test.bam").header
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "csamtools.pyx", line 1279, in csamtools.Samfile.header.__get__ (pysam/csamtools.c:12881)
ValueError: need more than 1 value to unpack
>>> type( pysam.Samfile("test.bam") )
<type 'csamtools.Samfile'>

---8<--[End Python session]---

To confirm that the command-line samtools behaves normally on the same file:

---8<---[Begin shell session]--- 
anders@pinotnoir:~/tmp$ samtools view -h test.bam 
@HD VN:1.0  SO:unsorted
@SQ SN:1    LN:249250621
@SQ SN:2    LN:243199373
@SQ SN:3    LN:198022430
@SQ SN:4    LN:191154276
@SQ SN:5    LN:180915260
@SQ SN:6    LN:171115067
@SQ SN:7    LN:159138663
@SQ SN:8    LN:146364022
@SQ SN:9    LN:141213431
@SQ SN:10   LN:135534747
@SQ SN:11   LN:135006516
@SQ SN:12   LN:133851895
@SQ SN:13   LN:115169878
@SQ SN:14   LN:107349540
@SQ SN:15   LN:102531392
@SQ SN:16   LN:90354753
@SQ SN:17   LN:81195210
@SQ SN:18   LN:78077248
@SQ SN:19   LN:59128983
@SQ SN:20   LN:63025520
@SQ SN:21   LN:48129895
@SQ SN:22   LN:51304566
@SQ SN:Y    LN:59373566
@SQ SN:X    LN:155270560
@SQ SN:MT   LN:16571
@PG ID=Bowtie   VN=0.11.3   CL="bowtie -S Homo_sapiens.GRCh37.56 SRR001432.fastq SRR001432.sam"
SRR001432.1 USI-EAS21_0008_3445:8:1:107:882 length=25   4   *   0   0   *   *   00  TAGCTGCTTCATTATGTGTTGTCTT   IIIIIIIIII,II*IIII*I8'*I7   XM:i:0
SRR001432.2 USI-EAS21_0008_3445:8:1:82:90 length=25   4   *   0   0   *   *   00  GAAGCCACAAGCACCCGGCTCCGCC   IIIII)II4,I*6I+&III'7:'%&   XM:i:0
SRR001432.3 USI-EAS21_0008_3445:8:1:639:904 length=25   4   *   0   0   *   *   00  GATGGAAGAGCTCGTATGCCGTCTT   <IIIIIIIIIIIII@IEIIII/I=I   XM:i:0
SRR001432.4 USI-EAS21_0008_3445:8:1:806:766 length=25   4   *   0   0   *   *   00  GCCTGCTTCACCAAAATTTAAATAA   I#I@;&IGF.DI+.+"!9%&3,.)&   XM:i:0
---8<---[end shell session]---

Version information:

- Ubuntu Linux 12.04 x86_64
- Python 2.7.3
- pysam 0.7.4
- command-line samtools 0.1.18 (r982:295)

Original issue reported on code.google.com by sanders....@gmail.com on 20 Feb 2013 at 9:54

GoogleCodeExporter commented 9 years ago
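Thanks. The issue is a malformed header line.

The SAM format requires header fields to use : rather than = between the tag and its value. The @PG line in this file has ID=Bowtie and VN=0.11.3 (and CL likewise); these should read ID:Bowtie and VN:0.11.3.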

I have created a better error message.
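
For illustration only, here is a minimal sketch of how such a header could be checked before opening the file with pysam. It is not pysam's actual code: the names FIELD_RE and find_malformed_fields are hypothetical, and the field pattern is taken from the SAM specification's TAG:VALUE rule. The first part shows that splitting a field such as ID=Bowtie on ":" and unpacking into two names raises the same kind of ValueError as in the traceback above. That the parser does exactly this is an assumption, though it is consistent with the error message.

```python
import re

# Assumption: header fields are a two-character tag, a colon, then a value,
# as described in the SAM specification. @CO comment lines are free text.
FIELD_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]:.+$")

# Splitting a malformed field on ":" yields one item, so unpacking fails.
# Python 2 reports "need more than 1 value to unpack"; Python 3 reports
# "not enough values to unpack (expected 2, got 1)".
try:
    key, value = "ID=Bowtie".split(":", 1)
except ValueError as exc:
    print("unpack failed:", exc)


def find_malformed_fields(header_text):
    """Return (line_number, field) pairs for fields that are not TAG:VALUE."""
    problems = []
    for line_number, line in enumerate(header_text.splitlines(), start=1):
        # Skip alignment records and free-text comment lines.
        if not line.startswith("@") or line.startswith("@CO"):
            continue
        # Fields are tab-separated; the first is the record type (e.g. @PG).
        for field in line.split("\t")[1:]:
            if not FIELD_RE.match(field):
                problems.append((line_number, field))
    return problems


# A shortened version of the header from this report, with real tabs.
header = (
    "@HD\tVN:1.0\tSO:unsorted\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@PG\tID=Bowtie\tVN=0.11.3\n"
)
print(find_malformed_fields(header))
```

The header text can be obtained with samtools view -H test.bam. Any pair returned by the function points at a line that needs = replaced by : before the header can be parsed.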

Best wishes,
Andreas

Original comment by andreas....@gmail.com on 27 Jun 2013 at 1:46