centre-for-microbiome-research / GroopM

Metagenomic binning suite
GNU General Public License v3.0

Error reading in BAM files in groopm parse #9

Open jenmobberley opened 9 years ago

jenmobberley commented 9 years ago

What BAM format does groopm need as input? I have tried a couple of different BAM conversions and sorting with samtools (see BAM generation below), but I get errors about the BAM files (error output for groopm parse below). As samtools is pretty standard for BAM<->SAM conversion, I was wondering if you have any insights into this error.

BAM generation: From an IDBA-UD co-assembly of my data, I used bowtie2 to generate 5 SAM files, one for each of my 5 samples:

bowtie2 -x -U sample.interleaved.fasta -f -N1 -I200 -X500 --no-unal

I then converted each SAM file to a sorted and indexed BAM file using the following samtools pipeline:

samtools view -b -S -u sample.sam | samtools sort | samtools index

GroopM error log:

$ groopm parse all.gm all_mat_sections.combined.IDBAscaf.fa mat_1_bowtie.bam.bai mat_2_bowtie.bam.bai mat_3_bowtie.bam.bai mat_4_bowtie.bam.bai mat_5_bowtie.bam.bai

[[GroopM 0.2.10.17]] Running in data parsing mode...

_WARNING_ Database: 'all.gm' exists. If you continue you WILL delete any previous analyses!
Overwrite? (y,n) : y

Overwriting database all.gm
Parsing contigs
Importing BAM files
Unable to open BAM file mat_1_bowtie.bam.bai -- did you supply a SAM file instead?
Error creating database: all.gm <type 'exceptions.ValueError'>
Unexpected error: <type 'exceptions.ValueError'>
Traceback (most recent call last):
  File "/usr/local/bin/groopm", line 338, in GM_parser.parseOptions(args)
  File "/usr/local/lib/python2.7/site-packages/groopm/groopm.py", line 111, in parseOptions
    force=options.force)
  File "/usr/local/lib/python2.7/site-packages/groopm/mstore.py", line 253, in createDB
    cid_2_indices)
  File "/usr/local/lib/python2.7/site-packages/groopm/mstore.py", line 1690, in parse
    (links, ref_lengths, coverages) = BP.getLinks(bamFiles, full=False, verbose=True, doCoverage=True, minJoin=5)
  File "/usr/local/lib/python2.7/site-packages/bamtyper/utilities.py", line 140, in getLinks
    bam_types = self.getTypes(bamFiles)
  File "/usr/local/lib/python2.7/site-packages/bamtyper/utilities.py", line 727, in getTypes
    bam_file = pysam.Samfile(bf, 'rb')
  File "csamtools.pyx", line 597, in csamtools.Samfile.cinit (pysam/csamtools.c:5982)
  File "csamtools.pyx", line 751, in csamtools.Samfile._open (pysam/csamtools.c:7590)
ValueError: file does not have valid header (mode='rb') - is it BAM format?
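
The file named in the error ends in .bam.bai, which is the extension samtools index gives to the index file rather than to the BAM itself, so one thing worth ruling out is that the index files were passed where the BAM files were expected. Below is a minimal sketch for checking what a file actually contains, assuming the magic bytes given in the SAM/BAM specification (BAI\1 for an index, BAM\1 inside a gzip-compatible BGZF stream for a BAM). The function name sniff_alignment_file is hypothetical and is not part of GroopM, BamTyper or pysam.

```python
import gzip
import zlib


def sniff_alignment_file(path):
    """Guess the type of an alignment-related file from its first bytes.

    Hypothetical helper, not part of GroopM or pysam. Returns one of
    "bam", "bai", "sam", "gzip" or "unknown".
    """
    with open(path, "rb") as handle:
        head = handle.read(4)

    # A samtools BAI index is uncompressed and starts with the bytes BAI\1.
    if head == b"BAI\x01":
        return "bai"

    # BAM is BGZF-compressed, which is readable as ordinary gzip;
    # the decompressed stream starts with the bytes BAM\1.
    if head[:2] == b"\x1f\x8b":
        try:
            with gzip.open(path, "rb") as handle:
                magic = handle.read(4)
        except (OSError, EOFError, zlib.error):
            return "unknown"
        return "bam" if magic == b"BAM\x01" else "gzip"

    # SAM is plain text; when a header is present its lines start with '@'.
    # A headerless SAM file would fall through to "unknown".
    if head[:1] == b"@":
        return "sam"

    return "unknown"
```

Calling sniff_alignment_file on each path given to groopm parse and confirming that every one reports "bam" would show whether the inputs are the sorted BAM files or their indexes. This only inspects the leading bytes; it does not validate the rest of the file.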

minillinim commented 9 years ago

This sounds like an old bug. GroopM is now at version 0.3.0.

Could you please upgrade to the latest version and see if the bug persists?