centre-for-microbiome-research / GroopM

Metagenomic binning suite
GNU General Public License v3.0

IndexError while building database using groopm parse #8

Open dburkhardt opened 9 years ago

dburkhardt commented 9 years ago

First off, thanks for working on this, I'm super excited to use this on some soil metagenome data I've been playing with.

I started to parse three mapping files to build a database and hit an IndexError. Is it because my BAM files are too large? I was surprised this happened even though I only used three files as a test run. The BAM files are 1.3G, 8.4G and 1.4G.

I noticed the following line in my output

*** Error in `/usr/bin/python': double free or corruption (out): 0x000000005ac1da20 ***

This looked familiar because I saw something similar when running nosetests. I've pasted the output from nosetests at the end of this.

GroopM_parse.log:

$ groopm parse test.gm ~/binning_files/1018256.scaffolds.fasta TGACCA.bam TTAGGC.bam TCGGCA.bam

[[GroopM 0.3.3]] Running in data parsing mode...

Creating new database test.gm
Parsing contigs
Parsing BAM files using 1 threads
Parsing file: TGACCA.bam
Parsing file: TTAGGC.bam
*** Error in `/usr/bin/python': double free or corruption (out): 0x000000005ac1da20 ***
Error creating database: test.gm <type 'exceptions.IndexError'>
Unexpected error: <type 'exceptions.IndexError'>
Traceback (most recent call last):
  File "/usr/local/bin/groopm", line 381, in <module>
    GM_parser.parseOptions(args)
  File "/usr/local/lib/python2.7/dist-packages/groopm/groopm.py", line 117, in parseOptions
    threads=options.threads)
  File "/usr/local/lib/python2.7/dist-packages/groopm/mstore.py", line 276, in createDB
    threads)
  File "/usr/local/lib/python2.7/dist-packages/groopm/mstore.py", line 1773, in parse
    return ([BP.BFI.bamFiles[i].fileName for i in range(len(bamFiles))],
IndexError: list index out of range

Nosetests.log:

dan@computobacter:~/software/BamM$ nosetests
.*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001b80bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001b80bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001b80bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001bb5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001bb5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001bb5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000017c2bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000017c2bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000017c2bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x000000000132ebc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x000000000132ebc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x000000000132ebc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000025e1060 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000025e1060 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000025e1060 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000012f6bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000012f6bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000012f6bc0 ***

............................................

Ran 45 tests in 7.338s

OK
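
One reading of the GroopM traceback above, offered as an assumption rather than a diagnosis: the list comprehension in mstore.py indexes `BP.BFI.bamFiles` with positions taken from the length of `bamFiles`. If the parser stopped early (here, right after the double free on the second file), the two lists could differ in length and the lookup would run off the end. The sketch below reproduces that pattern with plain Python lists. The function `file_names` and the dictionary records are hypothetical stand-ins, not GroopM or BamM code.

```python
def file_names(parsed_files, requested_paths):
    """Return the names of parsed files, failing loudly on a count mismatch."""
    if len(parsed_files) != len(requested_paths):
        raise RuntimeError(
            "parsed %d of %d BAM files; the parser probably stopped early"
            % (len(parsed_files), len(requested_paths))
        )
    return [record["fileName"] for record in parsed_files]


requested = ["TGACCA.bam", "TTAGGC.bam", "TCGGCA.bam"]
# Assumption for illustration: the C layer failed, so only one record came back.
parsed = [{"fileName": "TGACCA.bam"}]

# The pattern from the traceback: index one list using the length of another.
try:
    names = [parsed[i]["fileName"] for i in range(len(requested))]
except IndexError as err:
    print("unguarded:", err)

# The guarded version reports the mismatch instead of a bare IndexError.
try:
    file_names(parsed, requested)
except RuntimeError as err:
    print("guarded:", err)
```

If this reading is right, the IndexError is only a symptom, and the memory error reported by glibc is the thing to chase.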

minillinim commented 9 years ago

Hey,

This error most often occurs when BamM is not installed properly. Do you have the install logs? If not, you can re-run the install procedure and send me a copy of what spews out to the screen. Then I can look into it.

Thanks.
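
Before digging through install logs, a quick first-pass check is whether the shared libraries a ctypes-based package depends on can be located and loaded at all. The sketch below uses only the standard `ctypes` module; the helper name `check_shared_libs` is hypothetical, and the library names `hts` and `cfu` are taken from the dependencies mentioned in this thread. It is written for Python 3 and says nothing about whether the libraries were built correctly, only whether the dynamic loader can find them.

```python
import ctypes
import ctypes.util


def check_shared_libs(names):
    """Map each short library name to True if ctypes can locate and load it."""
    status = {}
    for name in names:
        # find_library takes the name without the "lib" prefix or suffix.
        located = ctypes.util.find_library(name)
        if located is None:
            status[name] = False
            continue
        try:
            ctypes.CDLL(located)
            status[name] = True
        except OSError:
            status[name] = False
    return status


for lib, ok in sorted(check_shared_libs(["hts", "cfu", "m"]).items()):
    print(lib, "loadable" if ok else "NOT loadable")
```

A library that was installed only as a static archive (`.a`) will also show as not loadable here, which does not by itself mean the install is broken.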

dburkhardt commented 9 years ago

I definitely had some issues installing at almost every step (the htslib and libcfu dependencies were the worst). Here's the output from setup.py install in BamM. Should I try reinstalling everything again? Do you want the output for the dependencies?

Dan

BamM install output:

$ sudo python setup.py install
[sudo] password for dan:
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for lgamma,log,exp in -lm... yes
checking for libcfu headers in /usr/local/include... found
checking for libcfu libraries in /usr/local/lib... found
checking for libhts headers in /usr/local/include... found
checking for libhts libraries in /usr/local/lib... found
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: config.h is unchanged
rm -f bamParser
rm -f bamExtractor
rm -f *.o
rm -f libBamM.a
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o bamExtractor.o bamExtractor.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o bamParser.o bamParser.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o pairedLink.o pairedLink.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o bamRead.o bamRead.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o coverageEstimators.o coverageEstimators.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -static-libgcc -shared -Wl,-rpath,/usr/local/lib,-soname,libBamM.so.0 -o libBamM.a bamExtractor.c bamParser.c pairedLink.c bamRead.c coverageEstimators.c -lm -L/usr/local/lib -lcfu -L/usr/local/lib -lhts
Building library
/usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running install
running build
running build_py
running build_scripts
running install_lib
running install_scripts
changing mode of /usr/local/bin/bamm to 775
changing mode of /usr/local/bin/bamFlags to 775
running install_data
copying c/libBamM.a -> /usr/local/lib/python2.7/dist-packages/bamm/c/
running install_egg_info
Removing /usr/local/lib/python2.7/dist-packages/BamM-1.3.1.egg-info
Writing /usr/local/lib/python2.7/dist-packages/BamM-1.3.1.egg-info

minillinim commented 9 years ago

Hi,

Your install looks OK. I checked through the code and removed some dodginess in a few of the free statements. I don't know whether this will fix your problem, but thanks for pointing me to a bit of code I can fix. The new version is 1.3.2. Could you pull it down and run the nosetests on it, please?

dburkhardt commented 9 years ago

I still get the same nosetests errors. I pulled, reran setup.py install, and then ran nosetests. Here's the output, and thanks again for your help:

$ git pull
remote: Counting objects: 10, done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 10 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (10/10), done.
From https://github.com/minillinim/BamM
   7c942a1..8fa33a4  master -> origin/master
Updating 7c942a1..8fa33a4
Fast-forward
 bin/bamm | 2 +-
 c/bamExtractor.c | 6 ++++--
 c/bamParser.c | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------
 c/bamRead.c | 22 ++++++++++++++--------
 c/pairedLink.c | 57 +++++++++++++++++++++++++++++++++++++++++++++------------
 setup.py | 2 +-
 6 files changed, 172 insertions(+), 53 deletions(-)

$ sudo python setup.py install
[sudo] password for dan:
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for lgamma,log,exp in -lm... yes
checking for libcfu headers in /usr/local/include... found
checking for libcfu libraries in /usr/local/lib... found
checking for libhts headers in /usr/local/include... found
checking for libhts libraries in /usr/local/lib... found
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: config.h is unchanged
rm -f bamParser
rm -f bamExtractor
rm -f *.o
rm -f libBamM.a
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o bamExtractor.o bamExtractor.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o bamParser.o bamParser.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o pairedLink.o pairedLink.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o bamRead.o bamRead.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -c -o coverageEstimators.o coverageEstimators.c
gcc -g -fPIC -pthread -O2 -Wall -I/usr/local/include -I/usr/local/include -static-libgcc -shared -Wl,-rpath,/usr/local/lib,-soname,libBamM.so.0 -o libBamM.a bamExtractor.c bamParser.c pairedLink.c bamRead.c coverageEstimators.c -lm -L/usr/local/lib -lcfu -L/usr/local/lib -lhts
Building library
/usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running install
running build
running build_py
running build_scripts
copying and adjusting bin/bamm -> build/scripts-2.7
running install_lib
running install_scripts
copying build/scripts-2.7/bamm -> /usr/local/bin
changing mode of /usr/local/bin/bamm to 775
changing mode of /usr/local/bin/bamFlags to 775
running install_data
copying c/libBamM.a -> /usr/local/lib/python2.7/dist-packages/bamm/c/
running install_egg_info
Writing /usr/local/lib/python2.7/dist-packages/BamM-1.3.2.egg-info

$ bamm

                          ...::: BamM :::...

                Working with the BAM, not against it...

                              version: 1.3.2

bamm make     ->  Make BAM/TAM files (sorted + indexed)
bamm parse    ->  Get coverage profiles / linking reads / insert types
bamm extract  ->  Extract reads / headers from BAM files

USE: bamm OPTION -h to see detailed options

$ nosetests
.*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000000fe9bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000000fe9bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000000fe9bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001fe5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001fe5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001fe5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002017bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002017bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002017bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002c42bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002c42bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002c42bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001e40bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001e40bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001e40bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001aebbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001aebbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001aebbc0 ***

............................................

Ran 45 tests in 7.999s

OK

dburkhardt commented 9 years ago

I'm still getting the same error with version 1.3.3. I got weird output from nosetests the first time so I ran it again. See below.

dan$ bamm

                          ...::: BamM :::...

                Working with the BAM, not against it...

                              version: 1.3.3

bamm make     ->  Make BAM/TAM files (sorted + indexed)
bamm parse    ->  Get coverage profiles / linking reads / insert types
bamm extract  ->  Extract reads / headers from BAM files

USE: bamm OPTION -h to see detailed options

dan$ nosetests
.*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001850bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001850bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001850bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x000000000285fbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x000000000285fbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x000000000285fbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001220bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001220bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001c66bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001c66bc0 ***
*** Error in `/usr/bin/python': *** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001687bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001687bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002930bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002930bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002930bc0 ***


............................................

Ran 45 tests in 8.798s

OK

dan$ nosetests
.*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000021aabc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000021aabc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001aa0bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000001aa0bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000000f4bbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000000f4bbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000000f4bbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002a6bbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002a6bbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002ad5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002ad5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x0000000002ad5bc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000029edbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000029edbc0 ***
*** Error in `/usr/bin/python': double free or corruption (fasttop): 0x00000000029edbc0 ***

............................................

Ran 45 tests in 8.387s

OK

minillinim commented 9 years ago

The error is within Python. I can create a similar problem if I use ctypes with Python 2.6.x. I've uploaded a small script to the BamM web page called sysInfo.py. Could you please run that and post the output here? Thanks.
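
For readers without access to that script, the following is a rough sketch of the kind of environment report it asks for: interpreter version, platform, module search path, and the version and location of each scientific package. It is not the actual sysInfo.py, whose contents are not shown in this thread; the helper `describe_modules` is a hypothetical name, the module list is an assumption, and the code is written for Python 3.

```python
import importlib
import platform
import sys


def describe_modules(names):
    """Return {name: (version, location)} for importable modules, else None."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        report[name] = (
            getattr(module, "__version__", "unknown"),
            getattr(module, "__file__", "built-in"),
        )
    return report


# Assumed package list; adjust to whatever the tool under test depends on.
packages = ["tables", "numpy", "scipy", "matplotlib"]
for name, info in sorted(describe_modules(packages).items()):
    print(name, info if info is not None else "not importable")
print(platform.system(), platform.release())
print(sys.version)
print(sys.path)
```

The locations matter as much as the versions: a package found under /usr/local alongside others under /usr/lib can point to a mix of pip-installed and distribution-installed modules.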

dburkhardt commented 9 years ago

Sure! Thanks again for your help.

dan@computobacter:~/archive/01.06.15_Prepping_for_GroopM$ python sysInfo.py
Tables version: 3.1.1 Location: /usr/lib/python2.7/dist-packages/tables/__init__.pyc
Numpy version: 1.9.1 Location: /usr/local/lib/python2.7/dist-packages/numpy/__init__.pyc
Scipy version: 0.13.3 Location: /usr/lib/python2.7/dist-packages/scipy/__init__.pyc
MPL version: 1.3.1 Location: /usr/lib/pymodules/python2.7/matplotlib/__init__.pyc
Linux 3.13.0-37-generic
2.7.6 (default, Mar 22 2014, 22:59:56) [GCC 4.8.2]
['/home/dan/archive/01.06.15_Prepping_for_GroopM', '/usr/local/lib/python2.7/dist-packages/pplacer_scripts-unknown-py2.7.egg', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.7', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client']

minillinim commented 9 years ago

Should be fixed in BamM 1.3.4

Hopefully...

Thanks for your patience

dburkhardt commented 9 years ago

So close! I don't see those double free errors, but one of the tests fails now. Here's the output:

dan@computobacter:~$ bamm

                          ...::: BamM :::...

                Working with the BAM, not against it...

                              version: 1.3.5

bamm make     ->  Make BAM/TAM files (sorted + indexed)
bamm parse    ->  Get coverage profiles / linking reads / insert types
bamm extract  ->  Extract reads / headers from BAM files

USE: bamm OPTION -h to see detailed options

dan@computobacter:~$ cd software/BamM/
dan@computobacter:~/software/BamM$ nosetests
......E......................................

ERROR: Test creation of TAM file with output prefix.

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/dan/software/BamM/bamm/tests/test_cli.py", line 133, in testOutputTam
    quiet=True)
  File "/home/dan/software/BamM/bamm/bamMaker.py", line 131, in __init__
    raise InvalidParameterSetException('Specified database is not a valid file')
InvalidParameterSetException: Specified database is not a valid file

Ran 45 tests in 6.849s

FAILED (errors=1)

minillinim commented 9 years ago

Oops. Deleted some files accidentally. Fixed now (1.3.6).

dburkhardt commented 9 years ago

This resolves my issue. GroopM and BamM both seem to work now. Thank you so much for your help!

Dan

minillinim commented 9 years ago

Yay!
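
Going by this thread, the double free errors were gone by BamM 1.3.5 and the test suite was repaired in 1.3.6, so anyone hitting the same symptoms may want to confirm which version they have. The sketch below pulls the version out of banner text like the `bamm` output quoted earlier and compares it with a minimum. The helper names `banner_version` and `is_at_least` are hypothetical, and treating 1.3.6 as the minimum is an inference from the comments above, not a documented requirement.

```python
import re


def banner_version(banner_text):
    """Extract the dotted version from banner text as a tuple of ints, or None."""
    match = re.search(r"version:\s*(\d+(?:\.\d+)*)", banner_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_at_least(banner_text, minimum):
    """True if the banner reports a version >= minimum, e.g. (1, 3, 6)."""
    found = banner_version(banner_text)
    return found is not None and found >= minimum


# Banner excerpt copied from the output posted in this thread.
banner = """
                          ...::: BamM :::...
                              version: 1.3.5
"""
print(banner_version(banner), is_at_least(banner, (1, 3, 6)))
```

Comparing tuples of integers avoids the string-comparison trap where "1.3.10" would sort before "1.3.6".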