centre-for-microbiome-research / GroopM

Metagenomic binning suite
GNU General Public License v3.0

groopm parse fails with "Distance matrix 'X' must be symmetric" #23

Open fungs opened 8 years ago

fungs commented 8 years ago

Hello,

I'm running GroopM on a simulated dataset with small (1 kb) contigs. The parse step failed with the following error:

*******************************************************************************
 [[GroopM 0.3.4]] Running in data parsing mode...
*******************************************************************************
Creating new database db.gm
Parsing contigs
Parsing BAM files using 40 threads
Parsing file: genomes.primary.sorted.bam
Parsing file: genomes.secondary.1.sorted.bam
Parsing file: genomes.secondary.2.sorted.bam
Parsing file: genomes.secondary.3.sorted.bam
****************************************************************
 IMPORTANT! - there are 1 contigs with 0 coverage
 across all stoits. They will be ignored:
****************************************************************
substr(2500001,2501000)_genome123
****************************************************************
    Reticulating splines
    Dimensionality reduction
Error creating database: db.gm <type 'exceptions.ValueError'>
Unexpected error: <type 'exceptions.ValueError'>
Traceback (most recent call last):
  File "/home/johdro/tmp/GroopM-0.3.4/bin/groopm", line 381, in <module>
    GM_parser.parseOptions(args)
  File "/home/johdro/tmp/GroopM-0.3.4/local/lib/python2.7/site-packages/groopm/groopm.py", line 117, in parseOptions
    threads=options.threads)
  File "/home/johdro/tmp/GroopM-0.3.4/local/lib/python2.7/site-packages/groopm/mstore.py", line 359, in createDB
    CT.transformCP()
  File "/home/johdro/tmp/GroopM-0.3.4/local/lib/python2.7/site-packages/groopm/mstore.py", line 1824, in transformCP
    self.shuffleBAMs()
  File "/home/johdro/tmp/GroopM-0.3.4/local/lib/python2.7/site-packages/groopm/mstore.py", line 1916, in shuffleBAMs
    dists = squareform(sq_dists)
  File "/home/johdro/tmp/GroopM-0.3.4/local/lib/python2.7/site-packages/scipy/spatial/distance.py", line 1519, in squareform
    is_valid_dm(X, throw=True, name='X')
  File "/home/johdro/tmp/GroopM-0.3.4/local/lib/python2.7/site-packages/scipy/spatial/distance.py", line 1602, in is_valid_dm
    'symmetric.') % name)
ValueError: Distance matrix 'X' must be symmetric.

Any clue what the cause might be?

timbalam commented 8 years ago

Hi Johannes,

Thanks for the report.

I'm currently developing a new version of GroopM. As I am focusing on that, I won't be fixing the old version, sorry. This issue should not appear in the new version.

Cheers, Tim

fungs commented 8 years ago

Is there any version that I can try?

timbalam commented 8 years ago

It's not quite ready to be released yet, sorry. I'll let you know.

wwood commented 8 years ago

Hi @fungs, sorry we couldn't be more helpful.

I wonder if you could try (1) removing the contig with zero coverage from the input contigs file, and (2) only using a single thread.

If that doesn't work, then I'm out of ideas too. Thanks for your interest.
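
Suggestion (1) above can be done with a few lines of Python. This is only a minimal sketch, not part of GroopM: the function names `drop_contigs` and `filter_fasta_file` are hypothetical, and it assumes the contig ID is the header text up to the first whitespace, so the ID passed in must match the header exactly.

```python
def drop_contigs(fasta_lines, names_to_drop):
    """Yield FASTA lines, skipping every record whose ID is in names_to_drop.

    Assumption: the record ID is the header text after '>' up to the
    first whitespace character.
    """
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            keep = record_id not in names_to_drop
        if keep:
            yield line


def filter_fasta_file(in_path, out_path, names_to_drop):
    """Copy in_path to out_path without the named records (paths are up to the caller)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_contigs(src, set(names_to_drop)))


# Small in-memory demonstration using the contig ID from the log above.
example = [
    ">contig_a some description\n", "ACGT\n",
    ">substr(2500001,2501000)_genome123\n", "TTTT\n",
    ">contig_b\n", "GG\n",
]
filtered = list(drop_contigs(example, {"substr(2500001,2501000)_genome123"}))
```

After filtering, the BAM files still contain the removed contig in their headers, so whether the parse step accepts the mismatch is something to check rather than assume.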

fungs commented 8 years ago

I removed the zero-coverage contig from the FASTA file and ran with a single thread, but it still gives the same error.

fungs commented 8 years ago

BTW: Why are you using the outdated Python 2.7? Apart from bug fixes, no real development has been done on this version for years, and it will be completely retired in three years.

https://pythonclock.org
http://legacy.python.org/dev/peps/pep-0373/

ashfricker commented 7 years ago

Hi, has this issue been solved? I'm currently running GroopM 0.3.4 and getting the same ValueError.

Thanks in advance!

jzrapp commented 7 years ago

Dear @wwood and @timbalam, I'm receiving the same error message. Is there a fix for this problem now? Or could you maybe explain what the error is trying to tell me? I feel like groopm and I are not speaking the same language. Is this a problem with my input data?

Thanks a lot for your help!
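
On what the message itself means: according to the traceback, `squareform` received a square 2-D matrix and ran SciPy's `is_valid_dm` check on it, which requires the matrix to equal its own transpose. The sketch below is plain Python, not GroopM or SciPy code, and `asymmetric_entries` is a hypothetical helper. It imitates an exact element-wise symmetry test to show that a single NaN is enough to fail it, because NaN never compares equal to itself. Whether NaN values are what actually ends up in GroopM's `sq_dists` is an assumption; nothing in this thread establishes the cause.

```python
import math


def asymmetric_entries(matrix):
    """Return the (i, j) pairs with i < j where matrix[i][j] != matrix[j][i]."""
    n = len(matrix)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] != matrix[j][i]
    ]


# A well-formed distance matrix: every entry matches its mirror entry.
good = [
    [0.0, 1.5],
    [1.5, 0.0],
]

# Looks symmetric on paper, but NaN != NaN, so the mirrored pair "differs".
with_nan = [
    [0.0, math.nan],
    [math.nan, 0.0],
]

good_mismatches = asymmetric_entries(good)
nan_mismatches = asymmetric_entries(with_nan)
```

Here `good_mismatches` is empty, while `nan_mismatches` reports the pair `(0, 1)`. Running a check like this on the matrix passed to `squareform` would at least show which entries are involved.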

jzrapp commented 7 years ago

The issue was solved after re-running using only 1 thread.

transcript commented 6 years ago

I'm running into this same issue (GroopM 0.3.4, running on only 1 thread). There doesn't seem to be any solution?

wwood commented 6 years ago

Hi, unfortunately this version of GroopM (i.e. versions < 2) is not maintained. I encourage you to try the new version at https://github.com/timbalam/GroopM. The binning algorithm changed markedly, so this error shouldn't happen any more.

jvollme commented 5 years ago

It appears that the version you are referring to is not the one that gets installed with "pip install groopm"? Your fork refers to the installation instructions on the GroopM page, which recommend pip, which installs version 0.3.4... (And I just ran into this exact error again with that version.)

jvollme commented 5 years ago

Have the dependencies remained the same? GroopM 0.3.4 required outdated versions of pytables (3.1.1) and numexpr (2.3.1), and would not install or run with newer ones.