ablab / spades

SPAdes Genome Assembler
http://ablab.github.io/spades/

Huge difference of speed between 2 computers #148

Open michoug opened 6 years ago

michoug commented 6 years ago

Hi, I'm not sure whether this is a SPAdes-specific issue, but I ran the same command on two Linux computers because of a memory issue and observed a huge difference in speed, even though the slow machine has more memory and cores. Here is the beginning of both logs:

Fast

Command line: /home/linuxbrew/.linuxbrew/bin/spades.py  --meta  -t      8       -m      110     -1      /home/michoug/testF.fq.gz       -2      /home/michoug/testR.fq.gz       -o      /home/michoug/Test2     

System information:
  SPAdes version: 3.12.0
  Python version: 2.7.13
  OS: Linux-4.13.0-38-generic-x86_64-with-debian-stretch-sid

Output dir: /home/michoug/Test2
Mode: read error correction and assembling
Debug mode is turned OFF

Dataset parameters:
  Metagenomic mode
  Reads:
    Library number: 1, library type: paired-end
      orientation: fr
      left reads: ['/home/michoug/testF.fq.gz']
      right reads: ['/home/michoug/testR.fq.gz']
      interlaced reads: not specified
      single reads: not specified
      merged reads: not specified
Read error correction parameters:
  Iterations: 1
  PHRED offset will be auto-detected
  Corrected reads will be compressed
Assembly parameters:
  k: [21, 33, 55]
  Repeat resolution is enabled
  Mismatch careful mode is turned OFF
  MismatchCorrector will be SKIPPED
  Coverage cutoff is turned OFF
Other parameters:
  Dir for temp files: /home/michoug/Test2/tmp
  Threads: 8
  Memory limit (in Gb): 110

======= SPAdes pipeline started. Log can be found here: /home/michoug/Test2/spades.log

===== Read error correction started. 

== Running read error correction tool: /home/linuxbrew/.linuxbrew/Cellar/spades/3.12.0/bin/spades-hammer /home/michoug/Test2/corrected/configs/config.info

  0:00:00.000     4M / 4M    INFO    General                 (main.cpp                  :  75)   Starting BayesHammer, built from N/A, git revision N/A
  0:00:00.000     4M / 4M    INFO    General                 (main.cpp                  :  76)   Loading config from /home/michoug/Test2/corrected/configs/config.info
  0:00:00.000     4M / 4M    INFO    General                 (main.cpp                  :  78)   Maximum # of threads to use (adjusted due to OMP capabilities): 8
  0:00:00.000     4M / 4M    INFO    General                 (memory_limit.cpp          :  49)   Memory limit set to 110 Gb
  0:00:00.000     4M / 4M    INFO    General                 (main.cpp                  :  86)   Trying to determine PHRED offset
  0:00:00.000     4M / 4M    INFO    General                 (main.cpp                  :  92)   Determined value is 33
  0:00:00.001     4M / 4M    INFO    General                 (hammer_tools.cpp          :  36)   Hamming graph threshold tau=1, k=21, subkmer positions = [ 0 10 ]
  0:00:00.001     4M / 4M    INFO    General                 (main.cpp                  : 113)   Size of aux. kmer data 24 bytes
     === ITERATION 0 begins ===
  0:00:00.001     4M / 4M    INFO   K-mer Counting           (kmer_data.cpp             : 280)   Estimating k-mer count
  0:00:00.068   132M / 132M  INFO   K-mer Counting           (kmer_data.cpp             : 285)   Processing /home/michoug/testF.fq.gz
  0:04:53.507   160M / 160M  INFO   K-mer Counting           (kmer_data.cpp             : 294)   Processed 92183114 reads
  0:04:53.507   160M / 160M  INFO   K-mer Counting           (kmer_data.cpp             : 285)   Processing /home/michoug/testR.fq.gz
  0:09:47.168   160M / 160M  INFO   K-mer Counting           (kmer_data.cpp             : 294)   Processed 184366228 reads

Slow

System information:
  SPAdes version: 3.12.0
  Python version: 2.7.13
  OS: Linux-2.6.32-642.4.2.el6.x86_64-x86_64-with-centos-6.8-Final

Output dir: /home/michoug/AssemblyTara
Mode: read error correction and assembling
Debug mode is turned OFF

Dataset parameters:
  Metagenomic mode
  Reads:
    Library number: 1, library type: paired-end
      orientation: fr
      left reads: ['/home/michoug/testF.fq.gz']
      right reads: ['/home/michoug/testR.fq.gz']
      interlaced reads: not specified
      single reads: not specified
      merged reads: not specified
Read error correction parameters:
  Iterations: 1
  PHRED offset will be auto-detected
  Corrected reads will be compressed
Assembly parameters:
  k: [21, 33, 55]
  Repeat resolution is enabled
  Mismatch careful mode is turned OFF
  MismatchCorrector will be SKIPPED
  Coverage cutoff is turned OFF
Other parameters:
  Dir for temp files: /home/michoug/AssemblyTara/tmp
  Threads: 8
  Memory limit (in Gb): 220

======= SPAdes pipeline started. Log can be found here: /home/michoug/AssemblyTara/spades.log

===== Read error correction started. 

== Running read error correction tool: /home/michoug/.linuxbrew/Cellar/spades/3.12.0/bin/spades-hammer /home/michoug/AssemblyTara/corrected/configs/config.info

  0:00:00.000     4M / 4M    INFO    General                 (main.cpp                  :  75)   Starting BayesHammer, built from N/A, git revision N/A
  0:00:00.018     4M / 4M    INFO    General                 (main.cpp                  :  76)   Loading config from /home/michoug/AssemblyTara/corrected/configs/config.info
  0:00:00.034     4M / 4M    INFO    General                 (main.cpp                  :  78)   Maximum # of threads to use (adjusted due to OMP capabilities): 8
  0:00:00.038     4M / 4M    INFO    General                 (memory_limit.cpp          :  49)   Memory limit set to 220 Gb
  0:00:00.038     4M / 4M    INFO    General                 (main.cpp                  :  86)   Trying to determine PHRED offset
  0:00:00.041     4M / 4M    INFO    General                 (main.cpp                  :  92)   Determined value is 33
  0:00:00.044     4M / 4M    INFO    General                 (hammer_tools.cpp          :  36)   Hamming graph threshold tau=1, k=21, subkmer positions = [ 0 10 ]
  0:00:00.044     4M / 4M    INFO    General                 (main.cpp                  : 113)   Size of aux. kmer data 24 bytes
     === ITERATION 0 begins ===
  0:00:00.044     4M / 4M    INFO   K-mer Counting           (kmer_data.cpp             : 280)   Estimating k-mer count
  0:00:00.862   260M / 260M  INFO   K-mer Counting           (kmer_data.cpp             : 285)   Processing /home/michoug/testF.fq.gz
  2:46:25.984   160M / 160M  INFO   K-mer Counting           (kmer_data.cpp             : 294)   Processed 92183114 reads
  2:46:25.984   160M / 160M  INFO   K-mer Counting           (kmer_data.cpp             : 285)   Processing /home/michoug/testR.fq.gz
  5:17:58.186   160M / 160M  INFO   K-mer Counting           (kmer_data.cpp             : 294)   Processed 184366228 reads
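
To put a number on the gap, the timestamps of the last "Processed ... reads" line in each log can be converted to seconds and compared. This is only a small sketch: the helper name `to_seconds` is made up for illustration, and the values are copied by hand from the two logs above.

```python
def to_seconds(stamp: str) -> float:
    """Convert a SPAdes log timestamp such as '2:46:25.984' to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Read count reported by the final "Processed ... reads" line in both logs.
TOTAL_READS = 184366228

# Elapsed time at that line on each machine (copied from the logs).
fast = to_seconds("0:09:47.168")
slow = to_seconds("5:17:58.186")

fast_rate = TOTAL_READS / fast  # reads per second on the fast machine
slow_rate = TOTAL_READS / slow  # reads per second on the slow machine
slowdown = slow / fast          # how many times longer the slow machine took

print(f"fast: {fast:.0f} s, {fast_rate:,.0f} reads/s")
print(f"slow: {slow:.0f} s, {slow_rate:,.0f} reads/s")
print(f"slowdown: {slowdown:.1f}x")
```

By this measure the slow machine needs a bit more than 32 times as long for the k-mer count estimation over the same two input files.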

Any idea what could be causing this?

asl commented 6 years ago

I would suspect the differences are due to extremely slow I/O.
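
One way to check the I/O hypothesis outside of SPAdes is to time how fast each machine can read one of the input files, once as raw bytes and once through gzip decompression. The script below is a rough sketch using only the Python standard library; the function name `read_throughput`, the 1 MiB chunk size, and the 1 GiB default limit are arbitrary choices for illustration, not anything SPAdes does internally.

```python
import gzip
import sys
import time

CHUNK = 1 << 20  # read in 1 MiB chunks


def read_throughput(path, opener=open, limit_bytes=1 << 30):
    """Return MiB/s achieved while reading up to limit_bytes via opener.

    With opener=open this measures raw reads of the compressed file from
    storage; with opener=gzip.open it measures reading plus decompression
    (the limit then applies to the decompressed bytes).
    """
    total = 0
    start = time.perf_counter()
    with opener(path, "rb") as handle:
        while total < limit_bytes:
            chunk = handle.read(CHUNK)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / (1 << 20) / max(elapsed, 1e-9)


if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]  # e.g. the path to testF.fq.gz
    print(f"raw read:   {read_throughput(target):.1f} MiB/s")
    print(f"gzip read:  {read_throughput(target, opener=gzip.open):.1f} MiB/s")
```

If the raw read rate is far lower on the slow machine, storage (for example a network or heavily shared filesystem) is the likely culprit; if only the gzip rate differs, the bottleneck is more likely CPU-side. Note that a second run over the same file may be served from the OS page cache and look much faster than the first.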