bcgsc / abyss

:microscope: Assemble large genomes using short reads
http://www.bcgsc.ca/platform/bioinfo/software/abyss

abyss-2.0.2 k=156 does not finish contigging (Timeout) #200

Closed: mmokrejs closed this issue 6 years ago

mmokrejs commented 6 years ago

Hi, for the last 3-4 weeks I have been struggling with attempts to get the fish genome assembled (see issue #187). I successfully received k64, k96, k112, and k128 assemblies, but although I recompiled abyss with --enable-maxk=256 for the second or third time, I did not get anything for k156 or k192. Step one (contigging) never finishes; the only output is a non-empty coverage.hist file. For k128, say, the contigging task completed after 25 hrs. I let the k156 job run for 260 hrs wallclock and then ran strace to see what the ABYSS-P and mpirun processes were doing. The strace revealed that both were getting Timeout from the other side. I re-ran the k156 job, and now, after almost 93 hrs wallclock, I am losing my patience again.

The job is running on 104 CPU cores, and every one of them is fully consumed (just polling?).

$ tail -n 20 abyss_ecc.68234.log
8: Loaded 92101148 k-mer.
8: Hash load: 92101148 / 268435456 = 0.343 using 7.22 GB
9: Loaded 92055757 k-mer.
9: Hash load: 92055757 / 268435456 = 0.343 using 7.22 GB
10: Loaded 91597896 k-mer.
10: Hash load: 91597896 / 268435456 = 0.341 using 7.19 GB
4: Loaded 92236433 k-mer.
4: Hash load: 92236433 / 268435456 = 0.344 using 7.23 GB
Minimum k-mer coverage is 7
0: Coverage: 7  Reconstruction: 1293366430
0: Coverage: 5.48   Reconstruction: 1331822195
0: Coverage: 5.39   Reconstruction: 1331822195
Using a coverage threshold of 5...
The median k-mer coverage is 29
The reconstruction is 1331822195
The k-mer coverage threshold is 5.39
Setting parameter e (erode) to 5
Setting parameter E (erodeStrand) to 1
Setting parameter c (coverage) to 5.39
Finding adjacenct k-mer...

Would you please introduce some periodic logging during this stage, to assure users that abyss is doing something meaningful?

Here is what the mpirun parent job is doing:

$ strace -v -f -p 68269
Process 68269 attached with 5 threads
[pid 68276] select(24, [22 23], NULL, NULL, {1, 360106} <unfinished ...>
[pid 68275] epoll_wait(18,  <unfinished ...>
[pid 68274] select(17, [13 16], NULL, NULL, {1220, 193609} <unfinished ...>
[pid 68273] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 68269] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 68276] <... select resumed> )      = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 68276] select(24, [22 23], NULL, NULL, {2, 0}^CProcess 68269 detached
Process 68273 detached
Process 68274 detached
Process 68275 detached
Process 68276 detached
 <detached ...>

and here is what I see when I attach to one of the ABYSS-P processes:

$ strace -v -f -p 68552
Process 68552 attached with 3 threads
[pid 68588] epoll_wait(10,  <unfinished ...>
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68586] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid 68552] poll([{fd=5, events=POLLIN}, {fd=15, events=POLLIN}], 2, 0) = 0 (Timeout)
^CProcess 68552 detached
Process 68586 detached
Process 68588 detached

Here are both attempts. The first consumed 1.2 TB RAM and ran for 260 hrs; the second is the same but uses error-corrected paired-end reads, which lowered the memory footprint (725 GB RAM needed). However, even this second attempt seems to be stuck.

[screenshots: abyss_k156__not_finishing__memory_usage, abyss_k156__not_finishing__cpu_usage]

What is the maximum ./configure --enable-maxk value? I used openmpi-2.x, but for the very last attempt also openmpi-3.0 (with abyss recompiled against it). As you can see, the issue lies somewhere else. Thank you for any clues and, most importantly, debug patches.

mmokrejs commented 6 years ago

The logfile from the k128 assembly contained:

...
42: Loaded 184397537 k-mer.
42: Hash load: 184397537 / 268435456 = 0.687 using 7.94 GB
43: Loaded 183281971 k-mer.
43: Hash load: 183281971 / 268435456 = 0.683 using 7.9 GB
44: Loaded 184793758 k-mer.
44: Hash load: 184793758 / 268435456 = 0.688 using 7.96 GB
Minimum k-mer coverage is 9
0: Coverage: 9  Reconstruction: 1283891841
0: Coverage: 5.92       Reconstruction: 1326993365
Using a coverage threshold of 6...
The median k-mer coverage is 35
The reconstruction is 1326993365
The k-mer coverage threshold is 5.92
Setting parameter e (erode) to 6
Setting parameter E (erodeStrand) to 1
Setting parameter c (coverage) to 5.92
Finding adjacenct k-mer...
101: Finding adjacent k-mer: 1000000
62: Finding adjacent k-mer: 1000000
85: Finding adjacent k-mer: 1000000
87: Finding adjacent k-mer: 1000000
90: Finding adjacent k-mer: 1000000
71: Finding adjacent k-mer: 1000000
...

so it seems that in the k156 runs the individual processes never actually started this step. Damn, I wasted so much CPU time. For k128 the contigs were output 2 hrs wallclock after coverage.hist was written, so I really could have realized this much sooner.

sjackman commented 6 years ago

This may be helpful to you: https://github.com/bcgsc/abyss/wiki/ABySS-Users-FAQ#2-my-abyss-assembly-jobs-hang-when-i-run-them-with-high-k-values-eg-k250

benvvalk commented 6 years ago

Thank you, @sjackman, I was going to suggest the same.

mmokrejs commented 6 years ago

Thank you, jobs re-started. Let's see. However, consider testing programmatically that the MPI processes can actually talk to each other. It is nice to learn about the FAQ, but ...

mmokrejs commented 6 years ago

It doesn't help, I think.

$ tail -f abyss_ecc.60633.log
0: Coverage: 5.48   Reconstruction: 1331822195
0: Coverage: 5.39   Reconstruction: 1331822195
Using a coverage threshold of 5...
The median k-mer coverage is 29
The reconstruction is 1331822195
The k-mer coverage threshold is 5.39
Setting parameter e (erode) to 5
Setting parameter E (erodeStrand) to 1
Setting parameter c (coverage) to 5.39
Finding adjacenct k-mer...
^C
$ ps -ef |  grep mpirun
mmokrejs  60660  60643  0 Mar08 ?        00:00:20 mpirun --mca btl_sm_eager_limit 16000 --mca btl_openib_eager_limit 16000 -np 104 ABYSS-P -k156 -q3 -v --coverage-hist=coverage.hist -s ...
$

$ strace -v -f -p 60660
Process 60660 attached with 5 threads
[pid 60665] select(17, [13 16], NULL, NULL, {982, 170599} <unfinished ...>
[pid 60664] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 60667] select(24, [22 23], NULL, NULL, {1, 540069} <unfinished ...>
[pid 60666] epoll_wait(18,  <unfinished ...>
[pid 60660] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 60667] <... select resumed> )      = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}) = 0 (Timeout)
[pid 60667] select(24, [22 23], NULL, NULL, {2, 0}^CProcess 60660 detached
Process 60664 detached
Process 60665 detached
Process 60666 detached
Process 60667 detached
 <detached ...>

mmokrejs commented 6 years ago

$ ompi_info --param  btl  all
                 MCA btl: vader (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA btl: tcp (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA btl: self (MCA v2.1.0, API v3.0.0, Component v3.0.0)
             MCA btl tcp: ---------------------------------------------------
             MCA btl tcp: parameter "btl_tcp_if_include" (current value: "",
                          data source: default, level: 1 user/basic, type:
                          string)
                          Comma-delimited list of devices and/or CIDR
                          notation of networks to use for MPI communication
                          (e.g., "eth0,192.168.0.0/16").  Mutually exclusive
                          with btl_tcp_if_exclude.
             MCA btl tcp: parameter "btl_tcp_if_exclude" (current value:
                          "127.0.0.1/8,sppp", data source: default, level: 1
                          user/basic, type: string)
                          Comma-delimited list of devices and/or CIDR
                          notation of networks to NOT use for MPI
                          communication -- all devices not matching these
                          specifications will be used (e.g.,
                          "eth0,192.168.0.0/16").  If set to a non-default
                          value, it is mutually exclusive with
                          btl_tcp_if_include.
             MCA btl tcp: parameter "btl_tcp_progress_thread" (current value:
                          "0", data source: default, level: 1 user/basic,
                          type: int)
$
$ cat /tmp/pbs.6230016.isrv5/ompi.uv1.3900/pid.60660/contact.txt 
2043871232.0;tcp://172.23.13.1,10.17.13.1,10.16.13.1:37386
60660
$

The addresses translate to bond0, bond1, bond2 interfaces. Why is localhost excluded?

$ ompi_info
                 Package: Open MPI mmokrejs@uv1 Distribution
                Open MPI: 3.0.0
  Open MPI repo revision: v3.0.0
   Open MPI release date: Sep 12, 2017
                Open RTE: 3.0.0
  Open RTE repo revision: v3.0.0
   Open RTE release date: Sep 12, 2017
                    OPAL: 3.0.0
      OPAL repo revision: v3.0.0
       OPAL release date: Sep 12, 2017
                 MPI API: 3.1.0
            Ident string: 3.0.0
                  Prefix: /apps/gentoo/usr
 Configured architecture: x86_64-pc-linux-gnu
          Configure host: uv1
           Configured by: mmokrejs
           Configured on: Sun Mar  4 19:24:26 CET 2018
          Configure host: uv1
  Configure command line: '--prefix=/apps/gentoo/usr'
                          '--build=x86_64-pc-linux-gnu'
                          '--host=x86_64-pc-linux-gnu'
                          '--mandir=/apps/gentoo/usr/share/man'
                          '--infodir=/apps/gentoo/usr/share/info'
                          '--datadir=/apps/gentoo/usr/share'
                          '--sysconfdir=/apps/gentoo/etc'
                          '--localstatedir=/apps/gentoo/var/lib'
                          '--disable-dependency-tracking'
                          '--disable-silent-rules'
                          '--docdir=/apps/gentoo/usr/share/doc/openmpi-3.0.0-r1'
                          '--htmldir=/apps/gentoo/usr/share/doc/openmpi-3.0.0-r1/html'
                          '--libdir=/apps/gentoo/usr/lib64'
                          '--sysconfdir=/apps/gentoo/etc/openmpi'
                          '--enable-pretty-print-stacktrace'
                          '--enable-orterun-prefix-by-default'
                          '--with-hwloc=/apps/gentoo/usr'
                          '--with-libltdl=/apps/gentoo/usr'
                          '--with-libevent=/apps/gentoo/usr'
                          '--enable-mpi-fortran=all' '--enable-mpi-cxx'
                          '--without-cma' '--without-cuda'
                          '--enable-io-romio' '--disable-heterogeneous'
                          '--disable-ipv6' '--enable-java'
                          '--enable-mpi-java' '--without-verbs'
                          '--without-knem' '--without-psm'
                          '--disable-openib-control-hdr-padding'
                          '--disable-openib-rdmacm' '--disable-openib-udcm'
                          '--disable-openib-dynamic-sl' '--without-tm'
                          '--without-slurm'
                Built by: 
                Built on: Sun Mar  4 19:32:26 CET 2018
              Built host: uv1
              C bindings: yes
            C++ bindings: yes
             Fort mpif.h: yes (all)
            Fort use mpi: yes (full: ignore TKR)
       Fort use mpi size: deprecated-ompi-info-value
        Fort use mpi_f08: yes
 Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
                          limitations in the x86_64-pc-linux-gnu-gfortran
                          compiler and/or Open MPI, does not support the
                          following: array subsections, direct passthru
                          (where possible) to underlying Open MPI's C
                          functionality
  Fort mpi_f08 subarrays: no
           Java bindings: yes
  Wrapper compiler rpath: runpath
              C compiler: x86_64-pc-linux-gnu-gcc
     C compiler absolute: /apps/gentoo/usr/bin/x86_64-pc-linux-gnu-gcc
  C compiler family name: GNU
      C compiler version: 6.3.0
            C++ compiler: x86_64-pc-linux-gnu-g++
   C++ compiler absolute: /apps/gentoo/usr/bin/x86_64-pc-linux-gnu-g++
           Fort compiler: x86_64-pc-linux-gnu-gfortran
       Fort compiler abs: /apps/gentoo/usr/bin/x86_64-pc-linux-gnu-gfortran
         Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
   Fort 08 assumed shape: yes
      Fort optional args: yes
          Fort INTERFACE: yes
    Fort ISO_FORTRAN_ENV: yes
       Fort STORAGE_SIZE: yes
      Fort BIND(C) (all): yes
      Fort ISO_C_BINDING: yes
 Fort SUBROUTINE BIND(C): yes
       Fort TYPE,BIND(C): yes
 Fort T,BIND(C,name="a"): yes
            Fort PRIVATE: yes
          Fort PROTECTED: yes
           Fort ABSTRACT: yes
       Fort ASYNCHRONOUS: yes
          Fort PROCEDURE: yes
         Fort USE...ONLY: yes
           Fort C_FUNLOC: yes
 Fort f08 using wrappers: yes
         Fort MPI_SIZEOF: yes
             C profiling: yes
           C++ profiling: yes
   Fort mpif.h profiling: yes
  Fort use mpi profiling: yes
   Fort use mpi_f08 prof: yes
          C++ exceptions: no
          Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
                          OMPI progress: no, ORTE progress: yes, Event lib:
                          yes)
           Sparse Groups: no
  Internal debug support: no
  MPI interface warnings: yes
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
              dl support: yes
   Heterogeneous support: no
 mpirun default --prefix: yes
         MPI I/O support: yes
       MPI_WTIME support: native
     Symbol vis. support: yes
   Host topology support: yes
          MPI extensions: affinity, cuda
   FT Checkpoint support: no (checkpoint thread: no)
   C/R Enabled Debugging: no
  MPI_MAX_PROCESSOR_NAME: 256
    MPI_MAX_ERROR_STRING: 256
     MPI_MAX_OBJECT_NAME: 64
        MPI_MAX_INFO_KEY: 36
        MPI_MAX_INFO_VAL: 256
       MPI_MAX_PORT_NAME: 1024
  MPI_MAX_DATAREP_STRING: 128
           MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v3.0.0)
           MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v3.0.0)
           MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA btl: vader (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA btl: tcp (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA btl: self (MCA v2.1.0, API v3.0.0, Component v3.0.0)
            MCA compress: gzip (MCA v2.1.0, API v2.0.0, Component v3.0.0)
            MCA compress: bzip (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA crs: none (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                  MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v3.0.0)
               MCA event: external (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA hwloc: external (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                  MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
                  MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
         MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v3.0.0)
         MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v3.0.0)
              MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component v3.0.0)
             MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
                          v3.0.0)
                MCA pmix: pmix2x (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component v3.0.0)
              MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v3.0.0)
               MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA dfs: test (MCA v2.1.0, API v1.0.0, Component v3.0.0)
                 MCA dfs: orted (MCA v2.1.0, API v1.0.0, Component v3.0.0)
                 MCA dfs: app (MCA v2.1.0, API v1.0.0, Component v3.0.0)
              MCA errmgr: dvm (MCA v2.1.0, API v3.0.0, Component v3.0.0)
              MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0, Component
                          v3.0.0)
              MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0, Component
                          v3.0.0)
              MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component
                          v3.0.0)
              MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component
                          v3.0.0)
                 MCA ess: tool (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component
                          v3.0.0)
                 MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA ess: env (MCA v2.1.0, API v3.0.0, Component v3.0.0)
               MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v3.0.0)
             MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA iof: tool (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA iof: orted (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v3.0.0)
            MCA notifier: syslog (MCA v2.1.0, API v1.0.0, Component v3.0.0)
                MCA odls: default (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
               MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
               MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
               MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
               MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA rml: oob (MCA v2.1.0, API v3.0.0, Component v3.0.0)
              MCA routed: radix (MCA v2.1.0, API v3.0.0, Component v3.0.0)
              MCA routed: direct (MCA v2.1.0, API v3.0.0, Component v3.0.0)
              MCA routed: debruijn (MCA v2.1.0, API v3.0.0, Component v3.0.0)
              MCA routed: binomial (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component v3.0.0)
              MCA schizo: orte (MCA v2.1.0, API v1.0.0, Component v3.0.0)
              MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component v3.0.0)
              MCA schizo: flux (MCA v2.1.0, API v1.0.0, Component v3.0.0)
               MCA state: tool (MCA v2.1.0, API v1.0.0, Component v3.0.0)
               MCA state: orted (MCA v2.1.0, API v1.0.0, Component v3.0.0)
               MCA state: novm (MCA v2.1.0, API v1.0.0, Component v3.0.0)
               MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v3.0.0)
               MCA state: dvm (MCA v2.1.0, API v1.0.0, Component v3.0.0)
               MCA state: app (MCA v2.1.0, API v1.0.0, Component v3.0.0)
                 MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA coll: sync (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA coll: self (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA coll: inter (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA coll: basic (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
               MCA fcoll: static (MCA v2.1.0, API v2.0.0, Component v3.0.0)
               MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
               MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
               MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                  MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                  MCA io: romio314 (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                  MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v3.0.0)
                 MCA pml: v (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
                 MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                 MCA rte: orte (MCA v2.1.0, API v2.0.0, Component v3.0.0)
            MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v3.0.0)
            MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
            MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
                MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
                          v3.0.0)
                MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v3.0.0)
           MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
                          v3.0.0)
$

I would admit that openmpi may not be configured properly on the host, but then why wasn't that an issue for k=128 and lower?

sjackman commented 6 years ago

The size of the packet transmitted by ABYSS-P depends on the value of --enable-maxk. The size of the packet must be smaller than the eager transmit limit.

mmokrejs commented 6 years ago

(max_k/4 + 32) × 100 = (256/4 + 32) × 100 = 9600, so I conclude 16000 is more than enough even if I went for k=256. I am at k=156 at the moment. What am I missing?

sjackman commented 6 years ago

That wiki entry mentions only the shared-memory btl_sm_eager_limit and the Infiniband btl_openib_eager_limit, but not TCP, which is an omission in the FAQ. You would also need to modify btl_tcp_eager_limit if you're communicating over TCP. I believe you're using a single machine, though? In that case you'd only need to modify btl_sm_eager_limit.

sjackman commented 6 years ago

I forget offhand how to do this, but I'd suggest checking the default values of all the btl_*_eager_limit configuration parameters on your system, to ensure that you've changed all the relevant ones.

mmokrejs commented 6 years ago

I am on a NUMA host with 112 CPUs, actually 14*8 chunks. Yes, I assume only local communication. I learned openmpi has its own local network interface, said to be better than 127.0.0.1. I saw that all the files in /tmp/pbs.6230016.isrv5/ were created when the job started, so it seems that during the time abyss parsed the k-mers into memory it did not touch the files in there.

Hmm, I will try to go through the other variables and guess which are relevant. Can I test the functionality somehow? The best I have managed to come up with so far is mpirun ls, which got executed 112 times, good. But is there some example of how to test that those large buffers are large enough?

sjackman commented 6 years ago

You need to determine the names of all the btl_*_eager_limit parameters relevant to your system.
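
As for a way to test whether those buffers are large enough: a small probe along these lines might help (an untested sketch, not an ABySS tool; the file name, the 5-second delay, and the default size of 9600 bytes are arbitrary). It times an MPI_Send while the matching receive is deliberately delayed: an eager send returns almost immediately, whereas a send over the eager limit falls back to a rendezvous and waits out the delay.

/* eager_probe.c: check whether an MPI_Send of a given size completes
 * eagerly, i.e. without waiting for the matching receive, which is
 * what ABYSS-P's design assumes.
 * Build: mpicc -O2 eager_probe.c -o eager_probe
 * Run:   mpirun -np 2 --mca btl sm,self ./eager_probe 9600
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char** argv)
{
    int rank;
    int nbytes = argc > 1 ? atoi(argv[1]) : 9600;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char* buf = malloc(nbytes);
    memset(buf, 0, nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 0) {
        double t0 = MPI_Wtime();
        MPI_Send(buf, nbytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
        /* ~0 s means the send was eager; ~5 s means rendezvous. */
        printf("MPI_Send of %d bytes returned after %.2f s\n",
               nbytes, MPI_Wtime() - t0);
    } else if (rank == 1) {
        sleep(5); /* delay the receive; an eager send won't notice */
        MPI_Recv(buf, nbytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}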

mmokrejs commented 6 years ago

ompi_info --param btl all --level 9 gave me a listing of the existing variables.

I will try these (I increased the defaults 4-fold):

--mca btl_tcp_rndv_eager_limit 262144
--mca btl_tcp_eager_limit 262144
--mca btl_tcp_endpoint_cache 122880

--mca btl_vader_fbox_max 256
--mca btl_vader_rndv_eager_limit 131072

--mca btl_self_rndv_eager_limit 524288
--mca btl_self_eager_limit 4096

sjackman commented 6 years ago

I'd suspect that btl_self_eager_limit may be the culprit. https://www.open-mpi.org/faq/?category=tcp https://wiki.hca.bsc.es/wiki/MontBlanc/doku.php?id=mpi_opt

sjackman commented 6 years ago

or possibly btl_vader_rndv_eager_limit, but the default limit is already quite high, 131072. https://www.open-mpi.org/faq/?category=sm#what-is-vader

mmokrejs commented 6 years ago

Sorry for being unclear: those values were already after the multiplication by 4.

sjackman commented 6 years ago

Did --mca btl_self_eager_limit 16000 help?

benvvalk commented 6 years ago

Thanks for your diligent efforts to help with this @sjackman.

It sounds like this issue will require some significant testing/troubleshooting work on my part.

@mmokrejs The last version of OpenMPI that I can recall testing with is 1.6, which is quite old now I guess. In the absence of any other solution, downgrading to OpenMPI 1.6 has a good chance of solving your issue. (Sorry, I know it is quite annoying to have to do that.)

sjackman commented 6 years ago

I used ABySS MPI 2.0.2 with OpenMPI 2.0.2 (funny coincidence those version numbers) a few months ago, and it worked fine.

benvvalk commented 6 years ago

@sjackman thanks, good to know!

mmokrejs commented 6 years ago

Thanks for all your efforts; so far I did not get any success. More helpful would be debug output from abyss and the ability to detect non-communicating processes. Also, could abyss create a memory snapshot after the k-mer splitting, just before it says Finding adjacenct k-mer...? Well, assuming mpirun is executed that late.

I used the first 9998 lines from a paired-end dataset, and even this tiny test simply stops, like all previous attempts, with the Finding adjacenct k-mer... message being the last one logged. I think I will drop MPI and revert to OpenMP threads, which reminds me you mentioned previously elsewhere that not all steps support OpenMP threads. I have spent more than a month on this, so it is time to change the approach. The only analysis I am capable of doing is running strace -v -f -p $pid to learn those (Timeout) values. That does not seem a very advanced way to detect this, though.

Maybe I will rather look for another MPI implementation; I do not understand why OpenMPI cannot detect that the communication buffers are too small and switch into some other mode. True, I am seeing the implementation difficulties of parallelization, but I am nevertheless disappointed.

My testcase is as follows:

export mpirun="mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000"

# The mate-pair libraries are used only for scaffolding and do not contribute towards the consensus sequence.
#
# Local MPI uses shared memory to pass messages between processes and is quite fast, and you will benefit from your multiple CPUs.
# Yes, it will be much, much faster. Use the np parameter to specify number of MPI processes for the first (de Bruijn graph) stage and j to specify number of threads for later assembly stages. If ABySS has been successfully compiled with MPI support, you should have a ABYSS-P binary on your PATH.
abyss-pe v=-v np="$threads" j="$threads" k="${kmersize}" name=test_gentoo_"${kmersize}" pe-bam mp-bam default \
lib='HKMHTBCXX.1.pe' \
HKMHTBCXX.1.pe=9998.fasta

mmokrejs commented 6 years ago

And BTW, if I execute the same test without MPI (without the np=$threads argument), it finishes in a few seconds.

$ bash -x test_gentoo.sh 
+ [[ hB != hxB ]]
+ XTRACE_STATE=-x
+ [[ hxB != hxB ]]
+ VERBOSE_STATE=+v
+ set +xv
+ unset XTRACE_STATE VERBOSE_STATE
+ export NCPUS=24
+ NCPUS=24
+ export OMP_NUM_THREADS=24
+ OMP_NUM_THREADS=24
+ export threads=24
+ threads=24
+ kmersize=160
+ export 'mpirun=mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000'
+ mpirun='mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000'
+ abyss-pe v=-v j=24 k=160 name=test_gentoo_160 pe-bam mp-bam default lib=HKMHTBCXX.1.pe HKMHTBCXX.1.pe=9998.fasta
ABYSS -k160 -q3 -v   --coverage-hist=coverage.hist -s test_gentoo_160-bubbles.fa  -o test_gentoo_160-1.fa 9998.fasta 
ABySS 2.0.2
ABYSS -k160 -q3 -v --coverage-hist=coverage.hist -s test_gentoo_160-bubbles.fa -o test_gentoo_160-1.fa 9998.fasta
Reading `9998.fasta'...
Read 1986 reads. Hash load: 172620 / 1073741824 = 0.000161 using 372 MB
`9998.fasta': discarded 81 reads shorter than 160 bases
`9998.fasta': discarded 1 reads containing non-ACGT characters
Loaded 172620 k-mer
Hash load: 172620 / 524288 = 0.329 using 22.1 MB
Unable to determine minimum k-mer coverage
Coverage: 0 Reconstruction: 172620
Coverage: 1 Reconstruction: 172620
Using a coverage threshold of 1...
The median k-mer coverage is 1
The reconstruction is 172620
The k-mer coverage threshold is 1
Setting parameter e (erode) to 2
Setting parameter E (erodeStrand) to 0
Setting parameter c (coverage) to 2
Generating adjacency
Added 341316 edges.
Eroding tips
Eroded 171942 tips.
Eroded 0 tips.
Hash load: 678 / 2048 = 0.331 using 21.9 MB
Pruning tips shorter than 1 bp...
Pruning tips shorter than 2 bp...
Pruning tips shorter than 4 bp...
Removed 3 marked k-mer.
Pruned 3 k-mer in 2 tips.
Pruning tips shorter than 8 bp...
Removed 29 marked k-mer.
Pruned 29 k-mer in 8 tips.
Pruning tips shorter than 16 bp...
Removed 34 marked k-mer.
Pruned 34 k-mer in 6 tips.
Pruning tips shorter than 32 bp...
Removed 66 marked k-mer.
Pruned 66 k-mer in 6 tips.
Pruning tips shorter than 64 bp...
Removed 181 marked k-mer.
Pruned 181 k-mer in 8 tips.
Pruning tips shorter than 128 bp...
Removed 365 marked k-mer.
Pruned 365 k-mer in 8 tips.
Pruning tips shorter than 160 bp...
Pruned 38 tips in 8 rounds.
Hash load: 0 / 32 = 0 using 393 kB
Marked 0 edges of 0 ambiguous vertices.
Removing low-coverage contigs (mean k-mer coverage < 2)
Found 0 k-mer in 0 contigs before removing low-coverage contigs.
Removed 0 k-mer in 0 low-coverage contigs.
Split 0 ambigiuous branches.
Hash load: 0 / 32 = 0 using 393 kB
Eroding tips
Eroded 0 tips.
Eroded 0 tips.
Hash load: 0 / 32 = 0 using 393 kB
Pruning tips shorter than 1 bp...
Pruning tips shorter than 2 bp...
Pruning tips shorter than 4 bp...
Pruning tips shorter than 8 bp...
Pruning tips shorter than 16 bp...
Pruning tips shorter than 32 bp...
Pruning tips shorter than 64 bp...
Pruning tips shorter than 128 bp...
Pruning tips shorter than 160 bp...
Pruned 0 tips in 8 rounds.
Hash load: 0 / 32 = 0 using 393 kB
Popping bubbles
Removed 0 bubbles
Marked 0 edges of 0 ambiguous vertices.
Assembled 0 k-mer in 0 contigs.
error: no contigs assembled
make: *** [/apps/gentoo/usr/bin/abyss-pe:504: test_gentoo_160-1.fa] Error 1
make: *** Deleting file 'test_gentoo_160-1.fa'

sjackman commented 6 years ago

I'm sorry to hear that you weren't able to get OpenMPI working. It can be troublesome, which is one reason that we're migrating away from it.

mmokrejs commented 6 years ago

Seems --mca btl tcp,self helps in my smaller testing runs. I asked the local admins to trace down what exactly is causing the problem with the blocked communication.

Here is the full command I used:

export mpirun="mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000 --mca btl tcp,self"

Let's see if the full genome now comes out with k=160.

Would you please tell me what mpirun is doing once abyss-pe is started and while those 2 workers are splitting the input reads into k-mers? I mean, what is going on until Finding adjacenct k-mer is output? Is it communication through openmpi? How does that differ from the moment when Finding adjacenct k-mer is logged?

Why am I asking? Let's assume openmpi was misconfigured with small buffers so that the communication got stuck. But this misconfiguration did not cause any issues in the initial phase, while k-mers were being split. Is ABYSS-P not using such large messages during the k-mer splitting procedure? Or what is the difference?

Now, after the --mca btl tcp,self addition, the kernel has a lot of trouble according to the monitoring tools; for example, htop reports that 75-85% of every CPU core is utilized by system (not by the user's application). I wonder why this causes so many issues when the initial step always worked, seemingly independent of openmpi, or at least not needing the enlarged buffers.

Note, the supposedly short buffers blocked the communication later, just after Finding adjacenct k-mer is logged. The figure of CPU usage from ganglia below shows what I said just above:

[screenshot: openmpi_tcp_self]

The abyss job uses only 104 cores and is still just splitting k-mers into memory:

abyss-pe np=104 j=104 ...
...
0: Read 39400000 reads. 0: Hash load: 20767359 / 268435456 = 0.0774 using 1.93 GB
1: Read 38900000 reads. 1: Hash load: 20554077 / 268435456 = 0.0766 using 1.92 GB

The system (112 phys. cores, NUMA arch) is doing a lot of context switches, true:

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
104  0  21080 3223616256 491280 44062892    0    0     1     2    0    0 68  1 31  0  0
104  0  21080 3223576832 491280 44062924    0    0     0     0 116029  612  6 87  7  0  0
104  0  21080 3223536896 491280 44062920    0    0     0    20 116115  692  6 87  7  0  0
104  0  21080 3223496960 491280 44062920    0    0     0     0 116379  614  6 87  7  0  0
104  0  21080 3223457280 491280 44062944    0    0     0    66 116474 1027  6 87  7  0  0
104  0  21080 3223417344 491280 44063116    0    0     0     0 116622  618  6 87  7  0  0
104  0  21080 3223377664 491280 44063116    0    0     0     0 115702  627  6 87  7  0  0
104  0  21080 3223338240 491280 44063116    0    0     0    26 116403  721  6 87  7  0  0
104  0  21080 3223298304 491280 44063116    0    0     0     0 116241  813  6 87  7  0  0
104  0  21080 3223258880 491280 44063116    0    0     0    25 115906  668  6 87  7  0  0
104  0  21080 3223217152 491280 44063072    0    0     0     3 115555  824  6 87  7  0  0
104  0  21080 3223179776 491280 44063072    0    0     0     0 116328  611  6 87  7  0  0
104  0  21080 3223139840 491280 44063072    0    0     0    24 116217  682  6 87  7  0  0
104  0  21080 3223099648 491280 44063072    0    0     0     0 116705  615  6 87  7  0  0
104  0  21080 3223060480 491280 44063012    0    0     0     0 116117  626  6 87  7  0  0
104  0  21080 3223020288 491280 44063004    0    0     0     2 115992  632  6 87  7  0  0
104  0  21080 3222980608 491280 44062996    0    0     0     0 116363  661  6 87  7  0  0
104  0  21080 3222942208 491280 44063008    0    0     0     2 116088 1048  6 87  7  0  0
104  0  21080 3222902016 491280 44062988    0    0     0     0 116581  614  6 87  7  0  0
105  0  21080 3222860544 491280 44062988    0    0     0     0 115048  616  6 88  7  0  0

sjackman commented 6 years ago

Perhaps try --mca btl sm if you're only using a single machine, or if that doesn't work --mca btl tcp. I don't believe I've used --mca btl self, and it's possible that for some reason it doesn't work with ABYSS-P.

sjackman commented 6 years ago

Back when I wrote ABYSS-P, OpenMPI used only spin locks to poll for I/O, and it would never block on I/O. The high SYSTEM usage indicates to me that it's now blocking on I/O. That's not necessarily wrong in itself; it's just different from how we originally designed ABYSS-P. It's possible the new I/O scheme is somehow incompatible with ABYSS-P.

sjackman commented 6 years ago

During loading ABYSS-P is sending SeqAddMessage MT_ADD messages. https://github.com/bcgsc/abyss/blob/fbff0e2c487e443c3610d34e64308fde69a1e861/Parallel/Messages.h#L66-L67

During adjacency it's sending SetBaseMessage MT_SET_BASE messages, which are two bytes larger. I don't expect that small size difference is the cause of the problem. https://github.com/bcgsc/abyss/blob/fbff0e2c487e443c3610d34e64308fde69a1e861/Parallel/Messages.h#L188-L189

You can increase the verbosity level to 9, to see every time a message is queued, with abyss-pe v=-vvvvvvvvv or ABYSS-P -vvvvvvvvv https://github.com/bcgsc/abyss/blob/fbff0e2c487e443c3610d34e64308fde69a1e861/Parallel/MessageBuffer.cpp#L68-L69

You can try adding extra debugging information to the functions sendBufferedMessage and receiveBufferedMessage. https://github.com/bcgsc/abyss/blob/fbff0e2c487e443c3610d34e64308fde69a1e861/Parallel/CommLayer.cpp#L199-L208

By the design of ABYSS-P, MPI_Send and MPI_Irecv should never block. If they do, something's gone wrong. https://github.com/bcgsc/abyss/blob/fbff0e2c487e443c3610d34e64308fde69a1e861/Parallel/CommLayer.cpp#L203-L204 https://github.com/bcgsc/abyss/blob/fbff0e2c487e443c3610d34e64308fde69a1e861/Parallel/CommLayer.cpp#L265-L267
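
Schematically, that design amounts to the following (an illustrative sketch, not ABySS source; the packet size and the ring exchange are made up): each rank keeps an MPI_Irecv pre-posted, pushes outgoing packets with MPI_Send on the assumption that they complete eagerly, and polls the receive with MPI_Test between units of work. If the runtime falls back to a rendezvous because a packet exceeds the eager limit, MPI_Send no longer returns promptly and that assumption breaks.

/* never_block.c: a schematic of the communication pattern described
 * above (illustrative only, not ABySS source).
 * Build: mpicc -O2 never_block.c -o never_block
 * Run:   mpirun -np 4 ./never_block
 */
#include <mpi.h>
#include <stdio.h>

enum { PACKET = 9600 }; /* illustrative packet size */

int main(int argc, char** argv)
{
    int rank, nproc, done = 0;
    char inbuf[PACKET], outbuf[PACKET] = { 0 };
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    /* A receive is always pre-posted... */
    MPI_Irecv(inbuf, PACKET, MPI_BYTE, MPI_ANY_SOURCE, 0,
              MPI_COMM_WORLD, &req);

    /* ...and sends are assumed to buffer eagerly and return at once. */
    MPI_Send(outbuf, PACKET, MPI_BYTE, (rank + 1) % nproc, 0,
             MPI_COMM_WORLD);

    /* Poll for completion instead of blocking; real assembly work
     * would be interleaved here. */
    while (!done)
        MPI_Test(&req, &done, MPI_STATUS_IGNORE);

    printf("%d: packet received\n", rank);
    MPI_Finalize();
    return 0;
}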

mmokrejs commented 6 years ago

> Perhaps try --mca btl sm if you're only using a single machine

$ export mpirun="mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000 --mca btl sm"
$ abyss-pe v=-v np="$threads" j="$threads" k="${kmersize}" name=test_gentoo_"${kmersize}" pe-bam mp-bam default \
> lib='HKMHTBCXX.1.pe' \
> HKMHTBCXX.1.pe=9998.fasta
mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000 --mca btl sm -np 24 ABYSS-P -k160 -q3 -v   --coverage-hist=coverage.hist -s test_gentoo_160-bubbles.fa  -o test_gentoo_160-1.fa 9998.fasta 
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[32318,1],15]) is on host: login4
  Process 2 ([[32318,1],15]) is on host: login4
  BTLs attempted: sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
MPI_INIT has failed because at least one MPI process is unreachable
from another.  This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used.  Your MPI job will now abort.

You may wish to try to narrow down the problem;

 * Check the output of ompi_info to see which BTL/MTL plugins are
   available.
 * Run your application with MPI_THREAD_SINGLE.
 * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
   if using MTL-based communications) to see exactly which
   communication plugins were considered and/or discarded.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
...
[login4:28887] 23 more processes have sent help message help-mca-bml-r2.txt / unreachable proc
[login4:28887] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[login4:28887] 23 more processes have sent help message help-mpi-runtime.txt / mpi_init:startup:pml-add-procs-fail
make: *** [/apps/gentoo/usr/bin/abyss-pe:501: test_gentoo_160-1.fa] Error 1
$

Ok, adding self works, on the mini-testcase at least.

$ export mpirun="mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000 --mca btl sm,self"

mmokrejs commented 6 years ago

> Back when I wrote ABYSS-P, OpenMPI used only spin locks to poll for I/O, and it would never block on I/O. The high SYSTEM usage indicates to me that it's now blocking on I/O. That's not necessarily wrong in itself; it's just different from how we originally designed ABYSS-P. It's possible the new I/O scheme is somehow incompatible with ABYSS-P.

Here is the extra-detailed output from abyss-pe. Are those barriers output by openmpi or by abyss?

3 to 18: TCATGAGGAATTTACATTTTTTGGTGAACTATCACTTTAAAGGTGACATAGAATGATAGGGAATTAAGGGAATTGAATGAGAGTGGGGGGGGGCAGTGAGATGCATTTTTACGTAATATGTGCACTTCATCTGTGCAAAGATGAAGTGCACGCATTGAAC
3 to 10: TCATGAGGAATTTACATTTTTTGGTGAACTATCACTTTAAAGGTGACATAGAATGATAGGGAATTAAGGGAATTGAATGAGAGTGGGGGGGGGCAGTGAGATGCATTTTTACGTAATATGTGCACTTCATCTGTGCAAAGATGAAGTGCACGCATTGAAG
3 to 18: TCATGAGGAATTTACATTTTTTGGTGAACTATCACTTTAAAGGTGACATAGAATGATAGGGAATTAAGGGAATTGAATGAGAGTGGGGGGGGGCAGTGAGATGCATTTTTACGTAATATGTGCACTTCATCTGTGCAAAGATGAAGTGCACGCATTGAAT
3 to 20: AATCATGAGGAATTTACATTTTTTGGTGAACTATCACTTTAAAGGTGACATAGAATGATAGGGAATTAAGGGAATTGAATGAGAGTGGGGGGGGGCAGTGAGATGCATTTTTACGTAATATGTGCACTTCATCTGTGCAAAGATGAAGTGCACGCATTGA
3 to 12: CATCATGAGGAATTTACATTTTTTGGTGAACTATCACTTTAAAGGTGACATAGAATGATAGGGAATTAAGGGAATTGAATGAGAGTGGGGGGGGGCAGTGAGATGCATTTTTACGTAATATGTGCACTTCATCTGTGCAAAGATGAAGTGCACGCATTGA
3 to 20: GATCATGAGGAATTTACATTTTTTGGTGAACTATCACTTTAAAGGTGACATAGAATGATAGGGAATTAAGGGAATTGAATGAGAGTGGGGGGGGGCAGTGAGATGCATTTTTACGTAATATGTGCACTTCATCTGTGCAAAGATGAAGTGCACGCATTGA
3 to 12: TATCATGAGGAATTTACATTTTTTGGTGAACTATCACTTTAAAGGTGACATAGAATGATAGGGAATTAAGGGAATTGAATGAGAGTGGGGGGGGGCAGTGAGATGCATTTTTACGTAATATGTGCACTTCATCTGTGCAAAGATGAAGAAAGCACCATTTTAAAGAGTTGGTTCAAAACGCTGCAGCCAAATTGGGGT
20 to 6: GTTCCAGCATGATCGACCTTTGCGTTGTAGATCGTAAGAGTGGAGGACGTCTCGTCGGTACGGGTCACGCTGATGTCCGGCCTGCCAGGCTCGATCTTCTCCCCGCTTGGAGAGAACCAGTCGATTTCCTTTACCCCACCGATAACTGCAACCCAAAGAA
20 to 22: GTTCCAGCATGATCGACCTTTGCGTTGTAGATCGTAAGAGTGGAGGACGTCTCGTCGGTACGGGTCACGCTGATGTCCGGCCTGCCAGGCTCGATCTTCTCCCCGCTTGGAGAGAACCAGTCGATTTCCTTTACCCCACCGATAACTGCAACCCAAAGAC
20 to 6: GTTCCAGCATGATCGACCTTTGCGTTGTAGATCGTAAGAGTGGAGGACGTCTCGTCGGTACGGGTCACGCTGATGTCCGGCCTGCCAGGCTCGATCTTCTCCCCGCTTGGAGAGAACCAGTCGATTTCCTTTACCCCACCGATAACTGCAACCCAAAGAG
20 to 22: GTTCCAGCATGATCGACCTTTGCGTTGTAGATCGTAAGAGTGGAGGACGTCTCGTCGGTACGGGTCACGCTGATGTCCGGCCTGCCAGGCTCGATCTTCTCCCCGCTTGGAGAGAACCAGTCGATTTCCTTTACCCCACCGATAACTGCAACCCAAAGAT
20 to 7: ATGTTCCAGCATGATCGACCTTTGCGTTGTAGATCGTAAGAGTGGAGGACGTCTCGTCGGTACGGGTCACGCTGATGTCCGGCCTGCCAGGCTCGATCTTCTCCCCGCTTGGAGAGAACCAGTCGATTTCCTTTACCCCACCGATAACTGCAACCCAAAG
20 to 23: CTGTTCCAGCATGATCGACCTTTGCGTTGTAGATCGTAAGAGTGGAGGACGTCTCGTCGGTACGGGTCACGCTGATGTCCGGCCTGCCAGGCTCGATCTTCTCCCCGCTTGGAGAGAACCAGTCGATTTCCTTTACCCCACCGATAACTGCAACCCAAAG
20 to 7: GTGTTCCAGCATGATCGACCTTTGCGTTGTAGATCGTAAGAGTGGAGGACGTCTCGTCGGTACGGGTCACGCTGATGTCCGGCCTGCCAGGCTCGATCTTCTCCCCGCTTGGAGAGAACCAGTCGATTTCCTTTACCCCACCGATAACTGCAACCCAAAG
20 to 23: TTGTTCCAGCATGATCGACCTTTGCGTTGTAGATCGTAAGAGTGGAGGACGTCTCGTCGGTACGGGTCACGCTGATGTCCGGCCTGCCAGGCTCGATCTTCTCCCCGCTTGGAGAGAACCAGTCGATTTCCTTTACCCCACCGATAACTGCAACCCAAAG
20 to 4: TTTTTTTTTTCACCAAACAAAGACTTAATCATCATGTTCTATTGATTTTAATGCTATTTGCATGAAACTAAACCCTTGGGAAAAATTATGATAGATTTATTTATAAATGAGTGAACAACGCGAATCCAGGAATTAATTTATTTCTTTATTTAATTAATAA
20 to 12: TTTTTTTTTTCACCAAACAAAGACTTAATCATCATGTTCTATTGATTTTAATGCTATTTGCATGAAACTAAACCCTTGGGAAAAATTATGATAGATTTATTTATAAATGAGTGAACAACGCGAATCCAGGAATTAATTTATTTCTTTATTTAATTAATAC
20 to 4: TTTTTTTTTTCACCAAACAAAGACTTAATCATCATGTTCTATTGATTTTAATGCTATTTGCATGAAACTAAACCCTTGGGAAAAATTATGATAGATTTATTTATAAATGAGTGAACAACGCGAATCCAGGAATTAATTTATTTCTTTATTTAATTAATAG
20 to 12: TTTTTTTTTTCACCAAACAAAGACTTAATCATCATGTTCTATTGATTTTAATGCTATTTGCATGAAACTAAACCCTTGGGAAAAATTATGATAGATTTATTTATAAATGAGTGAACAACGCGAATCCAGGAATTAATTTATTTCTTTATTTAATTAATAT
20 to 14: ATTTTTTTTTTTCACCAAACAAAGACTTAATCATCATGTTCTATTGATTTTAATGCTATTTGCATGAAACTAAACCCTTGGGAAAAATTATGATAGATTTATTTATAAATGAGTGAACAACGCGAATCCAGGAATTAATTTATTTCTTTATTTAATTAAT
20 to 22: CTTTTTTTTTTTCACCAAACAAAGACTTAATCATCATGTTCTATTGATTTTAATGCTATTTGCATGAAACTAAACCCTTGGGAAAAATTATGATAGATTTATTTATAAATGAGTGAACAACGCGAATCCAGGAATTAATTTATTTCTTTATTTAATTAAT
20 to 14: GTTTTTTTTTTTCACCAAACAAAGACTTAATCATCATGTTCTATTGATTTTAATGCTATTTGCATGAAACTAAACCCTTGGGAAAAATTATGATAGATTTATTTATAAATGAGTGAACAACGCGAATCCAGGAATTAATTTATTTCTTTATTTAATTAAT
20 to 22: TTTTTTTTTTTTCACCAAACAAAGACTTAATCATCATGTTCTATTGATTTTAATGCTATTTGCATGAAACTAAACCCTTGGGAAAAATTATGATAGATTTATTTATAAATGAGTGAACAACGCGAATCCAGGAATTAATTTATTTCTTTATTTAATTAAT
20 to 11: ACACTTACACTGTAGCATAATGAGGGTCTCTATGTGTAAACCGAAGCATTGAGAACTTTGCAAGTGTACAGGCAGTTTATTAAAAAGAAACACAGTACCGTTCACGTCTGGTTCACATATCCATGCGGGCGCCATCTTGGGAAAACTGAACTTTCGCGTA
20 to 3: ACACTTACACTGTAGCATAATGAGGGTCTCTATGTGTAAACCGAAGCATTGAGAACTTTGCAAGTGTACAGGCAGTTTATTAAAAAGAAACACAGTACCGTTCACGTCTGGTTCACATATCCATGCGGGCGCCATCTTGGGAAAACTGAACTTTCGCGTC
20 to 11: ACACTTACACTGTAGCATAATGAGGGTCTCTATGTGTAAACCGAAGCATTGAGAACTTTGCAAGTGTACAGGCAGTTTATTAAAAAGAAACACAGTACCGTTCACGTCTGGTTCACATATCCATGCGGGCGCCATCTTGGGAAAACTGAACTTTCGCGTG
20 to 3: ACACTTACACTGTAGCATAATGAGGGTCTCTATGTGTAAACCGAAGCATTGAGAACTTTGCAAGTGTACAGGCAGTTTATTAAAAAGAAACACAGTACCGTTCACGTCTGGTTCACATATCCATGCGGGCGCCATCTTGGGAAAACTGAACTTTCGCGTT
20 to 12: CCACACTTACACTGTAGCATAATGAGGGTCTCTATGTGTAAACCGAAGCATTGAGAACTTTGCAAGTGTACAGGCAGTTTATTAAAAAGAAACACAGTACCGTTCACGTCTGGTTCACATATCCATGCGGGCGCCATCTTGGGAAAACTGAACTTTCGCG
20 to 12: TCACACTTACACTGTAGCATAATGAGGGTCTCTATGTGTAAACCGAAGCATTGAGAACTTTGCAAGTGTACAGGCAGTTTATTAAAAAGAAACACAGTACCGTTCACGTCTGGTTCACATATCCATGCGGGCGCCATCTTGGGAAAACTGAACTTTCGCG
20 to 23: ATGGAGAGATACACACCACAAATTCCTCCGTCCTCTTCTGCTGTCTCCTACAGGGAACCACAGCTAGATAAACATAACAATTAATCCTTACTTCGGTCTCCGTTCCTCCAAAGCCTCAACAATACTTGCAGATTCTTAACTCCTTCCAACTTCCAACGTA
20 to 15: ATGGAGAGATACACACCACAAATTCCTCCGTCCTCTTCTGCTGTCTCCTACAGGGAACCACAGCTAGATAAACATAACAATTAATCCTTACTTCGGTCTCCGTTCCTCCUnable to determine minimum k-mer coverage
0: Coverage: 0  Reconstruction: 172620
0: Coverage: 1  Reconstruction: 172620
Using a coverage threshold of 1...
The median k-mer coverage is 1
The reconstruction is 172620
The k-mer coverage threshold is 1
Setting parameter e (erode) to 2
Setting parameter E (erodeStrand) to 0
Setting parameter c (coverage) to 2
0: SetState 2 (was 1)
Finding adjacenct k-mer...
0: entering barrier
0: left barrier
0 to 19: TCACTAATATCTTCGTAACCGTAACTCCTATCGATTCGAAAACACCATAGGAAGAACAATACAATAGCTTCTGAACAAAAAACTCTATTGGCGTTATATTAAAATTGCGATCAGAAATGGCCGAAAATTGCAAAAAACCAAAATTTACTTACATTGTTAA
0 to 3: TCACTAATATCTTCGTAACCGTAACTCCTATCGATTCGAAAACACCATAGGAAGAACAATACAATAGCTTCTGAACAAAAAACTCTATTGGCGTTATATTAAAATTGCGATCAGAAATGGCCGAAAATTGCAAAAAACCAAAATTTACTTACATTGTTAC
0 to 19: TCACTAATATCTTCGTAACCGTAACTCCTATCGATTCGAAAACACCATAGGAAGAACAATACAATAGCTTCTGAACAAAAAACTCTATTGGCGTTATATTAAAATTGCGATCAGAAATGGCCGAAAATTGCAAAAAACCAAAATTTACTTACATTGTTAG
0 to 3: TCACTAATATCTTCGTAACCGTAACTCCTATCGATTCGAAAACACCATAGGAAGAACAATACAATAGCTTCTGAACAAAAAACTCTATTGGCGTTATATTAAAATTGCGATCAGAAATGGCCGAAAATTGCAAAAAACCAAAATTTACTTACATTGTTAT
0 to 10: AATCACTAATATCTTCGTAACCGTAACTCCTATCGATTCGAAAACACCATAGGAAGAACAATACAATAGCTTCTGAACAAAAAACTCTATTGGCGTTATATTAAAATTGCGATCAGAAATGGCCGAAAATTGCAAAAAACCAAAATTTACTTACATTGTT
0 to 18: CATCACTAATATCTTCGTAACCGTAACTCCTATCGATTCGAAAACACCATAGGAAGAACAATACAATAGCTTCTGAACAAAAAACTCTATTGGCGTTATATTAAAATTGCGATCAGAAATGGCCGAAAATTGCAAAAAACCAAAATTTACTTACATTGTT
0 to 10: GATCACTAATATCTTCGTAACCGTAACTCCTATCGATTCGAAAACACCATAGGAAGAACAATACAATAGCTTCTGAACAAAAAACTCTATTGGCGTTATATTAAAATTGCGATCAGAAATGGCCGAAAATTGCAAAAAACCAAAATTTACTTACATTGTT
0 to 18: TATCACTAATATCTTCGTAACCGTAACTCCTATCGATTCGAAAACACCATAGGAAGAACAATACAATAGCTTCTGAACAAAAAACTCTATTGGCGTTATATTAAAATTGCGATCAGAAATGGCCGAAAATTGCAAAAAACCAAAATTTACTTACATTGTT
0 to 22: CTGATTCCTTTTATGTCTTTTATAAACATTGTTAGATGTAACAATCACAAACACCAGCAATAAAAATCCAGCAACAGCACCAAGACCAACTAAGACATTTTTAAGACTTGACTCTGAAGCATCTAAAAAAAACAGCACACTCATTGTAAATGACCTGAAA
0 to 6: CTGATTCCTTTTATGTCTTTTATAAACATTGTTAGATGTAACAATCACAAACACCAGCAATAAAAATCCAGCAACAGCACCAAGACCAACTAAGACATTTTTAAGACTTGACTCTGAAGCATCTAAAAAAAACAGCACACTCATTGTAAATGACCTGAAC
0 to 22: CTGATTCCTTTTATGTCTTTTATAAACATTGTTAGATGTAACAATCACAAACACCAGCAATAAAAATCCAGCAACAGCACCAAGACCAACTAAGACATTTTTAAGACTTGACTCTGAAGCATCTAAAAAAAACAGCACACTCATTGTAAATGACCTGAAG
0 to 6: CTGATTCCTTTTATGTCTTTTATAAACATTGTTAGATGTAACAATCACAAACACCAGCAATAAAAATCCAGCAACAGCACCAAGACCAACTAAGACATTTTTAAGACTTGACTCTGAAGCATCTAAAAAAAACAGCACACTCATTGTAAATGACCTGAAT
0 to 19: ATCTGATTCCTTTTATGTCTTTTATAAACATTGTTAGATGTAACAATCACAAACACCAGCAATAAAAATCCAGCAACAGCACCAAGACCAACTAAGACATTTTTAAGACTTGACTCTGAAGCATCTAAAAAAAACAGCACACTCATTGTAAATGACCTGA
0 to 11: CTCTGATTCCTTTTATGTCTTTTATAAACATTGTTAGATGTAACAATCACAAACACCAGCAATAAAAATCCAGCAACAGCACCAAGACCAACTAAGACATTTTTAAGACTTGACTCTGAAGCATCTAAAAAAAACAGCACACTCATTGTAAATGACCTGA
0 to 19: GTCTGATTCCTTTTATGTCTTTTATAAACATTGTTAGATGTAACAATCACAAACACCAGCAATAAAAATCCAGCAACAGCACCAAGACCAACTAAGACATTTTTAAGACTTGACTCTGAAGCATCTAAAAAAAACAGCACACTCATTGTAAATGACCTGA
0 to 11: TTCTGATTCCTTTTATGTCTTTTATAAACATTGTTAGATGTAACAATCACAAACACCAGCAATAAAAATCCAGCAACAGCACCAAGACCAACTAAGACATTTTTAAGACTTGACTCTGAAGCATCTAAAAAAAACAGCACACTCATTGTAAATGACCTGA
0 to 6: CATCATTCTAAAGTAATTACTAATTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAATGACTGAATTGTTACTACATCATTCTAAAGTAATTACAATTTTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAA
0 to 14: CATCATTCTAAAGTAATTACTAATTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAATGACTGAATTGTTACTACATCATTCTAAAGTAATTACAATTTTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAC
0 to 6: CATCATTCTAAAGTAATTACTAATTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAATGACTGAATTGTTACTACATCATTCTAAAGTAATTACAATTTTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAG
0 to 14: CATCATTCTAAAGTAATTACTAATTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAATGACTGAATTGTTACTACATCATTCTAAAGTAATTACAATTTTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAT
0 to 1: AACATCATTCTAAAGTAATTACTAATTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAATGACTGAATTGTTACTACATCATTCTAAAGTAATTACAATTTTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGT
0 to 9: CACATCATTCTAAAGTAATTACTAATTTTTGGATCAGTATTCTAAAGTGAAGATCATGGTAACCCACTAGTAATGACTGAATT10: SetState 2 (was 18)
10: entering barrier
10: left barrier
10 to 12: CCCAACCTTTATTAAACAGGAGGAAAATGATTACATTTAGGTTGATGTCAGTCTGCTGATAAAGCAAGCGATGATAACAAACACGGCATTCACTGTTAAATGTGCACTAGCAGTGCAAATGTCATGGGTTTGATCCCCAGGGAACACACATACTGATAAA

I suspect some of the messages were sent to STDOUT and some to STDERR, hence k-mers appear both before and after the Finding adjacenct k-mer line, which looks weird.

mmokrejs commented 6 years ago

So, at least I can confirm that with

export mpirun="mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000 --mca btl tcp,self"

the computation transitioned from k-mer counting to Finding adjacent k-mer. The counting step took about twice as long as it did with k=128 under the default openmpi-3.0.0 settings.

Minimum k-mer coverage is 7
0: Coverage: 7  Reconstruction: 1293093188
0: Coverage: 5.39       Reconstruction: 1333245901
0: Coverage: 5.29       Reconstruction: 1333245901
Using a coverage threshold of 5...
The median k-mer coverage is 28
The reconstruction is 1333245901
The k-mer coverage threshold is 5.29
Setting parameter e (erode) to 5
Setting parameter E (erodeStrand) to 1
Setting parameter c (coverage) to 5.29
Finding adjacenct k-mer...
91: Finding adjacent k-mer: 1000000
103: Finding adjacent k-mer: 1000000
90: Finding adjacent k-mer: 1000000
39: Finding adjacent k-mer: 1000000

Although the system is being tortured by the presumed blocking I/O, it works.

[screenshots: adjacency_memory_usage, adjacency_cpu_usage]

The increase in nice usage from 5.2% to 16.4% is when k-mer loading finished and the adjacency step started.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
104  0  21080 2498251520 491608 110947648    0    0     1     2    0    0 65  5 30  0  0
104  0  21080 2498251520 491608 110947648    0    0     0     0 113557  645 16 76  7  0  0
104  0  21080 2498248960 491608 110947648    0    0     0    17 113510  680 16 77  7  0  0
104  0  21080 2498248960 491608 110947648    0    0     0     3 113373  637 16 76  7  0  0
104  0  21080 2498246912 491608 110947648    0    0     0     0 113919  676 16 77  7  0  0
104  0  21080 2498248960 491608 110947656    0    0     0    18 113906  777 17 76  7  0  0
104  0  21080 2498244608 491608 110947664    0    0     0     0 113872  628 16 77  7  0  0
104  0  21080 2498240768 491608 110947824    0    0     0     0 114136  844 16 77  7  0  0
104  0  21080 2498238208 491608 110947824    0    0     0     2 113686  638 16 77  7  0  0
104  0  21080 2498230272 491608 110947824    0    0     0     0 113487  653 16 76  7  0  0
104  0  21080 2498224640 491608 110947824    0    0     0    26 113899  720 16 77  7  0  0
104  0  21080 2498227968 491608 110947824    0    0     0     0 113775  640 16 76  7  0  0
104  0  21080 2498224128 491608 110947776    0    0     0     0 114253  667 16 77  7  0  0
104  0  21080 2498223616 491608 110947776    0    0     0     0 113717  628 16 76  7  0  0
104  0  21080 2498215680 491608 110947776    0    0     0     0 113636  636 16 76  7  0  0
104  0  21080 2498207744 491608 110947768    0    0     0    27 114019  782 16 76  7  0  0
104  0  21080 2498208512 491608 110947744    0    0     0     3 113654  636 16 77  7  0  0

mmokrejs commented 6 years ago

Currently Eroding tips:

[screenshot: eroding_tips]

mmokrejs commented 6 years ago

Currently Sweeping (after Pruned xx tips, hoping to get contigs out somewhat soon):

[screenshot: sweeping]

sjackman commented 6 years ago

> Here is the extra-detailed output from abyss-pe. Are those barriers output by openmpi or by abyss?

All logging is printed by ABySS, including barrier messages.

sjackman commented 6 years ago

Looks like it's working! Is this with --mca btl tcp,self? Did --mca btl self,sm not work? sm would likely perform better than tcp.

mmokrejs commented 6 years ago

> Is this with --mca btl tcp,self?

Yes.

> Did --mca btl self,sm not work? sm would likely perform better than tcp.

I started the above process before we got to --mca btl self,sm in the minimalistic testing example in https://github.com/bcgsc/abyss/issues/200#issuecomment-373106439 . I will try it once this job finishes. Provided the mini-test went through, it should also work with real data. Or maybe I will kill the running job after I get contigs out of phase nr. 1.

sjackman commented 6 years ago

What was the change that fixed it?

mmokrejs commented 6 years ago

The "defaults" worked for k-mer size <=128 but did not for k=156, k=160, k=192. I do not know how to determine what openmpi decided to do automagically, nor why the issue did not demonstrate for shorter k-mer sizes with "wrong defaults".

sjackman commented 6 years ago

Do you have larger values of k working yet, or only k ≤ 128?

mmokrejs commented 6 years ago

> Do you have larger values of k working yet, or only k ≤ 128?

Only 64, 96, 112, 128 worked. That binary was compiled with ./configure --enable-maxk=128.

Since I created (repeatedly, against various openmpi/boost versions) a binary using ./configure --enable-maxk=256, I cannot get past the Finding adjacenct k-mer step for 156, 160, 192 (smaller values untested).

I think the problem lies in the autodetection inside openmpi, especially given the 2-byte difference in message lengths you mentioned: if a packet holds 100 messages, that is 200 extra bytes per packet in the adjacency stage, which could push it just over a limit that the loading-stage packets still fit under, and the maxk=128 binary's smaller packets ((128/4 + 32) × 100 = 6400 bytes by the formula above) would never hit it at all. The "two types of machines" I worked on have several bond, several eth, and several infiniband interfaces. Could the k-mer splitting step have proceeded up to Finding adjacenct k-mer with non-functional communication, unnoticed? From my brief reading, the logged numbers made sense.

sjackman commented 6 years ago

Unlikely, I think, that it could progress without the messages being received, unless they all just ended up in some MPI buffer somewhere without actually being received; but I'd expect that buffer to fill up and eventually explode in that case.

mmokrejs commented 6 years ago

I think I will leave the task with the suboptimal MPI performance running to the end, as supposedly all the later steps should now be using only OpenMP threads (as explained in https://github.com/bcgsc/abyss/issues/185).

Now, after having:

-rw-rwx---  1 mmokrejs mmokrejs       6076 Mar 14 12:20 abyss_ecc.sh
-rw-rw----  1 mmokrejs mmokrejs     390871 Mar 15 11:43 coverage.hist
-rw-rw----  1 mmokrejs mmokrejs 2212532349 Mar 16 02:56 tt_16D1C3L12__abyss_160-1.fa
-rw-rw----  1 mmokrejs mmokrejs  164360973 Mar 16 02:56 tt_16D1C3L12__abyss_160-bubbles.fa
-rw-rw----  1 mmokrejs mmokrejs          0 Mar 16 02:58 tt_16D1C3L12__abyss_160-1.path
-rw-rw----  1 mmokrejs mmokrejs  682753369 Mar 16 02:58 tt_16D1C3L12__abyss_160-1.dot
-rw-rw----  1 mmokrejs mmokrejs  463028950 Mar 16 03:02 tt_16D1C3L12__abyss_160-2.dot1
-rw-rw----  1 mmokrejs mmokrejs 1670393734 Mar 16 03:03 tt_16D1C3L12__abyss_160-2.fa
-rw-rw----  1 mmokrejs mmokrejs  463028950 Mar 16 03:03 tt_16D1C3L12__abyss_160-2.dot
-rw-rw----  1 mmokrejs mmokrejs  426686602 Mar 16 03:07 tt_16D1C3L12__abyss_160-3.dot
-rw-rw----  1 mmokrejs mmokrejs    3777547 Mar 16 03:07 tt_16D1C3L12__abyss_160-2.path
-rw-rw----  1 mmokrejs mmokrejs 1590418021 Mar 16 03:08 tt_16D1C3L12__abyss_160-3.fa
-rw-rw----  1 mmokrejs mmokrejs   45470128 Mar 16 03:08 tt_16D1C3L12__abyss_160-indel.fa
-rw-rw----  1 mmokrejs mmokrejs      32452 Mar 16 05:28 HKMHTBCXX.1.pe-3.hist
-rw-rw----  1 mmokrejs mmokrejs 2222265369 Mar 16 05:40 HKMHTBCXX.1.pe-3.bam
-rw-rw----  1 mmokrejs mmokrejs  188466360 Mar 16 05:40 HKMHTBCXX.1.pe-3.bam.bai

I see abyss-pe now likely runs in OpenMP mode. Can't say about the future steps.

[screenshots: after_step3__backmapping__cpu_usage, after_step3__backmapping__memory_usage]

sjackman commented 6 years ago

> I see abyss-pe now likely runs in OpenMP mode. Can't say about the future steps.

All steps following ABYSS-P are either multithreaded with OpenMP or, if they're not a bottleneck, single-threaded.

mmokrejs commented 6 years ago

Thanks, I forgot to drop this line from the comment, as I later bothered to dig up and re-read the closed issue https://github.com/bcgsc/abyss/issues/185 . Thanks for confirming. ;-) k192 is scheduled with

export mpirun="mpirun --verbose --mca btl_sm_eager_limit 32000 --mca btl_openib_eager_limit 32000 --mca btl_tcp_rndv_eager_limit 524288 --mca btl_tcp_eager_limit 524288 --mca btl_tcp_endpoint_cache 122880 --mca btl_vader_fbox_max 256 --mca btl_vader_rndv_eager_limit 131072 --mca btl_self_rndv_eager_limit 524288 --mca btl_self_eager_limit 32000 --mca btl sm,self"

mmokrejs commented 6 years ago

And, my theory now is that

$ ifconfig lo
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 0  (Local Loopback)
        RX packets 16649790061  bytes 10924065556881 (9.9 TiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 16649790061  bytes 10924065556881 (9.9 TiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

dampens the mpirun --mca btl tcp,self performance due to txqueuelen 0. The counters were probably not exactly zero beforehand, but given that this was my first attempt with mpirun --mca btl tcp,self, let's say most of this 2×10 TiB of traffic happened during this abyss-pe run (the machine had recently been rebooted, incidentally).

stale[bot] commented 6 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.