conda-forge / openmpi-feedstock

A conda-smithy repository for openmpi.
BSD 3-Clause "New" or "Revised" License

Latest Linux builds have ucx as dependency, increase environment size #102

Closed (wm75 closed this issue 2 years ago)

wm75 commented 2 years ago

Solution to issue cannot be found in the documentation.

Issue

With the latest linux-64 openmpi builds ucx has started appearing as a dependency, instead of just a constraint:

$ conda search openmpi --info -c conda-forge

...

openmpi 4.1.3 h846660c_102
--------------------------
file name   : openmpi-4.1.3-h846660c_102.tar.bz2
name        : openmpi
version     : 4.1.3
build       : h846660c_102
build number: 102
size        : 4.1 MB
license     : BSD-3-Clause
subdir      : linux-64
url         : https://conda.anaconda.org/conda-forge/linux-64/openmpi-4.1.3-h846660c_102.tar.bz2
md5         : b1498a2232331b919a91fe0c23d4704a
timestamp   : 2022-04-26 11:05:00 UTC
constraints : 
  - cudatoolkit  >= 10.2
  - ucx
dependencies: 
  - libgcc-ng >=10.3.0
  - libgfortran-ng
  - libgfortran5 >=10.3.0
  - libstdcxx-ng >=10.3.0
  - libzlib >=1.2.11,<1.3.0a0
  - mpi 1.0 openmpi
  - ucx >=1.12.1,<1.13.0a0
  - zlib >=1.2.11,<1.3.0a0

openmpi 4.1.3 h846660c_103
--------------------------
file name   : openmpi-4.1.3-h846660c_103.tar.bz2
name        : openmpi
version     : 4.1.3
build       : h846660c_103
build number: 103
size        : 4.2 MB
license     : BSD-3-Clause
subdir      : linux-64
url         : https://conda.anaconda.org/conda-forge/linux-64/openmpi-4.1.3-h846660c_103.tar.bz2
md5         : 3c58c19e8f53f54221dc26f6f0cef5c0
timestamp   : 2022-04-27 09:05:39 UTC
constraints : 
  - cudatoolkit  >= 10.2
  - ucx
dependencies: 
  - libgcc-ng >=10.3.0
  - libgfortran-ng
  - libgfortran5 >=10.3.0
  - libstdcxx-ng >=10.3.0
  - libzlib >=1.2.11,<1.3.0a0
  - mpi 1.0 openmpi
  - ucx >=1.12.1,<1.13.0a0
  - zlib >=1.2.11,<1.3.0a0

openmpi 4.1.3 hbea3300_101
--------------------------
file name   : openmpi-4.1.3-hbea3300_101.tar.bz2
name        : openmpi
version     : 4.1.3
build       : hbea3300_101
build number: 101
size        : 4.5 MB
license     : BSD-3-Clause
subdir      : linux-64
url         : https://conda.anaconda.org/conda-forge/linux-64/openmpi-4.1.3-hbea3300_101.tar.bz2
md5         : 6961996e9aa27d91a0d315a932d63d54
timestamp   : 2022-04-18 08:52:16 UTC
constraints : 
  - ucx 1.9.0
  - cudatoolkit >=9.2
dependencies: 
  - libgcc-ng >=7.5.0
  - libgfortran-ng
  - libgfortran4 >=7.5.0
  - libstdcxx-ng >=7.5.0
  - libzlib >=1.2.11,<1.3.0a0
  - mpi 1.0 openmpi
  - zlib >=1.2.11,<1.3.0a0

So ucx was only a constraint in openmpi 4.1.3 hbea3300_101, but it is a hard dependency in openmpi 4.1.3 h846660c_102 and openmpi 4.1.3 h846660c_103. I'm not sure how or why that happened, but since ucx depends on cudatoolkit, this makes cudatoolkit an indirect dependency of openmpi, too, and results in substantially bigger environments, as noted by users of the pangolin tool for SARS-CoV-2 lineage assignment here: https://github.com/cov-lineages/pangolin/issues/441.
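For anyone who wants to check other builds for the same change, here is a minimal Python sketch of the comparison. It assumes package records shaped like conda's repodata entries, with `depends` and `constrains` lists; the `records` below are abridged by hand from the `conda search` output above, and the helper names `spec_names` and `requirement_kind` are made up for illustration.

```python
def spec_names(specs):
    """Return the package names from a list of match specs like 'ucx >=1.12.1'."""
    # The package name is the first whitespace-separated token of each spec.
    return {spec.split()[0] for spec in specs if spec.strip()}


def requirement_kind(record, package):
    """Classify how `record` refers to `package`."""
    if package in spec_names(record.get("depends", [])):
        return "dependency"  # always installed alongside the package
    if package in spec_names(record.get("constrains", [])):
        return "constraint only"  # version-restricted only if installed anyway
    return "not mentioned"


# Abridged from the three linux-64 builds listed above (assumed record layout).
records = [
    {
        "build": "hbea3300_101",
        "depends": ["libgcc-ng >=7.5.0", "mpi 1.0 openmpi"],
        "constrains": ["ucx 1.9.0", "cudatoolkit >=9.2"],
    },
    {
        "build": "h846660c_102",
        "depends": ["libgcc-ng >=10.3.0", "ucx >=1.12.1,<1.13.0a0"],
        "constrains": ["cudatoolkit  >= 10.2", "ucx"],
    },
    {
        "build": "h846660c_103",
        "depends": ["libgcc-ng >=10.3.0", "ucx >=1.12.1,<1.13.0a0"],
        "constrains": ["cudatoolkit  >= 10.2", "ucx"],
    },
]

for record in records:
    print(record["build"], "->", requirement_kind(record, "ucx"))
```

With these three records, only hbea3300_101 is reported as "constraint only"; the two newer builds list ucx under both keys, so it counts as a dependency.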

Note: For some even more obscure reason, cudatoolkit gets resolved as a dependency only by mamba, but not by conda.

$ mamba install -c conda-forge openmpi

...

  Package           Version  Build          Channel                    Size
─────────────────────────────────────────────────────────────────────────────
  Install:
─────────────────────────────────────────────────────────────────────────────

  + _libgcc_mutex       0.1  conda_forge    conda-forge/linux-64        3kB
  + _openmp_mutex       4.5  2_gnu          conda-forge/linux-64       24kB
  + cudatoolkit      11.6.0  habf752d_10    conda-forge/linux-64      861MB
  + libgcc-ng        11.2.0  h1d223b6_16    conda-forge/linux-64     Cached
  + libgfortran-ng   11.2.0  h69a702a_16    conda-forge/linux-64       23kB
  + libgfortran5     11.2.0  h5c6108e_16    conda-forge/linux-64     Cached
  + libgomp          11.2.0  h1d223b6_16    conda-forge/linux-64     Cached
  + libstdcxx-ng     11.2.0  he4da1e4_16    conda-forge/linux-64     Cached
  + libzlib          1.2.11  h166bdaf_1014  conda-forge/linux-64     Cached
  + mpi                 1.0  openmpi        conda-forge/linux-64        4kB
  + openmpi           4.1.3  h846660c_103   conda-forge/linux-64     Cached
  + ucx              1.12.1  h7a399c7_1     conda-forge/linux-64       20MB
  + zlib             1.2.11  h166bdaf_1014  conda-forge/linux-64     Cached

  Summary:

  Install: 13 packages

  Total download: 881MB

I don't understand enough about the interdependencies to call this a bug, but it's an inconvenience that would be super nice to have resolved if possible.
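To put a number on the size increase, the following Python sketch adds up the non-cached downloads from the mamba transaction above. The sizes are copied from that table; decimal units (1 kB = 1000 bytes) are an assumption about how mamba rounds them, and `to_bytes` is just an illustrative helper.

```python
# Decimal unit factors; an assumption about how the sizes above are reported.
UNITS = {"kB": 1e3, "MB": 1e6}


def to_bytes(size):
    """Convert a size string such as '861MB' or '3kB' to a number of bytes."""
    for suffix, factor in UNITS.items():
        if size.endswith(suffix):
            return float(size[: -len(suffix)]) * factor
    raise ValueError(f"unrecognized size: {size!r}")


# Packages that were actually downloaded (everything else was 'Cached').
downloads = {
    "_libgcc_mutex": "3kB",
    "_openmp_mutex": "24kB",
    "cudatoolkit": "861MB",
    "libgfortran-ng": "23kB",
    "mpi": "4kB",
    "ucx": "20MB",
}

total = sum(to_bytes(size) for size in downloads.values())
cuda_share = to_bytes(downloads["cudatoolkit"]) / total

print(f"total download: {total / 1e6:.0f}MB")
print(f"cudatoolkit share: {cuda_share:.1%}")
```

The total matches the 881MB reported by mamba, and cudatoolkit alone accounts for nearly all of it.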

Installed packages

# packages in environment at /home/wolma/miniconda3/envs/openmpi-mamba:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
cudatoolkit               11.6.0              habf752d_10    conda-forge
libgcc-ng                 11.2.0              h1d223b6_16    conda-forge
libgfortran-ng            11.2.0              h69a702a_16    conda-forge
libgfortran5              11.2.0              h5c6108e_16    conda-forge
libgomp                   11.2.0              h1d223b6_16    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_16    conda-forge
libzlib                   1.2.11            h166bdaf_1014    conda-forge
mpi                       1.0                     openmpi    conda-forge
openmpi                   4.1.3              h846660c_103    conda-forge
ucx                       1.12.1               h7a399c7_1    conda-forge
zlib                      1.2.11            h166bdaf_1014    conda-forge

Environment info

     active environment : openmpi-mamba
    active env location : /home/wolma/miniconda3/envs/openmpi-mamba
            shell level : 1
       user config file : /home/wolma/.condarc
 populated config files : /home/wolma/.condarc
          conda version : 4.12.0
    conda-build version : not installed
         python version : 3.8.13.final.0
       virtual packages : __linux=5.17.4=0
                          __glibc=2.33=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/wolma/miniconda3  (writable)
      conda av data dir : /home/wolma/miniconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/wolma/miniconda3/pkgs
                          /home/wolma/.conda/pkgs
       envs directories : /home/wolma/miniconda3/envs
                          /home/wolma/.conda/envs
               platform : linux-64
             user-agent : conda/4.12.0 requests/2.27.1 CPython/3.8.13 Linux/5.17.4-100.fc34.x86_64 fedora/34 glibc/2.33
                UID:GID : 1000:1000
             netrc file : /home/wolma/.netrc
           offline mode : False

wm75 commented 2 years ago

Ooh oh, #101 got opened in parallel while I was putting this together. Go with whichever version you like best and sorry about that.

corneliusroemer commented 2 years ago

I'll close mine, yours is so much better written! Maybe it's also worth pinging the mamba folks about this? Inconsistency between mamba and conda is something they try to avoid, I think.

ggouaillardet commented 2 years ago

FWIW, if Open MPI is built with plugins (the default in the v4 series), it is possible to put the UCX-related plugins (e.g. mca_pml_ucx.so, mca_btl_uct.so and mca_osc_ucx.so) in a separate package (e.g. openmpi-ucx). So if you do not need UCX (long story short, you do not run on an InfiniBand network), you would not have to pull in UCX and (indirectly) CUDA.