conda-forge / openmpi-feedstock

A conda-smithy repository for openmpi.
BSD 3-Clause "New" or "Revised" License

Upgrading from 4.0.2 to 4.1.3 breaks conda environment #96

Closed jsiirola closed 1 year ago

jsiirola commented 2 years ago

Solution to issue cannot be found in the documentation.

Issue

Upgrading openmpi from 4.0.2 (from pkgs/main) to 4.1.3 (from conda-forge) breaks the conda environment: the bin and etc directories are deleted from the environment, so mpirun can no longer be found. A minimal test script:

#!/usr/bin/bash -l
conda create -y -n mpi
conda activate mpi
conda install -y openmpi
CHECK_1=`which mpirun`
conda install -y -c conda-forge openmpi
CHECK_2=`which mpirun`
conda deactivate
conda env remove -n mpi

echo "MPIRUN from pkgs/main: $CHECK_1"
echo "MPIRUN from conda-forge: $CHECK_2"

returns

MPIRUN from pkgs/main: /home/<user>/miniconda3/envs/mpi/bin/mpirun
MPIRUN from conda-forge: 

Installed packages

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             4.5                       1_gnu  
brotlipy                  0.7.0           py38h27cfd23_1003  
ca-certificates           2022.3.29            h06a4308_1  
certifi                   2021.10.8        py38h06a4308_2  
cffi                      1.15.0           py38hd667e15_1  
charset-normalizer        2.0.4              pyhd3eb1b0_0  
colorama                  0.4.4              pyhd3eb1b0_0  
conda                     4.12.0           py38h06a4308_0  
conda-package-handling    1.8.1            py38h7f8727e_0  
cryptography              36.0.0           py38h9ce1e76_0  
idna                      3.3                pyhd3eb1b0_0  
ld_impl_linux-64          2.35.1               h7274673_9  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 9.3.0               h5101ec6_17  
libgomp                   9.3.0               h5101ec6_17  
libstdcxx-ng              9.3.0               hd4cf53a_17  
ncurses                   6.3                  h7f8727e_2  
openssl                   1.1.1n               h7f8727e_0  
pip                       21.2.4           py38h06a4308_0  
pycosat                   0.6.3            py38h7b6447c_1  
pycparser                 2.21               pyhd3eb1b0_0  
pyopenssl                 22.0.0             pyhd3eb1b0_0  
pysocks                   1.7.1            py38h06a4308_0  
python                    3.8.12               h12debd9_0  
readline                  8.1.2                h7f8727e_1  
requests                  2.27.1             pyhd3eb1b0_0  
ruamel_yaml               0.15.100         py38h27cfd23_0  
setuptools                61.2.0           py38h06a4308_0  
sqlite                    3.38.2               hc218d9a_0  
tk                        8.6.11               h1ccaba5_0  
tqdm                      4.63.0             pyhd3eb1b0_0  
urllib3                   1.26.8             pyhd3eb1b0_0  
wheel                     0.37.1             pyhd3eb1b0_0  
xz                        5.2.5                h7b6447c_0  
yaml                      0.2.5                h7b6447c_0  
zlib                      1.2.12               h7f8727e_1

Environment info

active environment : None
            shell level : 0
       user config file : /home/<user>/.condarc
 populated config files : /home/<user>/.condarc
          conda version : 4.12.0
    conda-build version : not installed
         python version : 3.8.12.final.0
       virtual packages : __linux=3.10.0=0
                          __glibc=2.17=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/<user>/miniconda3  (writable)
      conda av data dir : /home/<user>/miniconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/<user>/miniconda3/pkgs
                          /home/<user>/.conda/pkgs
       envs directories : /home/<user>/miniconda3/envs
                          /home/<user>/.conda/envs
               platform : linux-64
             user-agent : conda/4.12.0 requests/2.26.0 CPython/3.8.12 Linux/3.10.0-1160.59.1.el7.x86_64 centos/7.9.2009 glibc/2.17
             netrc file : None
           offline mode : False
jhrmnn commented 2 years ago

Could this be happening because the external build gets installed as a result of the gfortran conflict reported in #94?

minrk commented 2 years ago

@jhrmnn that seems likely. Shouldn't the external build be rejected, though?

How do we make sure the external build is not considered unless explicitly requested? I thought this is what track_features was for. Ideally, the solver shouldn't be allowed to prefer the external build as a solution to conflicts.

jhrmnn commented 2 years ago

I ran into this too, by the way. My library depends on the newer gfortran, and Conda picked the external build instead. Interestingly, Mamba did try to install the regular 4.1.2. So perhaps track_features does what it's supposed to do, but the Conda installer has a bug?

minrk commented 2 years ago

Maybe it was the special case of only having openmpi, not other packages?

jhrmnn commented 2 years ago

Not sure what you mean by that. Who's having only openmpi?

minrk commented 2 years ago

The repro example above creates an env with only the openmpi package installed.

jhrmnn commented 2 years ago

I think the repro happens because openmpi is first installed from the defaults channel, which probably pulls in other dependencies that depend on the new fortran. When openmpi is then replaced from the conda-forge channel, those dependencies still require the new fortran, which rules out the loaded 4.1.3 build, so the external build is preferred instead.

Actually, the same still happens (even after fixing the fortran issue) when first creating a python env from defaults (conda create -p tmp python) and then installing openmpi from conda-forge (conda install -p tmp -c conda-forge openmpi). When creating everything in one go from conda-forge (conda create -p tmp -c conda-forge python openmpi), it pulls the loaded build correctly.

isuruf commented 2 years ago

We don't support mixing defaults packages and conda-forge packages.

mrmundt commented 2 years ago

I opened a ticket about the same thing here: https://github.com/open-mpi/ompi/issues/10300

We have changed our installation so that it installs from the conda-forge channel, but this still happens. Please feel free to look at the referenced issue above for more details.

isuruf commented 2 years ago

@mrmundt, please open a different issue in this repo with the required information.

mrmundt commented 2 years ago

@isuruf - It's the same issue (@jsiirola and I are both Pyomo developers). I'll add the relevant info here:

Background information

We have a suite of MPI tests for our package Pyomo which utilize a Linux + Conda environment. Two days ago, a new version of openmpi was uploaded to Anaconda, and we are now experiencing what appears to be a symlink breakage.

Previously passing test: https://github.com/Pyomo/pyomo/runs/6032066405?check_suite_focus=true
Currently failing test (because the mpirun command cannot be found): https://github.com/Pyomo/pyomo/runs/6098156452?check_suite_focus=true

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

Newest version available on Anaconda: linux-64/openmpi-4.1.3-hbea3300_101.tar.bz2

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Through Anaconda: conda install openmpi

Please describe the system on which you are running

The system on which we are running is the ubuntu-latest GitHub Actions runner. Full details can be found here: https://github.com/actions/virtual-environments/blob/main/images/linux/Ubuntu2004-Readme.md


Details of the problem

We have previously run conda install openmpi and been able to run mpirun with no issues in our test suite (linked above). This is the expected behavior: openmpi, out of the box, will enable mpirun to be found. Since the most recent update, however, we now get the error mpirun: command not found.


UPDATE: We have been digging more into this on our end, and it seems this is actually a particularly strange corner case. When we run conda install openmpi, we get version 4.0.2 from the default conda channel. Later, however, we install cyipopt, which flags openmpi as a dependency and installs 4.1.3 from the conda-forge channel, and somehow in that update the bin directory gets blown away. See below:

% conda create -n mpi
% conda activate mpi
% conda install openmpi
% ls /home/miniconda3/envs/mpi
bin/ conda-meta/ etc/ include/ lib/ share/
% conda install -c conda-forge openmpi
% ls /home/miniconda3/envs/mpi/
conda-meta/ include/ lib/ share/  # NO MORE BIN

This behavior of overwriting openmpi was not apparent before (and there have been no updates to cyipopt), so we are not sure why this is happening now.

Previous behaviour: https://github.com/Pyomo/pyomo/runs/6032066405?check_suite_focus=true#step:14:562
Current behaviour: https://github.com/Pyomo/pyomo/runs/6098156452?check_suite_focus=true#step:14:561
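A CI job could surface this kind of breakage right after installation rather than at the first mpirun call. The following is only a sketch: the helper name env_has_mpirun is hypothetical, and it assumes the environment prefix is available (for example via the CONDA_PREFIX variable that conda sets on activation).

```shell
# Hypothetical helper: succeed only if the given conda environment prefix
# still contains an executable bin/mpirun.
env_has_mpirun() {
    prefix="$1"
    [ -x "$prefix/bin/mpirun" ]
}

# Example usage inside an activated environment (left commented out so that
# sourcing this file has no side effects):
#   env_has_mpirun "$CONDA_PREFIX" || {
#       echo "mpirun is missing from $CONDA_PREFIX/bin" >&2
#       exit 1
#   }
```

Checking the prefix directly, instead of relying on which mpirun, avoids picking up an unrelated mpirun that happens to be elsewhere on PATH.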

isuruf commented 2 years ago

You are still mixing defaults and conda-forge, which, as I've said before, is not a supported use case for conda-forge.

mrmundt commented 2 years ago

In our current test suite, we do not use the defaults channel:

        conda install -q -y -c conda-forge \
            ${PYTHON_CORE_PKGS} ${PYTHON_PACKAGES} ${CONDA_DEPENDENCIES}
        if test -z "${{matrix.slim}}"; then
            conda install -q -y -c ibmdecisionoptimization 'cplex>=12.10' \
                || echo "WARNING: CPLEX Community Edition is not available"
            conda install -q -y -c gurobi gurobi \
                || echo "WARNING: Gurobi is not available"
            conda install -q -y -c fico-xpress xpress \
                || echo "WARNING: Xpress Community Edition is not available"
            for PKG in cyipopt pymumps; do
                conda install -q -y -c conda-forge $PKG \
                    || echo "WARNING: $PKG is not available"
            done

We do, however, as you can see above, call on other channels for specific solvers which do not exist in the conda-forge channel.

Do we need to explicitly disallow the defaults channel?

isuruf commented 2 years ago

> Do we need to explicitly disallow the defaults channel?

Yes. See the README on this repo at https://github.com/conda-forge/openmpi-feedstock#installing-openmpi-mpi

You could also start with miniforge, since you are using the conda-incubator/setup-miniconda action.
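One way to keep defaults out of the solve is sketched below. It is not quoted from the README; it assumes conda's --override-channels flag, which makes conda ignore the default and .condarc channels and use only those given with -c. The wrapper name install_conda_forge_only is hypothetical.

```shell
# Hypothetical wrapper: install packages using only the conda-forge channel.
# --override-channels tells conda to ignore defaults and any channels listed
# in .condarc, so only the channels passed with -c are searched.
install_conda_forge_only() {
    conda install -y --override-channels -c conda-forge "$@"
}

# Example usage (left commented out so that sourcing this file installs nothing):
#   install_conda_forge_only openmpi
#
# A persistent alternative is to change the user configuration instead:
#   conda config --add channels conda-forge
#   conda config --set channel_priority strict
```

The wrapper makes the channel restriction explicit at every call site, whereas the configuration commands change the behaviour of all later conda invocations for that user.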

mrmundt commented 2 years ago

Thanks, @isuruf. That fixed it. We appreciate the help!

dalcinl commented 2 years ago

@isuruf Would it be possible to somehow "patch" the conda package manager installed from the conda-forge channel so that it warns users about misconfigured channel sources? This is not the first time I have seen folks struggle with this detail.
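Until something like that exists, a user-side check can approximate such a warning. The sketch below is not a conda feature: the helper name find_non_conda_forge is hypothetical, and it assumes the four-column layout (name, version, build, channel) printed by conda list --show-channel-urls.

```shell
# Hypothetical helper: reads `conda list --show-channel-urls` output on stdin
# and prints every package whose channel column is not conda-forge.
find_non_conda_forge() {
    awk '
        # Skip the "# Name Version Build Channel" header and blank lines.
        /^#/ || NF == 0 { next }
        # Column 4 is the channel; it may be empty for some packages.
        $4 != "conda-forge" {
            printf "%s %s\n", $1, ($4 == "" ? "(no channel shown)" : $4)
        }
    '
}

# Example usage (left commented out so that sourcing this file runs nothing):
#   conda list -n mpi --show-channel-urls | find_non_conda_forge
```

Any output from the helper indicates that the environment mixes channels, which is the situation described above as unsupported.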