anaconda / anaconda-project

Tool for encapsulating, running, and reproducing data science projects
https://anaconda-project.readthedocs.io/en/latest/

[BUG] CONDA_SUBDIR difficulties. #392

Open mforbes opened 1 year ago

mforbes commented 1 year ago

ALL software version info

$ anaconda-project --version
0.11.1
$ conda --version
conda 23.1.0
$ uname -a
Darwin Admins-MacBook-Pro-2.local 21.6.0 Darwin Kernel Version 21.6.0: Mon Dec 19 20:43:09 PST 2022; root:xnu-8020.240.18~2/RELEASE_ARM64_T6000 arm64

Description of expected behavior and the observed behavior

I recently started working on a Mac M1 Max with an ARM processor. Generally things work well, but occasionally I need to create a project that uses Rosetta and the osx-64 architecture (pyFFTW is a current issue, but there may be others). To do this, one typically sets the environment variable CONDA_SUBDIR=osx-64 before running conda. (Btw, this is very poorly documented. It is not even listed in the docs.)

The problem is that I would like users to use anaconda-project run shell or similar to run programs, but unless the user sets the variable first, CONDA_SUBDIR=osx-64 anaconda-project run shell, there can be subtle issues, one of which I demonstrate here.

I tried setting the variable in the appropriate variables: sections of the anaconda-project.yaml file, but these do not take effect until after the environment is active, which still causes problems related to environment resolution.

This specific issue is a bit artificial, and may indicate another underlying problem with resolution, but I think that it might be more robust if the variables in the variables: section are set first before conda is run, or perhaps another special section pre-variables: or something could be used (for backward compatibility).

Relevant configuration files

anaconda-project.yaml

```yaml
# anaconda-project.yaml
name: arm_vs_x86
description: Demonstrate ARM and Intel issues.

commands:
  shell:
    unix: bash --init-file /dev/null
    env_spec: x86
  init:
    unix: |
      pip install matplotlib
      conda config --env --set subdir osx-64
    env_spec: x86

variables:
  CONDA_EXE: conda
  CONDA_SUBDIR: osx-64

services: {}
downloads: {}

dependencies:
  - python=3.9
  - conda-forge::pyfftw

channels:
  - defaults

platforms:
  - osx-64

env_specs:
  arm:
    description: Default environment
    channels: []
  x86:
    description: Rosetta intel environment for ARM processors
    variables:
      CONDA_SUBDIR: osx-64
```

Minimal non-Working Example

CONDA_SUBDIR=osx-64 anaconda-project run init
anaconda-project run shell
python -c "import ssl"

Command output and/or screenshots of the bug in action

CONDA_SUBDIR=osx-64 anaconda-project run init

```bash
$ CONDA_SUBDIR=osx-64 anaconda-project run init
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /Users/mforbes/tmp/python/arm/envs/x86

  added / updated specs:
    - conda-forge::pyfftw
    - python=3.9


The following NEW packages will be INSTALLED:

  blas               pkgs/main/osx-64::blas-1.0-mkl
  ca-certificates    pkgs/main/osx-64::ca-certificates-2023.01.10-hecd8cb5_0
  certifi            pkgs/main/osx-64::certifi-2022.12.7-py39hecd8cb5_0
  intel-openmp       pkgs/main/osx-64::intel-openmp-2021.4.0-hecd8cb5_3538
  libcxx             pkgs/main/osx-64::libcxx-14.0.6-h9765a3e_0
  libffi             pkgs/main/osx-64::libffi-3.4.2-hecd8cb5_6
  mkl                pkgs/main/osx-64::mkl-2021.4.0-hecd8cb5_637
  mkl-service        pkgs/main/osx-64::mkl-service-2.4.0-py39h9ed2024_0
  mkl_fft            pkgs/main/osx-64::mkl_fft-1.3.1-py39h4ab4a9b_0
  mkl_random         pkgs/main/osx-64::mkl_random-1.2.2-py39hb2f4e1b_0
  ncurses            pkgs/main/osx-64::ncurses-6.4-hcec6c5f_0
  numpy              pkgs/main/osx-64::numpy-1.23.5-py39he696674_0
  numpy-base         pkgs/main/osx-64::numpy-base-1.23.5-py39h9cd3388_0
  openssl            pkgs/main/osx-64::openssl-1.1.1s-hca72f7f_0
  pip                pkgs/main/osx-64::pip-22.3.1-py39hecd8cb5_0
  pyfftw             conda-forge/osx-64::pyfftw-0.13.1-py39h7cc1f47_0
  python             pkgs/main/osx-64::python-3.9.16-h218abb5_0
  python_abi         conda-forge/osx-64::python_abi-3.9-2_cp39
  readline           pkgs/main/osx-64::readline-8.2-hca72f7f_0
  setuptools         pkgs/main/osx-64::setuptools-65.6.3-py39hecd8cb5_0
  six                pkgs/main/noarch::six-1.16.0-pyhd3eb1b0_1
  sqlite             pkgs/main/osx-64::sqlite-3.40.1-h880c91c_0
  tk                 pkgs/main/osx-64::tk-8.6.12-h5d9f67b_0
  tzdata             pkgs/main/noarch::tzdata-2022g-h04d1e81_0
  wheel              pkgs/main/noarch::wheel-0.37.1-pyhd3eb1b0_0
  xz                 pkgs/main/osx-64::xz-5.2.10-h6c40b1e_1
  zlib               pkgs/main/osx-64::zlib-1.2.13-h4dc903c_0


Downloading and Extracting Packages

Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use
#
#     $ conda activate /Users/mforbes/tmp/python/arm/envs/x86
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Collecting matplotlib
  Using cached matplotlib-3.6.3-cp39-cp39-macosx_10_12_x86_64.whl (7.3 MB)
Collecting contourpy>=1.0.1
  Using cached contourpy-1.0.7-cp39-cp39-macosx_10_9_x86_64.whl (244 kB)
Collecting packaging>=20.0
  Using cached packaging-23.0-py3-none-any.whl (42 kB)
Collecting cycler>=0.10
  Using cached cycler-0.11.0-py3-none-any.whl (6.4 kB)
Requirement already satisfied: numpy>=1.19 in ./envs/x86/lib/python3.9/site-packages (from matplotlib) (1.23.5)
Collecting fonttools>=4.22.0
  Using cached fonttools-4.38.0-py3-none-any.whl (965 kB)
Collecting pyparsing>=2.2.1
  Using cached pyparsing-3.0.9-py3-none-any.whl (98 kB)
Collecting pillow>=6.2.0
  Using cached Pillow-9.4.0-2-cp39-cp39-macosx_10_10_x86_64.whl (3.3 MB)
Collecting python-dateutil>=2.7
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting kiwisolver>=1.0.1
  Using cached kiwisolver-1.4.4-cp39-cp39-macosx_10_9_x86_64.whl (65 kB)
Requirement already satisfied: six>=1.5 in ./envs/x86/lib/python3.9/site-packages (from python-dateutil>=2.7->matplotlib) (1.16.0)
Installing collected packages: python-dateutil, pyparsing, pillow, packaging, kiwisolver, fonttools, cycler, contourpy, matplotlib
Successfully installed contourpy-1.0.7 cycler-0.11.0 fonttools-4.38.0 kiwisolver-1.4.4 matplotlib-3.6.3 packaging-23.0 pillow-9.4.0 pyparsing-3.0.9 python-dateutil-2.8.2
```

This builds the environment. Requiring CONDA_SUBDIR=osx-64 is not ideal, but somewhat reasonable in the preparation stage (which I hide in a Makefile). The slightly convoluted thing I do here is to use pip or another tool like PDM or Poetry to install requirements (in this case matplotlib), and this is somehow causing the issue. If I include these pip dependencies in a - pip: section of the dependencies:, then everything works, but I want to use one of these external tools to manage dependencies in pyproject.toml (see the discussion in #332).

The actual issue arises when I try to run something as a user without setting CONDA_SUBDIR=osx-64.

anaconda-project run shell

```bash
$ anaconda-project run shell
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /Users/mforbes/tmp/python/arm/envs/x86

  added / updated specs:
    - conda-forge::pyfftw


The following packages will be SUPERSEDED by a higher-priority channel:

  openssl            pkgs/main/osx-64::openssl-1.1.1s-hca7~ --> pkgs/main/osx-arm64::openssl-1.1.1s-h1a28f6b_0


Downloading and Extracting Packages

Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
```

When one omits CONDA_SUBDIR=osx-64 (even though it is specified in the environment variables: section), the resolution engine somehow decides that the osx-64 version of openssl should be replaced by the arm64 version. This breaks the environment by mixing architectures:

python -c "import ssl"

```bash
$ python -c "import ssl"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/mforbes/tmp/python/arm/envs/x86/lib/python3.9/ssl.py", line 99, in <module>
    import _ssl             # if we can't import it, let the error propagate
ImportError: dlopen(/Users/mforbes/tmp/python/arm/envs/x86/lib/python3.9/lib-dynload/_ssl.cpython-39-darwin.so, 0x0002): Library not loaded: '@rpath/libssl.1.1.dylib'
  Referenced from: '/Users/mforbes/tmp/python/arm/envs/x86/lib/python3.9/lib-dynload/_ssl.cpython-39-darwin.so'
  Reason: tried: '/Users/mforbes/tmp/python/arm/envs/x86/lib/python3.9/lib-dynload/../../libssl.1.1.dylib' (mach-o file, but is an incompatible architecture (have (arm64), need (x86_64))), '/Users/mforbes/tmp/python/arm/envs/x86/lib/python3.9/lib-dynload/../../libssl.1.1.dylib' (mach-o file, but is an incompatible architecture (have (arm64), need (x86_64))), '/Users/mforbes/tmp/python/arm/envs/x86/bin/../lib/libssl.1.1.dylib' (mach-o file, but is an incompatible architecture (have (arm64), need (x86_64))), '/Users/mforbes/tmp/python/arm/envs/x86/bin/../lib/libssl.1.1.dylib' (mach-o file, but is an incompatible architecture (have (arm64), need (x86_64))), '/usr/local/lib/libssl.1.1.dylib' (no such file), '/usr/lib/libssl.1.1.dylib' (no such file)
```
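One way to confirm that an environment has ended up with mixed architectures is to look at the package URLs that `conda list --explicit` prints, since each URL contains the platform subdir as its second-to-last path component. The following is a minimal sketch; the helper name `list_subdirs` is hypothetical and not part of conda or anaconda-project.

```bash
# Hypothetical helper: read `conda list --explicit` output on stdin and
# print the distinct platform subdirs (noarch is ignored).
list_subdirs() {
    grep -E '^https?://' \
      | awk -F/ '{ print $(NF-1) }' \
      | grep -v '^noarch$' \
      | LC_ALL=C sort -u
}
```

Running `conda list --explicit -p envs/x86 | list_subdirs` on a healthy Rosetta environment should print only `osx-64`; a second line such as `osx-arm64` would indicate the kind of mixing described above.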

Workarounds

There are several workarounds.

  1. Install everything in the dependencies: (or packages:) section of the anaconda-project.yaml file, with pip dependencies in the appropriate section:
    # anaconda-project.yaml
    ...
    dependencies:
    - python=3.9
    - conda-forge::pyfftw
    - pip
    - pip:
      - matplotlib
    ....

    This seems to work.

  2. Make sure that the user sets CONDA_SUBDIR=osx-64 before running anything:
    CONDA_SUBDIR=osx-64 anaconda-project run shell

    This is not good though: I don't want to have to remember which projects require this. The whole point of using anaconda-project for me is that I can just do apr shell and I am good to go. (I alias apr=anaconda-project run).

Let me know if you need more details. This is a pretty elusive bug. I am not sure why the resolution engine is preferring the arm64 version and could imagine that this stops at some point when things are updated. Also, one needs an ARM machine to test:-/

I think the easiest solution is to provide some way of setting the variables defined in variables: before doing the resolution.
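Until something like that exists, a user-side wrapper can approximate it. The sketch below replaces the `apr` alias with a shell function that looks for a `CONDA_SUBDIR:` entry in `anaconda-project.yaml` and sets it in the environment before `anaconda-project` (and therefore conda) runs. The helper `project_subdir` is hypothetical, and it is a line-based match rather than a real YAML parser: it takes the first `CONDA_SUBDIR:` it finds and does not know which env_spec a command uses.

```bash
# Hypothetical helper: print the first CONDA_SUBDIR value found in a project file.
# $1: path to the project file (defaults to ./anaconda-project.yaml).
project_subdir() {
    sed -n 's/^[[:space:]]*CONDA_SUBDIR:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' \
        "${1:-anaconda-project.yaml}" | head -n 1
}

# Hypothetical replacement for `alias apr="anaconda-project run"`:
# set CONDA_SUBDIR before conda does any resolution, if the project declares one.
apr() {
    local subdir
    subdir="$(project_subdir anaconda-project.yaml 2>/dev/null)"
    if [ -n "$subdir" ]; then
        CONDA_SUBDIR="$subdir" anaconda-project run "$@"
    else
        anaconda-project run "$@"
    fi
}
```

With this in place, `apr shell` in the example project would behave like `CONDA_SUBDIR=osx-64 anaconda-project run shell`, while projects that never mention CONDA_SUBDIR are run unchanged.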

AlbertDeFusco commented 1 year ago

Thanks for the detailed report. Yes, variables: are only set after an environment has been prepared.

It seems that pyfftw has since been built for osx-arm64.

But there are other packages where this will happen. Is this project intended to work only on Mac osx-64? I would worry about forcing CONDA_SUBDIR to be set: if someone on Linux or Windows attempted to run the project, they would have many other issues.
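To illustrate the concern, an override could be limited to the one platform that needs it. This is only a sketch: the function name `subdir_override_for` is hypothetical, and it assumes the override is wanted exactly when `uname -s` reports `Darwin` and `uname -m` reports `arm64`.

```bash
# Hypothetical helper: print the CONDA_SUBDIR override for a platform, or nothing.
# $1: kernel name as printed by `uname -s`
# $2: machine name as printed by `uname -m`
subdir_override_for() {
    if [ "$1" = "Darwin" ] && [ "$2" = "arm64" ]; then
        printf '%s\n' "osx-64"
    fi
}
```

A caller would pass in the live values, for example `subdir_override_for "$(uname -s)" "$(uname -m)"`, and set CONDA_SUBDIR only when the result is non-empty, so Linux and Windows users are left alone.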

I like your use of a Makefile here. So would it look like this? There may be challenges with Windows here.

ifneq ($(OS), Windows_NT)
  SHELL = /bin/bash
  UNAME_S := $(shell uname -s)
  ifeq ($(UNAME_S), Darwin)
    CONDA_SUBDIR = osx-64
  endif
endif

shell:
    CONDA_SUBDIR=$(CONDA_SUBDIR) anaconda-project run $(SHELL)

AlbertDeFusco commented 1 year ago

You may find that conda-project supports this use case a little better.

For instance conda-project