ai4d-iasc / trixie

Scripts and documentation about trixie hpc

Installing mxnet in a conda environment #57

Closed SamuelLarkin closed 3 years ago

SamuelLarkin commented 3 years ago

Hi, I'm trying to create a conda environment for Masked Language Model Scoring, which requires pytorch and mxnet-cu101mkl as documented (pip install torch mxnet-cu101mkl).

Here are the commands I'm running:

conda create --name mlm-scoring python=3.8
conda activate mlm-scoring
conda install cudatoolkit=10.1 cudnn
conda install regex

pip install mxnet-cu101mkl

But when I run pip install mxnet-cu101mkl, I get the following error message, most likely because of a mismatch between CentOS and the available mxnet packages.

Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/avx512, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/avx2, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
ERROR: Could not find a version that satisfies the requirement mxnet-cu101mkl
ERROR: No matching distribution found for mxnet-cu101mkl
(mlm-scoring) [larkins@hn2 mlm-scoring]$ which pip
/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/mlm-scoring/bin/pip

Even if I try to specify a version, it fails.

pip install mxnet-cu101mkl==1.6.0
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/avx512, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/avx2, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
ERROR: Could not find a version that satisfies the requirement mxnet-cu101mkl==1.6.0
ERROR: No matching distribution found for mxnet-cu101mkl==1.6.0

Why can't I install mxnet anymore?

I used to be able to install mxnet as I have a conda environment named sockeye-1.18.115_cu101 which contains mxnet-cu101mkl==1.6.0. Here's the output of conda env export --name sockeye-1.18.115_cu101:

name: sockeye-1.18.115_cu101
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _tflow_select=2.3.0=mkl
  - absl-py=0.9.0=py37_0
  - alabaster=0.7.12=py37_0
  - asn1crypto=1.3.0=py37_0
  - astor=0.8.0=py37_0
  - babel=2.8.0=py_0
  - blas=1.0=mkl
  - blinker=1.4=py37_0
  - c-ares=1.15.0=h7b6447c_1001
  - ca-certificates=2020.1.1=0
  - cachetools=3.1.1=py_0
  - certifi=2020.4.5.1=py37_0
  - cffi=1.14.0=py37h2e261b9_0
  - chardet=3.0.4=py37_1003
  - click=7.1.1=py_0
  - cryptography=2.8=py37h1ba5d50_0
  - cudatoolkit=10.1.243=h6bb024c_0
  - cudnn=7.6.5=cuda10.1_0
  - cycler=0.10.0=py37_0
  - dbus=1.13.12=h746ee38_0
  - docutils=0.16=py37_0
  - expat=2.2.6=he6710b0_0
  - fontconfig=2.13.0=h9420a91_0
  - freetype=2.9.1=h8a8886c_1
  - gast=0.2.2=py37_0
  - glib=2.63.1=h5a9c865_0
  - google-auth=1.13.1=py_0
  - google-auth-oauthlib=0.4.1=py_2
  - google-pasta=0.2.0=py_0
  - grpcio=1.27.2=py37hf8bcb03_0
  - gst-plugins-base=1.14.0=hbbd80ab_1
  - gstreamer=1.14.0=hb453b48_1
  - h5py=2.10.0=py37h7918eee_0
  - hdf5=1.10.4=hb1b8bf9_0
  - icu=58.2=h9c2bf20_1
  - idna=2.9=py_1
  - imagesize=1.2.0=py_0
  - intel-openmp=2020.0=166
  - jinja2=2.11.1=py_0
  - jpeg=9b=h024ee3a_2
  - keras-applications=1.0.8=py_0
  - keras-preprocessing=1.1.0=py_1
  - kiwisolver=1.1.0=py37he6710b0_0
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.37=hbc83047_0
  - libprotobuf=3.11.4=hd408876_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - libuuid=1.0.3=h1bed415_2
  - libxcb=1.13=h1bed415_1
  - libxml2=2.9.9=hea5a465_1
  - markdown=3.1.1=py37_0
  - markupsafe=1.1.1=py37h7b6447c_0
  - matplotlib=3.1.3=py37_0
  - matplotlib-base=3.1.3=py37hef1b27d_0
  - mkl=2020.0=166
  - mkl-service=2.3.0=py37he904b0f_0
  - mkl_fft=1.0.15=py37ha843d7b_0
  - mkl_random=1.1.0=py37hd6b4f25_0
  - ncurses=6.2=he6710b0_0
  - numpy=1.18.1=py37h4f9e942_0
  - numpy-base=1.18.1=py37hde5b4d6_1
  - oauthlib=3.1.0=py_0
  - openssl=1.1.1f=h7b6447c_0
  - opt_einsum=3.1.0=py_0
  - packaging=20.3=py_0
  - pcre=8.43=he6710b0_0
  - pip=20.0.2=py37_1
  - protobuf=3.11.4=py37he6710b0_0
  - pyasn1=0.4.8=py_0
  - pyasn1-modules=0.2.7=py_0
  - pycparser=2.20=py_0
  - pygments=2.6.1=py_0
  - pyjwt=1.7.1=py37_0
  - pyopenssl=19.1.0=py37_0
  - pyparsing=2.4.6=py_0
  - pyqt=5.9.2=py37h05f1152_2
  - pysocks=1.7.1=py37_0
  - python=3.7.7=hcf32534_0_cpython
  - python-dateutil=2.8.1=py_0
  - pytz=2019.3=py_0
  - qt=5.9.7=h5867ecd_1
  - readline=8.0=h7b6447c_0
  - requests=2.23.0=py37_0
  - requests-oauthlib=1.3.0=py_0
  - rsa=4.0=py_0
  - scipy=1.4.1=py37h0b6359f_0
  - setuptools=46.1.3=py37_0
  - sip=4.19.8=py37hf484d3e_0
  - six=1.14.0=py37_0
  - snowballstemmer=2.0.0=py_0
  - sphinx=2.4.4=py_0
  - sphinxcontrib-applehelp=1.0.2=py_0
  - sphinxcontrib-devhelp=1.0.2=py_0
  - sphinxcontrib-htmlhelp=1.0.3=py_0
  - sphinxcontrib-jsmath=1.0.1=py_0
  - sphinxcontrib-qthelp=1.0.3=py_0
  - sphinxcontrib-serializinghtml=1.1.4=py_0
  - sqlite=3.31.1=h7b6447c_0
  - tensorboard=2.1.0=py3_0
  - tensorflow=2.1.0=mkl_py37h80a91df_0
  - tensorflow-base=2.1.0=mkl_py37h6d63fb7_0
  - tensorflow-estimator=2.1.0=pyhd54b08b_0
  - termcolor=1.1.0=py37_1
  - tk=8.6.8=hbc83047_0
  - tornado=6.0.4=py37h7b6447c_1
  - urllib3=1.25.8=py37_0
  - werkzeug=1.0.1=py_0
  - wheel=0.34.2=py37_0
  - wrapt=1.12.1=py37h7b6447c_1
  - xz=5.2.4=h14c3975_4
  - zlib=1.2.11=h7b6447c_3
  - pip:
    - dataclasses-json==0.5.2
    - decorator==4.4.2
    - marshmallow==3.10.0
    - marshmallow-enum==1.5.1
    - more-itertools==8.6.0
    - mxboard==0.1.0
    - mxnet-cu101mkl==1.6.0
    - mxnet-mkl==1.5.1
    - mypy-extensions==0.4.3
    - networkx==2.0
    - pillow==7.1.1
    - portalocker==1.7.0
    - pudb==2019.2
    - python-graphviz==0.8.4
    - sacrebleu==1.3.6
    - sentencepiece==0.1.85
    - sockeye==1.18.115
    - stringcase==1.2.0
    - subword-nmt==0.3.7
    - tabulate==0.8.7
    - tqdm==4.54.1
    - typing==3.7.4.1
    - typing-extensions==3.7.4.3
    - typing-inspect==0.6.0
    - urwid==2.1.0
prefix: /gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-1.18.115_cu101

fieldsa commented 3 years ago

For some reason, unsetting the PYTHONPATH environment variable before activating the conda environment is enough to make this pip install succeed under conda. I was able to replicate the original issue: the command (pip install mxnet-cu101mkl) fails when it searches the Compute Canada PYTHONPATH site-packages first.

[fieldsa@cn101 ~]$ module load cuda/10.1 cudnn anaconda
Anaconda3-2020.11: Please don't forget to type: . activate

Due to MODULEPATH changes, the following have been reloaded:
  1) openmpi/3.1.2

[fieldsa@cn101 ~]$ echo $PYTHONPATH   
/cvmfs/soft.computecanada.ca/custom/python/site-packages
[fieldsa@cn101 ~]$ unset PYTHONPATH                                         #<=====
[fieldsa@cn101 ~]$ conda create --name mlm-scoring-py38 python=3.8
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.9.2
  latest version: 4.10.1

Please update conda by running

    $ conda update -n base conda

## Package Plan ##

  environment location: /home/fieldsa/.conda/envs/mlm-scoring-py38

  added / updated specs:
    - python=3.8

The following NEW packages will be INSTALLED:

  _libgcc_mutex      conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
  _openmp_mutex      conda-forge/linux-64::_openmp_mutex-4.5-1_gnu
  ca-certificates    conda-forge/linux-64::ca-certificates-2020.12.5-ha878542_0
  certifi            conda-forge/linux-64::certifi-2020.12.5-py38h578d9bd_1
  ld_impl_linux-64   conda-forge/linux-64::ld_impl_linux-64-2.35.1-hea4e1c9_2
  libffi             conda-forge/linux-64::libffi-3.3-h58526e2_2
  libgcc-ng          conda-forge/linux-64::libgcc-ng-9.3.0-h2828fa1_19
  libgomp            conda-forge/linux-64::libgomp-9.3.0-h2828fa1_19
  libstdcxx-ng       conda-forge/linux-64::libstdcxx-ng-9.3.0-h6de172a_19
  ncurses            conda-forge/linux-64::ncurses-6.2-h58526e2_4
  openssl            conda-forge/linux-64::openssl-1.1.1k-h7f98852_0
  pip                conda-forge/noarch::pip-21.1.1-pyhd8ed1ab_0
  python             conda-forge/linux-64::python-3.8.10-h49503c6_1_cpython
  python_abi         conda-forge/linux-64::python_abi-3.8-1_cp38
  readline           conda-forge/linux-64::readline-8.1-h46c0cb4_0
  setuptools         conda-forge/linux-64::setuptools-49.6.0-py38h578d9bd_3
  sqlite             conda-forge/linux-64::sqlite-3.35.5-h74cdb3f_0
  tk                 conda-forge/linux-64::tk-8.6.10-h21135ba_1
  wheel              conda-forge/noarch::wheel-0.36.2-pyhd3deb0d_0
  xz                 conda-forge/linux-64::xz-5.2.5-h516909a_1
  zlib               conda-forge/linux-64::zlib-1.2.11-h516909a_1010

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use   
#
#     $ conda activate mlm-scoring-py38
#
# To deactivate an active environment, use
#
#     $ conda deactivate

[fieldsa@cn101 ~]$ conda activate mlm-scoring-py38
(mlm-scoring-py38) [fieldsa@cn101 ~]$ pip install mxnet-cu101mkl
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/avx512, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/avx2, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/nix/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Collecting mxnet-cu101mkl
  Using cached mxnet_cu101mkl-1.6.0.post0-py2.py3-none-manylinux1_x86_64.whl (712.3 MB)
Collecting requests<3,>=2.20.0
  Using cached requests-2.25.1-py2.py3-none-any.whl (61 kB)
Collecting graphviz<0.9.0,>=0.8.1
  Using cached graphviz-0.8.4-py2.py3-none-any.whl (16 kB)
Collecting numpy<2.0.0,>1.16.0
  Downloading numpy-1.20.3-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.4 MB)
     |████████████████████████████████| 15.4 MB 4.6 MB/s 
Collecting idna<3,>=2.5
  Using cached idna-2.10-py2.py3-none-any.whl (58 kB)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.4-py2.py3-none-any.whl (153 kB)
Collecting chardet<5,>=3.0.2
  Using cached chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Requirement already satisfied: certifi>=2017.4.17 in ./.conda/envs/mlm-scoring-py38/lib/python3.8/site-packages (from requests<3,>=2.20.0->mxnet-cu101mkl) (2020.12.5)
Installing collected packages: urllib3, idna, chardet, requests, numpy, graphviz, mxnet-cu101mkl
Successfully installed chardet-4.0.0 graphviz-0.8.4 idna-2.10 mxnet-cu101mkl-1.6.0.post0 numpy-1.20.3 requests-2.25.1 urllib3-1.26.4
(mlm-scoring-py38) [fieldsa@cn101 ~]$ conda list
# packages in environment at /home/fieldsa/.conda/envs/mlm-scoring-py38:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
certifi                   2020.12.5        py38h578d9bd_1    conda-forge
chardet                   4.0.0                    pypi_0    pypi
idna                      2.10                     pypi_0    pypi
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_19    conda-forge
libgomp                   9.3.0               h2828fa1_19    conda-forge
libstdcxx-ng              9.3.0               h6de172a_19    conda-forge
mxnet-cu101mkl            1.6.0.post0              pypi_0    pypi
ncurses                   6.2                  h58526e2_4    conda-forge
numpy                     1.20.3                   pypi_0    pypi
openssl                   1.1.1k               h7f98852_0    conda-forge
pip                       21.1.1             pyhd8ed1ab_0    conda-forge
python                    3.8.10          h49503c6_1_cpython    conda-forge
python-graphviz           0.8.4                    pypi_0    pypi
python_abi                3.8                      1_cp38    conda-forge
readline                  8.1                  h46c0cb4_0    conda-forge
requests                  2.25.1                   pypi_0    pypi
setuptools                49.6.0           py38h578d9bd_3    conda-forge
sqlite                    3.35.5               h74cdb3f_0    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
urllib3                   1.26.4                   pypi_0    pypi
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
(mlm-scoring-py38) [fieldsa@cn101 ~]$ 
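
The workaround in the session above can also be scoped to a single command instead of the whole shell. This is a minimal sketch, assuming that removing PYTHONPATH for the pip process alone has the same effect as unsetting it before activation, which was not verified in this thread. The name without_pythonpath is a hypothetical helper, not part of conda or pip.

```shell
# Hypothetical helper: run one command with PYTHONPATH removed from its
# environment, leaving the calling shell's PYTHONPATH untouched.
without_pythonpath() {
    # env -u drops the variable only for the child process
    env -u PYTHONPATH "$@"
}

# Example usage (assumes the conda environment is already activated):
#   without_pythonpath pip install mxnet-cu101mkl
```

Scoping the change this way keeps PYTHONPATH available to other tools in the session that may rely on the site-wide setting.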

SamuelLarkin commented 3 years ago

So the question is: why does conda restrict itself to PYTHONPATH to find the package, and why doesn't it go to the web to fetch it?

fieldsa commented 3 years ago

Good question. It may be best answered by the Anaconda project. I don't have a ready explanation for how the environment settings behave in this case, except that they are inherited from conda or its default configuration.
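
One way to narrow this down is to look at the settings pip actually sees. The following is a rough diagnostic sketch, not an explanation of the cause: it prints PYTHONPATH, any PIP_* environment variables (pip reads options such as PIP_FIND_LINKS or PIP_CONFIG_FILE from the environment), and the output of pip config list when pip is available. The function name show_pip_sources is hypothetical.

```shell
# Hypothetical helper: print the settings that can influence where pip
# looks for packages.
show_pip_sources() {
    echo "PYTHONPATH=${PYTHONPATH:-<unset>}"
    # pip maps PIP_* environment variables onto its command-line options
    env | grep '^PIP_' || echo "no PIP_* variables set"
    # pip config list prints the configuration values pip has loaded
    if command -v pip >/dev/null 2>&1; then
        pip config list || true
    fi
}

show_pip_sources
```

Running it once with PYTHONPATH set and once after unset PYTHONPATH, then comparing the two outputs, should show which setting changes between the failing and the working case.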