conda-forge / conda-forge.github.io

The conda-forge website.
https://conda-forge.org
BSD 3-Clause "New" or "Revised" License

Py311_torchvision_issue #2224

Open sanujv opened 4 months ago

sanujv commented 4 months ago

Conda-forge documentation

Installed packages

We are having trouble updating all the modules. The full package list is too long to post here, so please suggest another way to share the details.

# packages in environment at /opt/python/python311:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda/anaconda311/pkgs/cpurepo
_openmp_mutex             4.5                  2_kmp_llvm    conda/anaconda311/pkgs/cpurepo
_py-xgboost-mutex         2.0                       cpu_0    conda/anaconda311/pkgs/cpurepo
_sysroot_linux-64_curr_repodata_hack 3                    h5bd9786_2

Environment info

root@lppbd4937[(IPC1-SLV-PKGMGT-TSTSRV) ~]# /opt/python/python311/bin/conda info

     active environment : None
       user config file : /root/.condarc
 populated config files : /opt/python/python311/.condarc
          conda version : 24.1.2
    conda-build version : not installed
         python version : 3.11.5.final.0
                 solver : libmamba (default)
       virtual packages : __archspec=1=icelake
                          __conda=24.1.2=0
                          __cuda=12.0=0
                          __glibc=2.28=0
                          __linux=4.18.0=0
                          __unix=0=0
       base environment : /opt/python/python311  (writable)
      conda av data dir : /opt/python/python311/etc/conda
  conda av metadata url : None
           channel URLs : https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/common/linux-64
                          https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/common/noarch
                          https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/cpurepo/linux-64
                          https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/cpurepo/noarch
          package cache : /opt/python/python311/pkgs
       envs directories : /opt/python/python311/envs
                          /root/.conda/envs
               platform : linux-64
             user-agent : conda/24.1.2 requests/2.31.0 CPython/3.11.5 Linux/4.18.0-513.24.1.el8_9.x86_64 rhel/8.9 glibc/2.28 solver/libmamba conda-libmamba-solver/23.12.0 libmambapy/1.5.3
                UID:GID : 0:0
             netrc file : None
           offline mode : False

Issue

There are a few modules that we have in python37 and are now testing in python311. They are not working as expected because of cross-dependencies and multiple inconsistencies in a few dependency packages. The following packages are creating conflicts: torchvision, pytorch, icu, scikit-learn, matplotlib-base, krb5. Please help us fix this.

sanujv commented 4 months ago

The info above needs to be updated. We have both CUDA and non-CUDA environments.

/opt/python/python311/bin/conda info

     active environment : None
       user config file : /root/.condarc
 populated config files : /opt/python/python311/.condarc
          conda version : 24.1.2
    conda-build version : not installed
         python version : 3.11.5.final.0
                 solver : libmamba (default)
       virtual packages : __archspec=1=cascadelake
                          __conda=24.1.2=0
                          __glibc=2.28=0
                          __linux=4.18.0=0
                          __unix=0=0
       base environment : /opt/python/python311  (writable)
      conda av data dir : /opt/python/python311/etc/conda
  conda av metadata url : None
           channel URLs : https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/common/linux-64
                          https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/common/noarch
                          https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/cpurepo/linux-64
                          https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/cpurepo/noarch
          package cache : /opt/python/python311/pkgs
       envs directories : /opt/python/python311/envs
                          /root/.conda/envs
               platform : linux-64
             user-agent : conda/24.1.2 requests/2.31.0 CPython/3.11.5 Linux/4.18.0-513.24.1.el8_9.x86_64 rhel/8.9 glibc/2.28 solver/libmamba conda-libmamba-solver/24.1.0 libmambapy/1.5.6
                UID:GID : 0:0
             netrc file : None
           offline mode : False

hmaarrfk commented 4 months ago

Please provide the full output of the command conda list.

sanujv commented 4 months ago

conda-list.txt

sanujv commented 4 months ago

Attached the complete list

hmaarrfk commented 4 months ago

Since this is an open source, volunteer-run project, I don't think you'll be able to get much help troubleshooting anything you've downloaded behind your mirror.

As previously stated in https://github.com/conda-forge/conda-forge.github.io/issues/2220#issuecomment-2218566018, the main question is whether or not you can recreate the issue with the commands:

mamba create --channel conda-forge --override-channels --name test python=3.11 "tensorflow>=2.16.2" torchvision
mamba activate test
python -c "SOME CODE THAT CREATES A CRASH"

which force the creation of an environment using only the conda-forge channel.

You've also not given any details as to what "doesn't work".

You really need to be extra specific about what you need help with, and provide us with step-by-step instructions to recreate your full environment, if you expect any kind of resolution. I simply can't reach your 10.16.126.163 server, so it is REALLY difficult for me to troubleshoot anything.

I've tried:

$ python -c "import tensorflow; print(tensorflow.__version__); import torchvision; print(torchvision.__version__)"
2024-07-13 09:04:46.299750: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:10575] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-07-13 09:04:46.299802: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:479] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-07-13 09:04:46.301630: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1442] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-07-13 09:04:46.307905: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2.16.2
0.18.1a0+405940f

and as you can see, both versions print.

What I do notice from your conda list is that you are using a package called pytorch-mutex. This package comes from the pytorch channel, and it is not one that we have any control over.

https://anaconda.org/search?q=pytorch-mutex

Over the years we have attempted to collaborate with the pytorch team on this, and we collectively haven't agreed on package names, so pytorch-mutex is a good telltale sign.

If you can recreate the issue with only the conda-forge channel enabled, then I think we can look into this further.
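A minimal sketch of checking which configured channels are not conda-forge before attempting the reproduction. It assumes that the parsed output of `conda info --json` carries a `channels` list of URLs. The function name is hypothetical, and the path-segment test is only a heuristic, since a private mirror could also be named conda-forge. The mirror URLs in the sample are the ones reported earlier in this thread.

```python
from urllib.parse import urlparse


def non_conda_forge_channels(info: dict) -> list[str]:
    """Return configured channel URLs that have no `conda-forge` path segment."""
    flagged = []
    for url in info.get("channels", []):
        segments = urlparse(url).path.split("/")
        if "conda-forge" not in segments:
            flagged.append(url)
    return flagged


# Stand-in for json.loads(<output of `conda info --json`>).
# Assumption: the real output exposes the channel URLs under "channels".
sample_info = {
    "channels": [
        "https://conda.anaconda.org/conda-forge/linux-64",
        "https://conda.anaconda.org/conda-forge/noarch",
        "https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/cpurepo/linux-64",
        "https://10.16.126.163/install/repositories/conda/anaconda311/pkgs/cpurepo/noarch",
    ]
}

for url in non_conda_forge_channels(sample_info):
    print("not conda-forge:", url)
```

An empty result would suggest that the environment resolves against conda-forge only, which is the situation the maintainers ask for.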

sanujv commented 4 months ago

Hi,

The IP 10.16.126.163 is our local repository, created from packages downloaded from conda-forge and anaconda. It won't be accessible from outside our network. The issue we are facing is that we want to install a few modules, such as torchvision, on our server, and the installation fails with the error below. Even though the required dependency is available in the repo, the installation does not succeed.

We need help getting this sorted.

Solving environment: failed

LibMambaUnsatisfiableError: Encountered problems while solving:

Could not solve for environment specs
The following package could not be installed
└─ torchvision is not installable because it requires
   └─ pytorch 2.3. cpu but there are no viable options
      ├─ pytorch 2.3.0 would require
      │  └─ libtorch 2.3.0. , which requires
      │     └─ pytorch 2.3.0 cpugeneric_0, which can be installed;
      └─ pytorch 2.3.0 would require
         └─ nomkl, which does not exist (perhaps a missing channel).

/opt/python/python311/bin/conda search torch
Loading channels: done
No match found for: torch. Search: torch

Name                          Version   Build                        Channel
libopenvino-pytorch-frontend  2024.0.0  he02047a_5                   conda/anaconda311/pkgs/cpurepo
libopenvino-pytorch-frontend  2024.1.0  he02047a_7                   conda/anaconda311/pkgs/cpurepo
libtorch                      2.3.0     cpu_generic_h0ec3652_0       conda/anaconda311/pkgs/cpurepo
libtorch                      2.3.1     cpu_generic_h970db74_0       conda/anaconda311/pkgs/cpurepo
pytorch                       2.3.0     cpu_generic_py311h8ca351a_1  conda/anaconda311/pkgs/cpurepo
pytorch                       2.3.0     cpu_mkl_py311hcb16b95_101    conda/anaconda311/pkgs/cpurepo
pytorch-mutex                 1.0       cpu                          conda/anaconda311/pkgs/cpurepo
pytorch-pretrained-bert       0.6.2     py311h38be061_1              conda/anaconda311/pkgs/cpurepo
torchtext                     0.15.2    py311h140f690_5              conda/anaconda311/pkgs/cpurepo
torchvision                   0.18.1    cpu_py311hf0a5325_1          conda/anaconda311/pkgs/cpurepo
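The solver message above reports that one pytorch 2.3.0 build requires nomkl, which does not exist in the configured channels. A rough sketch of how such gaps could be listed for a mirror follows. It assumes the mirror's `repodata.json` files (one per subdir, for example linux-64 and noarch) have already been downloaded and parsed, and that each record under `packages` or `packages.conda` has a `name` and a `depends` list whose entries begin with the dependency name. The function name and the sample records are hypothetical and are not taken from the mirror in this thread.

```python
def missing_dependency_names(*repodatas: dict) -> dict[str, set[str]]:
    """Map each dependency name that no indexed package provides
    to the names of the packages that require it."""
    records = []
    for repodata in repodatas:
        records += list(repodata.get("packages", {}).values())
        records += list(repodata.get("packages.conda", {}).values())

    available = {record["name"] for record in records}
    missing: dict[str, set[str]] = {}
    for record in records:
        for spec in record.get("depends", []):
            dep_name = spec.split()[0]  # assumed form: "name [version [build]]"
            # Names starting with "__" are virtual packages supplied by conda itself.
            if dep_name.startswith("__") or dep_name in available:
                continue
            missing.setdefault(dep_name, set()).add(record["name"])
    return missing


# Hypothetical, heavily trimmed index used only to illustrate the check.
sample_repodata = {
    "packages.conda": {
        "pytorch-example.conda": {
            "name": "pytorch",
            "depends": ["libtorch 2.3.0", "nomkl", "python >=3.11", "__glibc >=2.17"],
        },
        "libtorch-example.conda": {"name": "libtorch", "depends": []},
        "python-example.conda": {"name": "python", "depends": []},
    }
}

for dep, needed_by in missing_dependency_names(sample_repodata).items():
    print(f"{dep} is required by {sorted(needed_by)} but is not in the index")
```

Passing every subdir's index in one call matters, because a dependency missing from linux-64 may still be provided by noarch. The sketch compares names only and ignores version and build constraints, so it can miss conflicts that the real solver would report.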

sanujv commented 4 months ago

If you still need any clarification on this issue, please schedule a call so that we can discuss it further.

carterbox commented 4 months ago

The issue we are facing is, we want to install few modules like , torchvision in our server

We will not help debug a private mirror of conda-forge, and we do not support mixing conda-forge with other channels.

sanujv commented 4 months ago

We are unable to recreate the issue with the command below.

mamba create --channel conda-forge --override-channels --name test python=3.11 "tensorflow>=2.16.2" torchvision

mamba create --channel conda-forge --override-channels --name test python=3.11 "tensorflow>=2.16.2" torchvision
-bash: mamba: command not found

/opt/python/python311/bin/mamba install --channel conda-forge --override-channels --name test
Not a conda environment: /opt/python/python311/envs/test

EnvironmentLocationNotFound: Not a conda environment: /opt/python/python311/envs/test

We installed the mamba package from conda-forge. The issue we are facing is that, after installing tensorflow, we are unable to install torchvision. Both of them are from conda-forge. Currently there are only a minimal number of modules on the server.

/opt/python/python311/bin/conda install torchvision
Channels:

LibMambaUnsatisfiableError: Encountered problems while solving:

Could not solve for environment specs
The following package could not be installed
└─ torchvision is not installable because it requires
   └─ pytorch 2.3. cpu but there are no viable options
      ├─ pytorch 2.3.0 would require
      │  └─ libtorch 2.3.0. , which requires
      │     └─ pytorch 2.3.0 cpugeneric_0, which can be installed;
      └─ pytorch 2.3.0 would require
         └─ nomkl, which does not exist (perhaps a missing channel).

Attaching the current conda list: base_and_tensorflow.txt

sanujv commented 4 months ago

Hi Team,

Please respond

hmaarrfk commented 3 months ago

You may use conda instead of mamba with the command:

conda create --channel conda-forge --override-channels --name test python=3.11 "tensorflow>=2.16.2" torchvision

However, we cannot support channels other than conda-forge, nor a channel we have no visibility into.
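For scripted reproduction, the same fallback from mamba to conda can be sketched as follows. This is only an illustration: the helper name is hypothetical, the flags are the ones already quoted in this thread, and the command is printed rather than executed.

```python
import shlex
import shutil


def build_create_command(env_name: str, specs: list[str]) -> list[str]:
    """Build a conda-forge-only `create` command, preferring mamba when it is on PATH."""
    tool = shutil.which("mamba") or shutil.which("conda") or "conda"
    return [
        tool, "create",
        "--channel", "conda-forge",
        "--override-channels",
        "--name", env_name,
        *specs,
    ]


command = build_create_command("test", ["python=3.11", "tensorflow>=2.16.2", "torchvision"])
# shlex.join quotes arguments such as "tensorflow>=2.16.2" so the line is safe to paste into a shell.
print(shlex.join(command))
```

Note that `create` is the subcommand that makes the new environment. The earlier attempt in this thread used `install --name test`, which expects the environment to exist already.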