NVIDIA / tensorflow

An Open Source Machine Learning Framework for Everyone
https://developer.nvidia.com/deep-learning-frameworks
Apache License 2.0

latest release on nvidia-pyindex is broken #85

Closed: bertsky closed this issue 1 year ago

bertsky commented 1 year ago

System information

Describe the problem

When installing via pip:

pip install nvidia-pyindex && pip install nvidia_tensorflow

I end up with:

pip install nvidia_tensorflow==1.15.5+nv23.3
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting nvidia_tensorflow==1.15.5+nv23.3
  Downloading https://developer.download.nvidia.com/compute/redist/nvidia-tensorflow/nvidia_tensorflow-1.15.5%2Bnv23.03-7472065-cp38-cp38-linux_x86_64.whl (397.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 397.8/397.8 MB 313.4 MB/s eta 0:00:00
Collecting keras-applications>=1.0.8
  Downloading Keras_Applications-1.0.8-py3-none-any.whl (50 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.7/50.7 kB 41.5 MB/s eta 0:00:00
Collecting tensorrt~=8.5
  Downloading tensorrt-8.6.0-cp38-none-manylinux_2_17_x86_64.whl (819.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 819.2/819.2 MB 167.6 MB/s eta 0:00:00
Collecting nvidia-cusparse-cu12~=12.0
  Downloading nvidia_cusparse_cu12-12.0.2.55-py3-none-manylinux1_x86_64.whl (188.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 188.8/188.8 MB 167.6 MB/s eta 0:00:00
Collecting protobuf<4.0.0,>=3.6.1
  Downloading protobuf-3.20.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 403.7 MB/s eta 0:00:00
Requirement already satisfied: wheel>=0.26 in /usr/local/sub-venv/headless-tf1/lib/python3.8/site-packages (from nvidia_tensorflow==1.15.5+nv23.3) (0.40.0)
Collecting keras-preprocessing>=1.0.5
  Downloading Keras_Preprocessing-1.1.2-py2.py3-none-any.whl (42 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.6/42.6 kB 183.9 MB/s eta 0:00:00
Collecting tensorboard<1.16.0,>=1.15.0
  Downloading https://developer.download.nvidia.com/compute/redist/tensorboard/tensorboard-1.15.0-py2.py3-none-any.whl (1.6 kB)
Collecting astor==0.8.1
  Downloading astor-0.8.1-py2.py3-none-any.whl (27 kB)
Collecting nvidia-cuda-cupti-cu12~=12.1
  Downloading nvidia_cuda_cupti_cu12-12.1.62-py3-none-manylinux1_x86_64.whl (14.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 112.3 MB/s eta 0:00:00
Collecting numpy<1.24,>=1.22.0
  Downloading numpy-1.23.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.1/17.1 MB 151.4 MB/s eta 0:00:00
Collecting absl-py>=0.9.0
  Downloading absl_py-1.4.0-py3-none-any.whl (126 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.5/126.5 kB 243.6 MB/s eta 0:00:00
Collecting tensorflow-estimator==1.15.1
  Downloading tensorflow_estimator-1.15.1-py2.py3-none-any.whl (503 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 503.4/503.4 kB 254.4 MB/s eta 0:00:00
Collecting nvidia-curand-cu12~=10.3
  Downloading nvidia_curand_cu12-10.3.2.56-py3-none-manylinux1_x86_64.whl (56.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 206.0 MB/s eta 0:00:00
Collecting nvidia-cuda-nvcc-cu12~=12.1
  Downloading nvidia_cuda_nvcc_cu12-12.1.66-py3-none-manylinux1_x86_64.whl (20.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.0/20.0 MB 103.3 MB/s eta 0:00:00
Collecting termcolor>=1.1.0
  Downloading termcolor-2.2.0-py3-none-any.whl (6.6 kB)
Collecting astunparse==1.6.3
  Downloading astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Collecting nvidia-cudnn-cu12~=8.8
  Downloading nvidia_cudnn_cu12-8.8.1.3-py3-none-manylinux1_x86_64.whl (718.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 718.4/718.4 MB 177.8 MB/s eta 0:00:00
Requirement already satisfied: six>=1.10.0 in /usr/local/sub-venv/headless-tf1/lib/python3.8/site-packages (from nvidia_tensorflow==1.15.5+nv23.3) (1.16.0)
Collecting opt-einsum>=2.3.2
  Downloading opt_einsum-3.3.0-py3-none-any.whl (65 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.5/65.5 kB 145.1 MB/s eta 0:00:00
Collecting grpcio>=1.8.6
  Downloading grpcio-1.53.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.0/5.0 MB 226.0 MB/s eta 0:00:00
Collecting gast==0.3.3
  Downloading gast-0.3.3-py2.py3-none-any.whl (9.7 kB)
Collecting google-pasta>=0.1.6
  Downloading google_pasta-0.2.0-py3-none-any.whl (57 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.5/57.5 kB 197.3 MB/s eta 0:00:00
Requirement already satisfied: wrapt>=1.11.1 in /usr/local/sub-venv/headless-tf1/lib/python3.8/site-packages (from nvidia_tensorflow==1.15.5+nv23.3) (1.15.0)
Collecting nvidia-cuda-runtime-cu12~=12.1
  Downloading nvidia_cuda_runtime_cu12-12.1.55-py3-none-manylinux1_x86_64.whl (823 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.5/823.5 kB 360.6 MB/s eta 0:00:00
Collecting nvidia-cufft-cu12~=11.0
  Downloading nvidia_cufft_cu12-11.0.2.4-py3-none-manylinux1_x86_64.whl (121.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 170.8 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12~=11.4
  Downloading nvidia_cusolver_cu12-11.4.4.55-py3-none-manylinux1_x86_64.whl (131.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 131.0/131.0 MB 141.5 MB/s eta 0:00:00
Collecting nvidia-dali-nvtf-plugin==1.23.0+nv23.03
  Downloading https://developer.download.nvidia.com/compute/redist/nvidia-dali-nvtf-plugin/nvidia_dali_nvtf_plugin-1.23.0%2Bnv23.03-7472065-cp38-cp38-linux_x86_64.whl (127 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 127.2/127.2 kB 235.2 MB/s eta 0:00:00
Collecting h5py==2.10.0
  Downloading h5py-2.10.0-cp38-cp38-manylinux1_x86_64.whl (2.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.9/2.9 MB 213.1 MB/s eta 0:00:00
Collecting nvidia-cublas-cu12~=12.1
  Downloading nvidia_cublas_cu12-12.1.0.26-py3-none-manylinux1_x86_64.whl (379.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 379.5/379.5 MB 169.4 MB/s eta 0:00:00
ERROR: Could not find a version that satisfies the requirement nvidia-nccl-cu12~=2.17 (from nvidia-tensorflow) (from versions: 0.0.1.dev5, 2.16.2, 2.16.5)
ERROR: No matching distribution found for nvidia-nccl-cu12~=2.17

The problem goes away when downgrading to the previous release, nvidia_tensorflow==1.15.5+nv23.2

Provide the exact sequence of commands / steps that you executed before running into the problem

See above.

Any other info / logs

The environment is a Docker container based on nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04.

nluehr commented 1 year ago

Thanks for reporting. It seems nvidia-nccl-cu12 2.17 has not been published to PyPI yet. I'll look into why.
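For readers wondering why pip rejects the 2.16.x wheels: the requirement `nvidia-nccl-cu12~=2.17` uses the "compatible release" operator, which is roughly `>=2.17, ==2.*`. The following is a minimal sketch of that rule for plain numeric versions only, not pip's actual resolver; the helper names `release_tuple` and `compatible` are made up for illustration, and non-numeric versions such as dev releases are simply treated as non-matching here.

```python
def release_tuple(version):
    """Parse 'X.Y.Z' into a tuple of ints; return None for anything else."""
    parts = version.split(".")
    if not all(p.isdigit() for p in parts):
        return None  # e.g. '0.0.1.dev5' is skipped in this simplified sketch
    return tuple(int(p) for p in parts)


def compatible(candidate, spec):
    """Approximate '~=spec': same prefix as spec (minus its last part) and >= spec."""
    cand = release_tuple(candidate)
    base = release_tuple(spec)
    if cand is None or base is None or len(base) < 2:
        return False
    # Pad the candidate with zeros so that '2' compares like '2.0'.
    cand = cand + (0,) * (len(base) - len(cand))
    prefix = base[:-1]
    return cand[:len(prefix)] == prefix and cand >= base


# Versions pip listed as available in the error message above.
available = ["0.0.1.dev5", "2.16.2", "2.16.5"]
matches = [v for v in available if compatible(v, "2.17")]
print(matches)  # empty list: none of the published versions satisfies ~=2.17
```

Under these assumptions, any 2.17.x wheel would satisfy the requirement once published, while 2.16.5 never can.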

bertsky commented 1 year ago

I just noticed there's a discrepancy between the package index and the GitHub release – the latter still has 1.15.5+nv23.02 as tip.

(Perhaps someone forgot to push the 23.03 branch?)

nluehr commented 1 year ago

@bertsky the 23.03 branch has now been published. Just waiting on NCCL wheels now...

bertsky commented 1 year ago

It worked – thanks!