pytorch / ao

Create and integrate custom data types, layouts and kernels with up to 2x speedups and 65% less VRAM for inference and training
BSD 3-Clause "New" or "Revised" License

Issues with CPU only binaries install #370

Open drisspg opened 2 weeks ago

drisspg commented 2 weeks ago

Summary

Error 1

When running in Colab, trying to install only the CPU version runs into some issues:

%pip install --pre torchao-nightly --index-url https://download.pytorch.org/whl/nightly/ # CPU only builds

This does not work and errors with:

ERROR: Cannot install torchao-nightly==2024.5.22, torchao-nightly==2024.5.23, torchao-nightly==2024.5.24, torchao-nightly==2024.5.25, torchao-nightly==2024.5.26, torchao-nightly==2024.5.27, torchao-nightly==2024.5.28, torchao-nightly==2024.5.29, torchao-nightly==2024.5.30+cpu, torchao-nightly==2024.5.31+cpu, torchao-nightly==2024.6.1+cpu, torchao-nightly==2024.6.10+cpu, torchao-nightly==2024.6.11+cpu, torchao-nightly==2024.6.12+cpu, torchao-nightly==2024.6.13+cpu, torchao-nightly==2024.6.14+cpu, torchao-nightly==2024.6.2+cpu, torchao-nightly==2024.6.3+cpu, torchao-nightly==2024.6.4+cpu, torchao-nightly==2024.6.5+cpu, torchao-nightly==2024.6.6+cpu, torchao-nightly==2024.6.7+cpu, torchao-nightly==2024.6.8+cpu and torchao-nightly==2024.6.9+cpu because these package versions have conflicting dependencies.

The conflict is caused by:
    torchao-nightly 2024.6.14+cpu depends on expecttest
    torchao-nightly 2024.6.13+cpu depends on expecttest
    torchao-nightly 2024.6.12+cpu depends on expecttest
    torchao-nightly 2024.6.11+cpu depends on expecttest
    torchao-nightly 2024.6.10+cpu depends on expecttest
    torchao-nightly 2024.6.9+cpu depends on expecttest
    torchao-nightly 2024.6.8+cpu depends on expecttest
    torchao-nightly 2024.6.7+cpu depends on expecttest
    torchao-nightly 2024.6.6+cpu depends on expecttest
    torchao-nightly 2024.6.5+cpu depends on expecttest
    torchao-nightly 2024.6.4+cpu depends on expecttest
    torchao-nightly 2024.6.3+cpu depends on expecttest
    torchao-nightly 2024.6.2+cpu depends on expecttest
    torchao-nightly 2024.6.1+cpu depends on expecttest
    torchao-nightly 2024.5.31+cpu depends on expecttest
    torchao-nightly 2024.5.30+cpu depends on expecttest
    torchao-nightly 2024.5.29 depends on expecttest
    torchao-nightly 2024.5.28 depends on expecttest
    torchao-nightly 2024.5.27 depends on expecttest
    torchao-nightly 2024.5.26 depends on expecttest
    torchao-nightly 2024.5.25 depends on expecttest
    torchao-nightly 2024.5.24 depends on expecttest
    torchao-nightly 2024.5.23 depends on expecttest
    torchao-nightly 2024.5.22 depends on expecttest

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
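The resolver output above repeats the same complaint for every candidate version. A minimal sketch, assuming the log has been captured as a string, that collapses those lines into the set of unmet dependencies. The function name `summarize_conflicts` is hypothetical and not part of pip:

```python
import re
from collections import defaultdict

# Matches lines such as "torchao-nightly 2024.6.14+cpu depends on expecttest".
_CONFLICT_LINE = re.compile(r"^\s*(\S+)\s+(\S+)\s+depends on\s+(.+?)\s*$")


def summarize_conflicts(log_text):
    """Map each dependency named in a pip conflict report to the versions requiring it."""
    summary = defaultdict(list)
    for line in log_text.splitlines():
        match = _CONFLICT_LINE.match(line)
        if match:
            _package, version, dependency = match.groups()
            summary[dependency].append(version)
    return dict(summary)


# Two lines copied from the report above, used as sample input.
sample = """The conflict is caused by:
    torchao-nightly 2024.6.14+cpu depends on expecttest
    torchao-nightly 2024.5.22 depends on expecttest
"""
print(summarize_conflicts(sample))
```

For the full report in this issue, every version would land under the single key `expecttest`, which points at one missing dependency rather than a genuine version conflict.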

Error 2

Let's say I preinstall the missing packages:

%pip install expecttest
%pip install hypothesis
%pip install --pre torchao-nightly --index-url https://download.pytorch.org/whl/nightly/cpu # CPU only builds

I then get:

Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch->torchao-nightly) (2023.6.0)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch->torchao-nightly)
  ERROR: HTTP error 403 while getting https://download.pytorch.org/whl/nightly/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (from https://download.pytorch.org/whl/nightly/cpu/nvidia-cuda-nvrtc-cu12/)
ERROR: Could not install requirement nvidia-cuda-nvrtc-cu12==12.1.105 from https://download.pytorch.org/whl/nightly/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (from torch->torchao-nightly) because of HTTP error 403 Client Error: Forbidden for url: https://download.pytorch.org/whl/nightly/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl for URL https://download.pytorch.org/whl/nightly/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (from https://download.pytorch.org/whl/nightly/cpu/nvidia-cuda-nvrtc-cu12/)

It looks like it's trying to fetch NVIDIA-related packages even though I am only trying to get the CPU build.

If I run:

%pip install expecttest
%pip install hypothesis
%pip install --pre torchao-nightly --index-url https://download.pytorch.org/whl/nightly/

things work, but this installs the NVIDIA runtime libraries.
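One way to confirm whether an install pulled in NVIDIA runtime wheels is to list the installed distributions whose names start with `nvidia-`. A minimal sketch using the standard library's `importlib.metadata`. The helper name `nvidia_packages` is hypothetical, and the `nvidia-` prefix check is an assumption based on the package name in the log above (`nvidia-cuda-nvrtc-cu12`):

```python
from importlib.metadata import distributions


def nvidia_packages(names):
    """Return the sorted, de-duplicated names that look like NVIDIA runtime wheels."""
    # Normalize case and underscores so "NVIDIA_cublas_cu12" and
    # "nvidia-cublas-cu12" are treated as the same package.
    normalized = {name.lower().replace("_", "-") for name in names if name}
    return sorted(name for name in normalized if name.startswith("nvidia-"))


# Names of every distribution visible to the current interpreter.
installed = [dist.metadata["Name"] for dist in distributions()]
print(nvidia_packages(installed))  # an empty list suggests no NVIDIA wheels are present
```

Running this in the same notebook after each `%pip install` variant would show which of the commands brings the extra libraries along.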

drisspg commented 2 weeks ago

cc @msaroufim @atalman

msaroufim commented 2 weeks ago

FWIW, when I tried the command below, it installed ROCm binaries on my CUDA machine:

pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/

The default behavior is unpredictable, so adding the architecture at the end of the index URL is usually the right thing to do.
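A minimal sketch of that advice: build the install command with an explicit architecture suffix instead of relying on the bare nightly index. The helper `nightly_install_command` is hypothetical. The `cpu` suffix is the one used in this thread, and any other suffix is an assumption that should be checked against the index itself:

```python
import shlex

NIGHTLY_INDEX = "https://download.pytorch.org/whl/nightly"


def nightly_install_command(package, arch):
    """Return a pip command that pins the nightly index to one architecture."""
    if not arch:
        # Refuse to fall back to the bare index, whose default is unpredictable.
        raise ValueError("pass an explicit architecture suffix, e.g. 'cpu'")
    index_url = f"{NIGHTLY_INDEX}/{arch.strip('/')}"
    args = ["pip", "install", "--pre", package, "--index-url", index_url]
    return " ".join(shlex.quote(arg) for arg in args)


print(nightly_install_command("torchao-nightly", "cpu"))
```

Rejecting an empty suffix is a deliberate choice here: it turns the silent "whatever the default index serves" case into an explicit error.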

That said, packaging the CUDA runtime libraries does not feel right, so we'll fix that.

We also don't really need expecttest or hypothesis as dependencies, so I just removed those here: https://github.com/pytorch/ao/pull/369

atalman commented 1 week ago

Hi @drisspg, is this a Colab-only issue? I am running on Linux and everything installs correctly:

pip install --pre torchao-nightly --index-url https://download.pytorch.org/whl/nightly/cpu --force-reinstall
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torchao-nightly
  Using cached https://download.pytorch.org/whl/nightly/cpu/torchao_nightly-2024.6.21%2Bcpu-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (273 kB)
Collecting torch==2.5.0.dev20240620 (from torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/cpu/torch-2.5.0.dev20240620%2Bcpu-cp311-cp311-linux_x86_64.whl (195.2 MB)
Collecting filelock (from torch==2.5.0.dev20240620->torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/filelock-3.13.1-py3-none-any.whl (11 kB)
Collecting typing-extensions>=4.8.0 (from torch==2.5.0.dev20240620->torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/typing_extensions-4.8.0-py3-none-any.whl (31 kB)
Collecting sympy (from torch==2.5.0.dev20240620->torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting networkx (from torch==2.5.0.dev20240620->torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/networkx-3.2.1-py3-none-any.whl (1.6 MB)
Collecting jinja2 (from torch==2.5.0.dev20240620->torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/Jinja2-3.1.3-py3-none-any.whl (133 kB)
Collecting fsspec (from torch==2.5.0.dev20240620->torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/fsspec-2024.2.0-py3-none-any.whl (170 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch==2.5.0.dev20240620->torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (28 kB)
Collecting mpmath>=0.19 (from sympy->torch==2.5.0.dev20240620->torchao-nightly)
  Using cached https://download.pytorch.org/whl/nightly/mpmath-1.2.1-py3-none-any.whl (532 kB)
Installing collected packages: mpmath, typing-extensions, sympy, networkx, MarkupSafe, fsspec, filelock, jinja2, torch, torchao-nightly
  Attempting uninstall: mpmath
    Found existing installation: mpmath 1.3.0
    Uninstalling mpmath-1.3.0:
      Successfully uninstalled mpmath-1.3.0
  Attempting uninstall: typing-extensions
    Found existing installation: typing_extensions 4.11.0
    Uninstalling typing_extensions-4.11.0:
      Successfully uninstalled typing_extensions-4.11.0
  Attempting uninstall: sympy
    Found existing installation: sympy 1.12
    Uninstalling sympy-1.12:
      Successfully uninstalled sympy-1.12
  Attempting uninstall: networkx
    Found existing installation: networkx 3.2.1
    Uninstalling networkx-3.2.1:
      Successfully uninstalled networkx-3.2.1
  Attempting uninstall: MarkupSafe
    Found existing installation: MarkupSafe 2.1.3
    Uninstalling MarkupSafe-2.1.3:
      Successfully uninstalled MarkupSafe-2.1.3
  Attempting uninstall: fsspec
    Found existing installation: fsspec 2024.2.0
    Uninstalling fsspec-2024.2.0:
      Successfully uninstalled fsspec-2024.2.0
  Attempting uninstall: filelock
    Found existing installation: filelock 3.13.1
    Uninstalling filelock-3.13.1:
      Successfully uninstalled filelock-3.13.1
  Attempting uninstall: jinja2
    Found existing installation: Jinja2 3.1.4
    Uninstalling Jinja2-3.1.4:
      Successfully uninstalled Jinja2-3.1.4
  Attempting uninstall: torch
    Found existing installation: torch 2.5.0.dev20240620+cpu
    Uninstalling torch-2.5.0.dev20240620+cpu:
      Successfully uninstalled torch-2.5.0.dev20240620+cpu
  Attempting uninstall: torchao-nightly
    Found existing installation: torchao-nightly 2024.6.21+cpu
    Uninstalling torchao-nightly-2024.6.21+cpu:
      Successfully uninstalled torchao-nightly-2024.6.21+cpu
Successfully installed MarkupSafe-2.1.5 filelock-3.13.1 fsspec-2024.2.0 jinja2-3.1.3 mpmath-1.2.1 networkx-3.2.1 sympy-1.12 torch-2.5.0.dev20240620+cpu torchao-nightly-2024.6.21+cpu typing-extensions-4.8.0

drisspg commented 1 week ago

Yeah, I have only seen this on Colab.