turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

GPU not supported on Nvidia Jetson AGX with JetPack 5.1 #301

Open davidtheITguy opened 5 months ago

davidtheITguy commented 5 months ago

Hello,

I'm wondering if anyone has been able to get exllamav2 to work with the Jetson AGX? The requirements install removes the Nvidia CUDA libs and installs a base torch-2.1.2. Unfortunately, that won't work with the Jetson AGX. Here is my run output:

```
$ python test_inference.py -m /ssd/llama-2/llama-2-7b-chat-hf -p "Once upon a time,"
Traceback (most recent call last):
  File "test_inference.py", line 2, in <module>
    from exllamav2 import (
  File "/ssd/exllamav2/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "/ssd/exllamav2/exllamav2/model.py", line 16, in <module>
    from exllamav2.config import ExLlamaV2Config
  File "/ssd/exllamav2/exllamav2/config.py", line 2, in <module>
    from exllamav2.fasttensors import STFile
  File "/ssd/exllamav2/exllamav2/fasttensors.py", line 5, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
  File "/ssd/exllamav2/exllamav2/ext.py", line 142, in <module>
    exllamav2_ext = load \
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1611, in _write_ninja_file_and_build_library
    _write_ninja_file_to_build_library(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 2007, in _write_ninja_file_to_build_library
    cuda_flags = common_cflags + COMMON_NVCC_FLAGS + _get_cuda_arch_flags()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1786, in _get_cuda_arch_flags
    raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
ValueError: Unknown CUDA arch (8.7+PTX) or GPU not supported
```

FWIW, here is the latest (as of this post) NVIDIA PyTorch build for the AGX: 2.0.0a0+8aa34602.nv23.03.

Hoping someone has a workaround. Thank you

turboderp commented 5 months ago

I think 8.7 was added to the Torch whitelist fairly late last year (https://github.com/NixOS/nixpkgs/pull/249250), so I'm not sure what the status is for Torch 2.1.2.

You could try exporting TORCH_CUDA_ARCH_LIST="8.7+PTX" to see if that makes a difference. Otherwise, Torch 2.2 was just released, so that might behave differently, though I haven't had a chance to test it yet.
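
As a minimal sketch of that workaround (reusing the invocation from your output above):

```sh
# Override Torch's CUDA arch auto-detection before it JIT-compiles the
# extension; 8.7 is the compute capability of the Jetson AGX Orin GPU.
export TORCH_CUDA_ARCH_LIST="8.7+PTX"
python test_inference.py -m /ssd/llama-2/llama-2-7b-chat-hf -p "Once upon a time,"
```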

Could you clarify what you mean by the requirements install removing the NVIDIA CUDA libs? It shouldn't affect those, and if you already have torch>=2.1.0 installed (which should match against 2.1.2+cuxxx too) in your (v)env, it shouldn't affect that install.

davidtheITguy commented 5 months ago

Hello,

First, thank you for the response and suggestions, I will certainly try the export you suggest.

Second, I apologize for not being more precise in my original explanation.

It appears that the pip installation command I am using is failing with a torch dependency issue on my Nvidia Jetson.

More detail:

  1. I'm using a conda environment specifically for exllamav2
  2. The `pip list` command shows the Nvidia JetPack version of torch installed prior to the exllamav2 install:

```
tokenizers         0.15.1
torch              2.0.0a0+8aa34602.nv23.3
tqdm               4.66.1
transformers       4.37.1
typing_extensions  4.9.0
tzdata             2023.4
urllib3            2.1.0
websockets         12.0
wheel              0.41.2
```

  3. Here is the installation command I am using, which seems to most closely match Nvidia JetPack 5.1 and the arch:

```sh
export TORCH_INSTALL=https://developer.download.nvidia.com/compute/redist/jp/v51/pytorch/torch-2.0.0a0+8aa34602.nv23.03-cp38-cp38-linux_aarch64.whl
python3 -m pip install --no-cache $TORCH_INSTALL
```

  4. Here is the installation output and dependency error:

```
$ python3 -m pip install --no-cache $TORCH_INSTALL
Collecting torch==2.0.0a0+8aa34602.nv23.03
  Downloading https://developer.download.nvidia.com/compute/redist/jp/v51/pytorch/torch-2.0.0a0+8aa34602.nv23.03-cp38-cp38-linux_aarch64.whl (167.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.2/167.2 MB 66.9 MB/s eta 0:00:00
Requirement already satisfied: filelock in /home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from torch==2.0.0a0+8aa34602.nv23.03) (3.13.1)
Requirement already satisfied: networkx in /home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from torch==2.0.0a0+8aa34602.nv23.03) (3.1)
Requirement already satisfied: sympy in /home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from torch==2.0.0a0+8aa34602.nv23.03) (1.12)
Requirement already satisfied: typing-extensions in /home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from torch==2.0.0a0+8aa34602.nv23.03) (4.9.0)
Requirement already satisfied: mpmath>=0.19 in /home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages (from sympy->torch==2.0.0a0+8aa34602.nv23.03) (1.3.0)
Installing collected packages: torch
  Attempting uninstall: torch
    Found existing installation: torch 2.1.2
    Uninstalling torch-2.1.2:
      Successfully uninstalled torch-2.1.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
exllamav2 0.0.12 requires torch>=2.0.1, but you have torch 2.0.0a0+8aa34602.nv23.3 which is incompatible.
Successfully installed torch-2.0.0a0+8aa34602.nv23.3
```

Brief comments: First, I never knowingly installed torch 2.1.2 in this environment, so I'm not sure where that existing installation came from. But it does look like NVIDIA's latest build falls shy of the torch>=2.0.1 that exllamav2 requires?
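
For what it's worth, the conflict itself is plain version ordering: 2.0.0a0 is a pre-release of 2.0.0, and pre-releases sort below the torch>=2.0.1 floor regardless of the NVIDIA build tag. A quick way to confirm this, assuming the `packaging` module is importable in the env:

```sh
# Compare the Jetson wheel's version against exllamav2's minimum;
# the local "+8aa34602.nv23.3" tag doesn't affect the ordering.
python3 -c "from packaging.version import Version; print(Version('2.0.0a0+8aa34602.nv23.3') < Version('2.0.1'))"
# prints: True
```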

Thank you for any commentary on this.


turboderp commented 5 months ago

Oh. If that's a special version of Torch you have to use with the Jetson, you'll probably want to remove it from the requirements (or just install ninja, sentencepiece, safetensors etc. manually, it's not that many packages). Otherwise it will try to "upgrade" you to >= 2.1.0 which might default to the non-CUDA package.

I haven't tested on 2.0.0, and especially not that particular version of 2.0.0, but in theory it should still have all the features exllama would need.

So what I'd try is:

```sh
git clone https://github.com/turboderp/exllamav2
cd exllamav2
pip install pandas ninja fastparquet safetensors sentencepiece pygments websockets regex numpy
pip install .
```
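
If that last step still tries to replace the Jetson torch, one possible variant (untested here on a Jetson) is to skip dependency resolution entirely, since everything needed is already installed by that point:

```sh
# --no-deps keeps pip from trying to satisfy torch>=2.0.1 by pulling
# a generic wheel over the NVIDIA 2.0.0a0 build already in the env.
pip install . --no-deps
```
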
davidtheITguy commented 5 months ago

Appreciate the info. After removing the distro and the conda environment and starting from scratch (including manually reinstalling and testing the Jetson torch), I ran `pip install .` and received lots of diagnostics, with this as the primary output:

```
Building wheels for collected packages: exllamav2
  Building wheel for exllamav2 (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [385 lines of output]
      Version: 0.0.12
      warning: no previously-included files matching '*.pyc' found anywhere in distribution
      warning: no previously-included files matching 'dni*' found anywhere in distribution

      /home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/command/build_py.py:204: _Warning: Package 'exllamav2.exllamav2_ext' is absent from the packages configuration.
```

... much more output ... and then:

```
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "<pip-setuptools-caller>", line 34, in <module>
  File "/ssd/exllamav2/setup.py", line 76, in <module>
    setup(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
    return distutils.core.setup(**attrs)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 364, in run
    self.run_command("build")
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
    self.run_command(cmd_name)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 88, in run
    _build_ext.run(self)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
    self.build_extensions()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
    build_ext.build_extensions(self)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
    self._build_extensions_serial()
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 249, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
    objects = self.compiler.compile(
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 649, in unix_wrap_ninja_compile
    cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 548, in unix_cuda_flags
    cflags + _get_cuda_arch_flags(cflags))
  File "/home/david/miniconda3/envs/exllamav2/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1786, in _get_cuda_arch_flags
    raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
ValueError: Unknown CUDA arch (8.7+PTX) or GPU not supported
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for exllamav2
Running setup.py clean for exllamav2
Failed to build exllamav2
ERROR: Could not build wheels for exllamav2, which is required to install pyproject.toml-based projects
```
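
Since this is still the same arch check failing (`_get_cuda_arch_flags()`), one quick diagnostic before the next rebuild attempt is to confirm the reinstalled Jetson wheel actually advertises sm_87 support. This uses only standard torch APIs:

```sh
# Print the torch version and the CUDA arch list the wheel was built with;
# sm_87 should appear in the list on a JetPack build for the AGX Orin.
python3 -c "import torch; print(torch.__version__, torch.cuda.get_arch_list())"
```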
