h2oai / h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
http://h2o.ai
Apache License 2.0

H2O-GPT on AMD GPUs (ROCm) #1812

Open rohitnanda1443 opened 2 months ago

rohitnanda1443 commented 2 months ago

Hi, how can we run H2O-GPT on AMD GPUs using the AMD ROCm libraries?

One can easily run an inference server with Ollama using ROCm, so H2O-GPT only needs to talk to that Ollama server for inference.

Problem: the H2O-GPT install fails because it keeps looking for CUDA during install. Some guidance on editing the install script for ROCm would be helpful.

Method:

1. The LLM runs on an inference server using ROCm.
2. H2O-GPT sends LLM requests to that inference server.
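A minimal sketch of that split, assuming the ROCm build of Ollama is already installed and that h2oGPT's `--inference_server` option is used to reach Ollama's OpenAI-compatible endpoint. The exact flag syntax below is an assumption; see the repo's `docs/README_InferenceServers.md` for the supported forms.

```bash
# Sketch only: run the model on the AMD GPU via Ollama/ROCm, keep h2oGPT CUDA-free.
# 1) Serve a model with Ollama (ROCm build); "llama3" is just an example model name.
ollama serve &
ollama pull llama3

# 2) Point h2oGPT at Ollama's OpenAI-compatible endpoint (port 11434, /v1 path).
#    The --inference_server value is an assumed form; adjust per the h2oGPT docs.
python generate.py \
    --base_model=llama3 \
    --inference_server=vllm_chat:http://localhost:11434/v1
```

With this split, only the Ollama side touches ROCm; h2oGPT itself is just an HTTP client plus the document pipeline.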

pseudotensor commented 2 months ago

Can you share what you mean by it finding CUDA during install and failing? Maybe logs, etc.?

CUDA is only mentioned in one block of docs/linux_install.sh; that block would need adjusting.
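For reference, the usual way to steer such a block away from CUDA wheels is to point pip at PyTorch's ROCm wheel index instead of the CUDA one. A hedged sketch; the exact line in docs/linux_install.sh may look different, and the ROCm version tag should match your system:

```bash
# What the stock script effectively installs (CUDA 12.1 wheels):
#   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# ROCm alternative, using PyTorch's official ROCm wheel index:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.1
```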

rohitnanda1443 commented 2 months ago

It should not be uninstalling the ROCm build of torch.

```
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
/tmp/pip-install-jav98t1i/flash-attn_c0c8ed92b3c147bfa04d7e6ab7c98f49/setup.py:95: UserWarning: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
  warnings.warn(
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "<pip-setuptools-caller>", line 34, in <module>
  File "/tmp/pip-install-jav98t1i/flash-attn_c0c8ed92b3c147bfa04d7e6ab7c98f49/setup.py", line 179, in <module>
    CUDAExtension(
  File "/home/rohit/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1074, in CUDAExtension
    library_dirs += library_paths(cuda=True)
  File "/home/rohit/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1201, in library_paths
    if (not os.path.exists(_join_cuda_home(lib_dir)) and
  File "/home/rohit/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2407, in _join_cuda_home
    raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

  torch.__version__  = 2.2.1+cu121

  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Attempting uninstall: torch
  Found existing installation: torch 2.5.0.dev20240822+rocm6.1
  Uninstalling torch-2.5.0.dev20240822+rocm6.1:
    Successfully uninstalled torch-2.5.0.dev20240822+rocm6.1
Attempting uninstall: sse_starlette
  Found existing installation: sse-starlette 0.10.3
  Uninstalling sse-starlette-0.10.3:
    Successfully uninstalled sse-starlette-0.10.3
Attempting uninstall: torchvision
  Found existing installation: torchvision 0.20.0.dev20240823+rocm6.1
  Uninstalling torchvision-0.20.0.dev20240823+rocm6.1:
    Successfully uninstalled torchvision-0.20.0.dev20240823+rocm6.1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tts 0.22.0 requires numpy==1.22.0; python_version <= "3.10", but you have numpy 1.26.4 which is incompatible.
tts 0.22.0 requires pandas<2.0,>=1.4, but you have pandas 2.2.2 which is incompatible.
awscli 1.34.5 requires docutils<0.17,>=0.10, but you have docutils 0.21.2 which is incompatible.
fiftyone 0.25.0 requires sse-starlette<1,>=0.10.3, but you have sse-starlette 2.1.3 which is incompatible.
torchaudio 2.4.0.dev20240823+rocm6.1 requires torch==2.5.0.dev20240822, but you have torch 2.2.1 which is incompatible.
vllm 0.5.5+rocm614 requires pydantic>=2.8, but you have pydantic 2.7.0 which is incompatible.
Successfully installed docutils-0.21.2 pandas-2.2.2 pydantic-2.7.0 pydantic-core-2.18.1 pypandoc_binary-1.13 sse_starlette-2.1.3 torch-2.2.1 torchvision-0.17.1
```
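If the installer has already swapped in the CUDA wheel as in the log above, one way to recover is to reinstall the ROCm nightly afterwards and leave flash-attn out, since its build requires nvcc/CUDA_HOME. A sketch, assuming the same nightly channel the log shows (versions will drift over time):

```bash
# Put the ROCm nightly torch stack back after the installer replaced it with 2.2.1+cu121.
pip install --pre --force-reinstall torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/nightly/rocm6.1

# flash-attn is a CUDA-only extra here (its build needs nvcc and CUDA_HOME),
# so skip it or comment it out in the install script when targeting ROCm.
```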

rohitnanda1443 commented 1 month ago

Do we have a ROCm Docker image?

pseudotensor commented 1 month ago

We don't build one, but you can build one.
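A rough sketch of what building one yourself could look like, using AMD's `rocm/pytorch` base image. `Dockerfile.rocm` is a file you would write, not something shipped in the repo, and the requirements step is simplified (CUDA-only extras such as flash-attn are left out):

```bash
# Hypothetical ROCm image build, run from a checkout of the h2ogpt repo.
cat > Dockerfile.rocm <<'EOF'
# AMD's official ROCm + PyTorch base image
FROM rocm/pytorch:latest
WORKDIR /workspace/h2ogpt
COPY . .
# Core requirements only; skip CUDA-specific optional extras (e.g. flash-attn).
# If requirements.txt pins a CUDA torch build, reinstall the ROCm wheels afterwards, e.g.
#   RUN pip install --force-reinstall torch --index-url https://download.pytorch.org/whl/rocm6.1
RUN pip install -r requirements.txt
EXPOSE 7860
ENTRYPOINT ["python", "generate.py"]
EOF

docker build -t h2ogpt:rocm -f Dockerfile.rocm .
# Expose the AMD GPU to the container via the ROCm device nodes:
#   docker run --device=/dev/kfd --device=/dev/dri -p 7860:7860 h2ogpt:rocm --base_model=<model>
```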