Atinoda / text-generation-webui-docker

Docker variants of oobabooga's text-generation-webui, including pre-built images.
GNU Affero General Public License v3.0

default text-generation-webui:default-rocm is broken #57

Open rshxyz opened 1 month ago

rshxyz commented 1 month ago

When I try to run the default-rocm Docker image, it fails with the error below. I believe I set up Docker Compose correctly.

text-generation-webui  | RuntimeError: Failed to import transformers.generation.utils because of the 
text-generation-webui  | following error (look up to see its traceback):
text-generation-webui  | cannot import name 'get_cuda_stream' from 'triton.runtime.jit' 
text-generation-webui  | (/venv/lib/python3.10/site-packages/triton/runtime/jit.py)
text-generation-webui exited with code 1
cat docker-compose.yml 
services:
  text-generation-webui-docker:
    image: atinoda/text-generation-webui:default-rocm # Specify variant as the :tag
    container_name: text-generation-webui
    environment:
      - EXTRA_LAUNCH_ARGS="--listen --verbose" # Custom launch args (e.g., --model MODEL_NAME)
#      - BUILD_EXTENSIONS_LIVE="coqui_tts whisper_stt" # Install named extensions during every container launch. THIS WILL SIGNIFICANTLY SLOW LAUNCH TIME AND IS NORMALLY NOT REQUIRED.
#      - OPENEDAI_EMBEDDING_MODEL=intfloat/e5-large-v2  # Specify custom model for embeddings
#      - OPENEDAI_EMBEDDING_DEVICE=cuda  # Specify processing device for embeddings
    ports:
      - 7860:7860  # Default web port
      - 5000:5000  # Default API port
#      - 5005:5005  # Default streaming port
    volumes:
      - ./config/cache:/root/.cache  # WARNING: Libraries may save large files here!
      - ./config/characters:/app/characters
      - ./config/instruction-templates:/app/instruction-templates
      - ./config/loras:/app/loras
      - ./config/models:/app/models  # WARNING - very large files!
      - ./config/presets:/app/presets
      - ./config/prompts:/app/prompts
      - ./config/training:/app/training
#      - ./config/extensions:/app/extensions  # Persist all extensions
#      - ./config/extensions/coqui_tts:/app/extensions/coqui_tts  # Persist a single extension
    logging:
      driver:  json-file
      options:
        max-file: "3"   # maximum number of rotated log files
        max-size: "10M"

    ### HARDWARE ACCELERATION: comment or uncomment according to your hardware! ###

    ### CPU only ###
    # Nothing required - comment out the other hardware sections.

    ### Nvidia (default) ###

    ### AMD ROCM or Intel Arc ###
    stdin_open: true
    group_add:
      - video
    tty: true
    ipc: host
    devices:
      - /dev/kfd
      - /dev/dri 
    cap_add: 
      - SYS_PTRACE
    security_opt:
      - seccomp=unconfined
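For context, the AMD section of the compose file above assumes the host actually exposes the ROCm devices and that your user can access them. A minimal host-side sanity check (standard Linux/ROCm tooling, not specific to this image) might look like:

# Confirm the kernel driver device nodes exist on the host
ls -l /dev/kfd /dev/dri
# Confirm your user is in the groups that own those nodes (typically video and render)
groups
# If the ROCm userspace tools are installed on the host, list the detected GPUs
rocminfo | grep -i 'marketing name'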
text-generation-webui  | === Running text-generation-webui variant: 'ROCM Extended' v1.14 ===
text-generation-webui  | === (This version is 4 commits behind origin main) ===
text-generation-webui  | === Image build date: 2024-08-20 21:23:03 ===
text-generation-webui  | amdgpu.ids: No such file or directory
text-generation-webui  | amdgpu.ids: No such file or directory
text-generation-webui  | ╭───────────────────── Traceback (most recent call last) ──────────────────────╮
text-generation-webui  | │ /venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:1603   │
text-generation-webui  | │ in _get_module                                                               │
text-generation-webui  | │                                                                              │
text-generation-webui  | │   1602         try:                                                          │
text-generation-webui  | │ ❱ 1603             return importlib.import_module("." + module_name, self.__ │
text-generation-webui  | │   1604         except Exception as e:                                        │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /usr/lib/python3.10/importlib/__init__.py:126 in import_module               │
text-generation-webui  | │                                                                              │
text-generation-webui  | │   125             level += 1                                                 │
text-generation-webui  | │ ❱ 126     return _bootstrap._gcd_import(name[level:], package, level)        │
text-generation-webui  | │   127                                                                        │
text-generation-webui  | │ in _gcd_import:1050                                                          │
text-generation-webui  | │ in _find_and_load:1027                                                       │
text-generation-webui  | │ in _find_and_load_unlocked:1006                                              │
text-generation-webui  | │                                                                              │
text-generation-webui  | │                           ... 17 frames hidden ...                           │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /venv/lib/python3.10/site-packages/torch/_inductor/codegen/cuda/gemm_templat │
text-generation-webui  | │ e.py:11 in <module>                                                          │
text-generation-webui  | │                                                                              │
text-generation-webui  | │    10 from . import cutlass_utils                                            │
text-generation-webui  | │ ❱  11 from .cuda_kernel import CUDATemplateKernel                            │
text-generation-webui  | │    12 from .cuda_template import CUTLASSTemplate                             │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /venv/lib/python3.10/site-packages/torch/_inductor/codegen/cuda/cuda_kernel. │
text-generation-webui  | │ py:7 in <module>                                                             │
text-generation-webui  | │                                                                              │
text-generation-webui  | │     6 from ...ir import Buffer, CUDATemplateBuffer, IRNode, Layout, TensorBo │
text-generation-webui  | │ ❱   7 from ...select_algorithm import ChoiceCaller                           │
text-generation-webui  | │     8 from ...utils import sympy_product                                     │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /venv/lib/python3.10/site-packages/torch/_inductor/select_algorithm.py:24 in │
text-generation-webui  | │ <module>                                                                     │
text-generation-webui  | │                                                                              │
text-generation-webui  | │     23 from .codegen.common import ChoiceCaller, IndentedBuffer, KernelTempl │
text-generation-webui  | │ ❱   24 from .codegen.triton import texpr, TritonKernel, TritonPrinter, Trito │
text-generation-webui  | │     25 from .codegen.triton_utils import config_of, signature_to_meta        │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /venv/lib/python3.10/site-packages/torch/_inductor/codegen/triton.py:31 in   │
text-generation-webui  | │ <module>                                                                     │
text-generation-webui  | │                                                                              │
text-generation-webui  | │     30 from ..scheduler import BaseScheduling, WhyNoFuse                     │
text-generation-webui  | │ ❱   31 from ..triton_heuristics import AutotuneHint                          │
text-generation-webui  | │     32 from ..utils import (                                                 │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /venv/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py:54   │
text-generation-webui  | │ in <module>                                                                  │
text-generation-webui  | │                                                                              │
text-generation-webui  | │     53 if has_triton():                                                      │
text-generation-webui  | │ ❱   54     from triton.runtime.jit import get_cuda_stream                    │
text-generation-webui  | │     55 else:                                                                 │
text-generation-webui  | ╰──────────────────────────────────────────────────────────────────────────────╯
text-generation-webui  | ImportError: cannot import name 'get_cuda_stream' from 'triton.runtime.jit' 
text-generation-webui  | (/venv/lib/python3.10/site-packages/triton/runtime/jit.py)
text-generation-webui  | 
text-generation-webui  | The above exception was the direct cause of the following exception:
text-generation-webui  | 
text-generation-webui  | ╭───────────────────── Traceback (most recent call last) ──────────────────────╮
text-generation-webui  | │ /app/server.py:40 in <module>                                                │
text-generation-webui  | │                                                                              │
text-generation-webui  | │    39 import modules.extensions as extensions_module                         │
text-generation-webui  | │ ❱  40 from modules import (                                                  │
text-generation-webui  | │    41     chat,                                                              │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /app/modules/chat.py:26 in <module>                                          │
text-generation-webui  | │                                                                              │
text-generation-webui  | │     25 from modules.logging_colors import logger                             │
text-generation-webui  | │ ❱   26 from modules.text_generation import (                                 │
text-generation-webui  | │     27     generate_reply,                                                   │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /app/modules/text_generation.py:19 in <module>                               │
text-generation-webui  | │                                                                              │
text-generation-webui  | │    18 import modules.shared as shared                                        │
text-generation-webui  | │ ❱  19 from modules import models                                             │
text-generation-webui  | │    20 from modules.cache_utils import process_llamacpp_cache                 │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /app/modules/models.py:59 in <module>                                        │
text-generation-webui  | │                                                                              │
text-generation-webui  | │    58                                                                        │
text-generation-webui  | │ ❱  59 sampler_hijack.hijack_samplers()                                       │
text-generation-webui  | │    60                                                                        │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /app/modules/sampler_hijack.py:554 in hijack_samplers                        │
text-generation-webui  | │                                                                              │
text-generation-webui  | │   553 def hijack_samplers():                                                 │
text-generation-webui  | │ ❱ 554     transformers.GenerationMixin._get_logits_warper_old = transformers │
text-generation-webui  | │   555     transformers.GenerationMixin._get_logits_warper = get_logits_warpe │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:1594   │
text-generation-webui  | │ in __getattr__                                                               │
text-generation-webui  | │                                                                              │
text-generation-webui  | │   1593             module = self._get_module(self._class_to_module[name])    │
text-generation-webui  | │ ❱ 1594             value = getattr(module, name)                             │
text-generation-webui  | │   1595         else:                                                         │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:1593   │
text-generation-webui  | │ in __getattr__                                                               │
text-generation-webui  | │                                                                              │
text-generation-webui  | │   1592         elif name in self._class_to_module.keys():                    │
text-generation-webui  | │ ❱ 1593             module = self._get_module(self._class_to_module[name])    │
text-generation-webui  | │   1594             value = getattr(module, name)                             │
text-generation-webui  | │                                                                              │
text-generation-webui  | │ /venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:1605   │
text-generation-webui  | │ in _get_module                                                               │
text-generation-webui  | │                                                                              │
text-generation-webui  | │   1604         except Exception as e:                                        │
text-generation-webui  | │ ❱ 1605             raise RuntimeError(                                       │
text-generation-webui  | │   1606                 f"Failed to import {self.__name__}.{module_name} beca │
text-generation-webui  | ╰──────────────────────────────────────────────────────────────────────────────╯
text-generation-webui  | RuntimeError: Failed to import transformers.generation.utils because of the 
text-generation-webui  | following error (look up to see its traceback):
text-generation-webui  | cannot import name 'get_cuda_stream' from 'triton.runtime.jit' 
text-generation-webui  | (/venv/lib/python3.10/site-packages/triton/runtime/jit.py)
text-generation-webui exited with code 1
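The failing import suggests a mismatch inside the image: torch's _inductor code expects an older Triton API (get_cuda_stream in triton.runtime.jit) than the Triton package that is actually installed. A quick way to confirm the bundled versions, assuming the interpreter path /venv/bin/python shown in the traceback, would be:

# Print the torch and triton versions shipped in the ROCm image (interpreter path assumed from the traceback)
docker run --rm --entrypoint /venv/bin/python atinoda/text-generation-webui:default-rocm \
  -c "import torch, triton; print('torch', torch.__version__); print('triton', triton.__version__)"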
Atinoda commented 1 month ago

Unfortunately I cannot do any testing here because I don't have a ROCm-capable card. The docker-compose looks correct. Please could you answer the following questions:

Are you running Linux or Windows?
What GPU are you using?
Has this container ever worked for you with your GPU?
Have you tried previous versions of the image?

Any and all ROCM users are invited to comment if they have insights to share!

rshxyz commented 1 month ago

Are you running Linux or Windows? Linux. Operating System: Rocky Linux 9.4 (Blue Onyx), Kernel: Linux 5.14.0-427.37.1.el9_4.x86_64.
What GPU are you using? Radeon™ RX 7800 XT.
Has this container ever worked for you with your GPU? No, first time trying it.
Have you tried previous versions of the image? No, I have not.

I ended up using the Ollama Docker image, which worked out of the box. You might be able to figure out how to do the same, but I'm not sure: https://github.com/ollama/ollama/blob/main/docs/docker.md and https://github.com/ollama/ollama/blob/main/Dockerfile
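For reference, the AMD/ROCm run command in the linked Ollama Docker docs is roughly the following (check the linked page for the current form; the tag and ports are theirs, not this project's):

docker run -d --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm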

Atinoda commented 1 month ago

Thank you for the info and the links to the ollama Dockerfile. I see that they're using the rocm/* base image, which I may give a shot. Unfortunately textgen is put together a bit differently from ollama, so the runtime image would have to be ROCm-based and would end up even heavier!
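For a rough sense of the size concern (not a measurement of this repo's images): the official ROCm base images are already multi-gigabyte pulls before any application layers go on top, e.g.:

# Pull a ROCm development base and check its size; the repository exists on Docker Hub,
# but pick a current tag from https://hub.docker.com/r/rocm/dev-ubuntu-22.04
docker pull rocm/dev-ubuntu-22.04
docker image ls rocm/dev-ubuntu-22.04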

If you're still interested, you could try an earlier known-working version of this image - the last report I know of is for default-rocm-snapshot-2024-02-18. However, I expect some versions after that still work.
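A minimal way to test that suggestion (tag name taken from the comment above) is to pull the snapshot and point the compose file's image: line at it:

# Pull the last known-working ROCm snapshot
docker pull atinoda/text-generation-webui:default-rocm-snapshot-2024-02-18
# Then set image: atinoda/text-generation-webui:default-rocm-snapshot-2024-02-18 in docker-compose.yml and restart
docker compose up -d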