meta-llama / llama-stack

Composable building blocks to build Llama Apps
MIT License
4.57k stars 575 forks

Create the distribution of AMD ROCm GPU #341

Open alexhegit opened 2 weeks ago

alexhegit commented 2 weeks ago

🚀 The feature, motivation and pitch

Create a distribution for AMD ROCm GPUs, analogous to distributions/meta-reference-gpu, which is based on NVIDIA GPUs.

Alternatives

No response

Additional context

No response

alexhegit commented 2 weeks ago

Here is my compose.yaml for creating the AMD ROCm GPU distribution.

services:
  llamastack:
    image: rocm/pytorch 
    network_mode: "host"
    volumes:
      - ~/.llama:/root/.llama
      - ./run.yaml:/root/my-run.yaml
    ports:
      - "5000:5000"
    devices:
      - /dev/kfd
      - /dev/dri
    security_opt:
      - seccomp:unconfined
    group_add:
      - video
    environment:
      - HIP_VISIBLE_DEVICES=0
    command: []
    entrypoint: bash -c "python -m llama_stack.distribution.server.server --yaml_config /root/my-run.yaml"
    deploy:
      restart_policy:
        condition: on-failure
        delay: 3s
        max_attempts: 5
        window: 60s

I ran 'docker compose up' with it and it failed with the log below.

$ docker compose up
[+] Running 2/1
 ✔ Container meta-reference-gpu-rocm-llamastack-1                        Created                                                                                                           0.4s 
 ! llamastack Published ports are discarded when using host network mode                                                                                                                   0.0s 
Attaching to llamastack-1
llamastack-1  | /opt/conda/envs/py_3.10/bin/python: Error while finding module specification for 'llama_stack.distribution.server.server' (ModuleNotFoundError: No module named 'llama_stack')
llamastack-1 exited with code 0

Any ideas on how to fix it?

yanxi0830 commented 2 weeks ago

The image you are using is rocm/pytorch, which does not have the llama_stack package. You can check out some of the pre-built llamastack images in this folder: https://github.com/meta-llama/llama-stack/tree/main/distributions

Example llamastack/distribution-meta-reference-gpu compose file: https://github.com/meta-llama/llama-stack/blob/main/distributions/meta-reference-gpu/compose.yaml
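Editor's note: since the prebuilt images target NVIDIA GPUs, one possible workaround (not from this thread, and untested) is to install llama_stack directly inside a rocm/pytorch container and launch the server by hand. The pip package name llama-stack is an assumption; the device flags and mounts mirror the compose file earlier in the thread:

```shell
# Start an interactive ROCm container with the same device/group/security
# settings as the compose file above (these are required for ROCm GPU access)
docker run -it --network host \
  --device /dev/kfd --device /dev/dri \
  --security-opt seccomp=unconfined --group-add video \
  -v ~/.llama:/root/.llama -v ./run.yaml:/root/my-run.yaml \
  rocm/pytorch bash

# Inside the container: install the llama-stack package (name assumed),
# then launch the server exactly as the compose entrypoint did
pip install llama-stack
python -m llama_stack.distribution.server.server --yaml_config /root/my-run.yaml
```

If this works interactively, the same pip install can be baked into a derived image so the compose file runs unattended.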

alexhegit commented 1 week ago

> The image you are using is rocm/pytorch, which does not have the llama_stack package. You can check out some of the pre-built llamastack images in this folder: https://github.com/meta-llama/llama-stack/tree/main/distributions
>
> Example llamastack/distribution-meta-reference-gpu compose file: https://github.com/meta-llama/llama-stack/blob/main/distributions/meta-reference-gpu/compose.yaml

I know about these prebuilt llamastack images, but they support NVIDIA GPUs or x86 CPUs. What I am trying to do is use the rocm/pytorch Docker image to enable AMD ROCm GPUs with LlamaStack. Here is my patch creating the compose.yaml for ROCm in my fork repo: https://github.com/alexhegit/llama-stack-rocm/tree/rocm-dev/distributions/meta-reference-gpu-rocm

It seems that the prebuilt image "llamastack/distribution-meta-reference-gpu" has the llama_stack package installed on top of the CUDA libs? Are there any details on how this prebuilt image is created? I could follow the same approach to create a distribution-meta-reference-gpu-rocm image based on rocm/pytorch for AMD GPUs.
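Editor's note: a minimal sketch of how such a ROCm image might be assembled. This is hypothetical, not how the official llamastack/distribution-meta-reference-gpu image is actually built; the pip package name and the image tag are assumptions:

```dockerfile
# Hypothetical Dockerfile for a distribution-meta-reference-gpu-rocm image.
# Base: the ROCm PyTorch image, which already carries the HIP/ROCm runtime.
FROM rocm/pytorch:latest

# Install the Llama Stack server package (PyPI name assumed to be llama-stack)
RUN pip install llama-stack

# Serve the stack from a run config mounted at /root/my-run.yaml, matching
# the entrypoint used in the compose file earlier in this thread
ENTRYPOINT ["python", "-m", "llama_stack.distribution.server.server", \
            "--yaml_config", "/root/my-run.yaml"]
```

With an image like this, the compose file above could drop its entrypoint override and simply point image: at the new tag, keeping the /dev/kfd and /dev/dri device mappings unchanged.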