awslabs / llm-hosting-container

Large Language Model Hosting Container
Apache License 2.0

Not able to get the files copied inside Dockerfile #56

Open byash11 opened 8 months ago

byash11 commented 8 months ago

While building the image, the build fails due to the unavailability of several files in the repo, such as proto, benchmark, Makefile-flash-att-v2, Makefile, and Makefile-eetq. Please let me know if there is a URL or a repo I can refer to in order to get these files copied inside the Dockerfile.

Also, if there is an alternative solution to resolve this issue, please let me know.

The error I am getting is: "COPY failed: file not found in build context or excluded by .dockerignore: stat proto: file does not exist"

forresty commented 7 months ago

I think you will need to copy the huggingface folder into your copy of text-generation-inference and build from there.
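That workaround can be sketched roughly as follows; the tag and Dockerfile path are taken from the build log in this thread, while the directory layout and clone location are assumptions:

```shell
# Sketch of the suggested workaround (layout is an assumption based on this thread).
# 1. Clone upstream TGI at the tag the Dockerfile targets
git clone --branch v1.4.0 https://github.com/huggingface/text-generation-inference.git
# 2. Copy this repo's huggingface/ folder into it so the Dockerfile path resolves
cp -r llm-hosting-container/huggingface text-generation-inference/
# 3. Build from the TGI root so proto/, benchmark/, router/, launcher/
#    are present in the build context
cd text-generation-inference
docker build -f huggingface/pytorch/tgi/docker/1.4.0/Dockerfile .
```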

It builds now, but I can't find the missing recipe.json anywhere:

UPDATE: per https://github.com/LukeMathWalker/cargo-chef it seems cargo chef prepare --recipe-path recipe.json generates the recipe.json
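For context, recipe.json is not checked into the repo: cargo-chef generates it inside the planner stage from the workspace manifests. A sketch of that stage pattern follows; the stage name and copied paths mirror the build log in this thread, but the exact upstream contents are assumptions:

```dockerfile
# Hypothetical planner stage following the cargo-chef pattern
FROM lukemathwalker/cargo-chef:latest-rust-1.71 AS planner
WORKDIR /usr/src
COPY Cargo.toml Cargo.lock rust-toolchain.toml ./
COPY proto proto
COPY benchmark benchmark
COPY router router
COPY launcher launcher
# Generates recipe.json from the manifests above; no file needs to pre-exist
RUN cargo chef prepare --recipe-path recipe.json
```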

~/source/text-generation-inference tags/v1.4.0* 31s
base ❯ docker build -f huggingface/pytorch/tgi/docker/1.4.0/Dockerfile .
[+] Building 1.3s (24/78)                                                                                      docker:orbstack
 => [internal] load build definition from Dockerfile                                                                      0.0s
 => => transferring dockerfile: 8.98kB                                                                                    0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:12.1.0-base-ubuntu20.04                                            1.0s
 => [internal] load metadata for docker.io/lukemathwalker/cargo-chef:latest-rust-1.71                                     0.9s
 => [internal] load metadata for docker.io/nvidia/cuda:12.1.0-devel-ubuntu20.04                                           1.0s
 => [internal] load .dockerignore                                                                                         0.0s
 => => transferring context: 94B                                                                                          0.0s
 => [base  1/23] FROM docker.io/nvidia/cuda:12.1.0-base-ubuntu20.04@sha256:191e7e562d485c2adaf6fecab0a65af69d31d1858f97b  0.0s
 => [internal] load build context                                                                                         0.0s
 => => transferring context: 12.93kB                                                                                      0.0s
 => [pytorch-install 1/5] FROM docker.io/nvidia/cuda:12.1.0-devel-ubuntu20.04@sha256:0678daf5d9f3800c714df6c3983d3106515  0.0s
 => [chef 1/2] FROM docker.io/lukemathwalker/cargo-chef:latest-rust-1.71@sha256:1d0c049f0f6007531a1fca3f3d81ee571fc46135  0.0s
 => CACHED [base  2/23] WORKDIR /usr/src                                                                                  0.0s
 => CACHED [base  3/23] RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends   0.0s
 => CACHED [pytorch-install 2/5] RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-re  0.0s
 => CACHED [pytorch-install 3/5] RUN case linux/arm64 in     "linux/arm64")  MAMBA_ARCH=aarch64  ;;     *)                0.0s
 => CACHED [pytorch-install 4/5] RUN chmod +x ~/mambaforge.sh &&     bash ~/mambaforge.sh -b -p /opt/conda &&     rm ~/m  0.0s
 => ERROR [pytorch-install 5/5] RUN case linux/arm64 in     "linux/arm64")  exit 1 ;;     *)              /opt/conda/bin  0.1s
 => CACHED [chef 2/2] WORKDIR /usr/src                                                                                    0.0s
 => CACHED [planner 1/8] COPY Cargo.toml Cargo.toml                                                                       0.0s
 => CACHED [planner 2/8] COPY Cargo.lock Cargo.lock                                                                       0.0s
 => CACHED [planner 3/8] COPY rust-toolchain.toml rust-toolchain.toml                                                     0.0s
 => CACHED [planner 4/8] COPY proto proto                                                                                 0.0s
 => CACHED [planner 5/8] COPY benchmark benchmark                                                                         0.0s
 => CACHED [planner 6/8] COPY router router                                                                               0.0s
 => CACHED [planner 7/8] COPY launcher launcher                                                                           0.0s
 => CANCELED [planner 8/8] RUN cargo chef prepare --recipe-path recipe.json                                               0.2s
------                                                                                                                         
 > [pytorch-install 5/5] RUN case linux/arm64 in     "linux/arm64")  exit 1 ;;     *)              /opt/conda/bin/conda update -y conda &&      /opt/conda/bin/conda install -c "pytorch" -c "nvidia" -y "python=3.10" "pytorch=2.1.1" "pytorch-cuda=$(echo 12.1 | cut -d'.' -f 1-2)"  ;;     esac &&     /opt/conda/bin/conda clean -ya:
------
Dockerfile:78
--------------------
  77 |     # On arm64 we exit with an error code
  78 | >>> RUN case ${TARGETPLATFORM} in \
  79 | >>>     "linux/arm64")  exit 1 ;; \
  80 | >>>     *)              /opt/conda/bin/conda update -y conda &&  \
  81 | >>>     /opt/conda/bin/conda install -c "${INSTALL_CHANNEL}" -c "${CUDA_CHANNEL}" -y "python=${PYTHON_VERSION}" "pytorch=$PYTORCH_VERSION" "pytorch-cuda=$(echo $CUDA_VERSION | cut -d'.' -f 1-2)"  ;; \
  82 | >>>     esac && \
  83 | >>>     /opt/conda/bin/conda clean -ya
  84 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c case ${TARGETPLATFORM} in     \"linux/arm64\")  exit 1 ;;     *)              /opt/conda/bin/conda update -y conda &&      /opt/conda/bin/conda install -c \"${INSTALL_CHANNEL}\" -c \"${CUDA_CHANNEL}\" -y \"python=${PYTHON_VERSION}\" \"pytorch=$PYTORCH_VERSION\" \"pytorch-cuda=$(echo $CUDA_VERSION | cut -d'.' -f 1-2)\"  ;;     esac &&     /opt/conda/bin/conda clean -ya" did not complete successfully: exit code: 1
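The failure above is the Dockerfile working as designed: TARGETPLATFORM resolved to linux/arm64 (the log shows an OrbStack host, so likely Apple Silicon), and the Dockerfile deliberately exits with an error on arm64. A minimal sketch of that control flow is below; forcing an amd64 build is a hedged workaround suggestion, not something confirmed in this thread, and emulated builds of CUDA images can be very slow:

```shell
# Minimal reproduction of the failing step's control flow: the Dockerfile
# intentionally fails on arm64. Platform strings mirror the log above.
check_platform() {
    case "$1" in
        "linux/arm64")  echo "arm64 is unsupported" >&2; return 1 ;;
        *)              echo "proceeding with conda install" ;;
    esac
}

check_platform linux/amd64    # succeeds
check_platform linux/arm64 || echo "build would fail here with exit code 1"

# Possible workaround on an Apple Silicon host (assumption, slow under emulation):
#   docker build --platform linux/amd64 -f huggingface/pytorch/tgi/docker/1.4.0/Dockerfile .
```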
byash11 commented 7 months ago

Hi,

Alright, I will proceed with the method you suggested.

Thank you.

forresty commented 7 months ago

Also, I think the Dockerfile is almost a direct copy from https://github.com/huggingface/text-generation-inference/blob/main/Dockerfile

byash11 commented 7 months ago

Thanks, I'll check it out.
