evilsocket / cake

Distributed LLM and StableDiffusion inference for mobile, desktop and server.

Dockerfile support #12

Open James4Ever0 opened 1 month ago

James4Ever0 commented 1 month ago

I have successfully compiled your project with Docker, and I am willing to share the setup with anyone struggling to do the same.

Since this software is in alpha, I advise the author to use this as a reference and build an official Docker image for the project, before moving on to static linking and an AppImage.

The filesystem structure is:

├── build.sh                  # build script
├── cake                      # cloned repository
├── cargo_config.toml         # cargo mirror config
├── Dockerfile_intermediate   # builds the intermediate image
└── run.sh                    # runs the final container

Content of build.sh:

#!/bin/bash

INTERMEDIATE_IMAGE_NAME=cake_llm_intermediate
IMAGE_NAME=cake_llm

INTERMEDIATE_CONTAINER_NAME=cake_container_intermediate
CONTAINER_NAME=cake_container

git clone https://github.com/evilsocket/cake

# Remove leftovers from previous runs; errors are harmless if nothing exists yet.
docker kill $CONTAINER_NAME
docker rm $CONTAINER_NAME
docker rmi $INTERMEDIATE_IMAGE_NAME

docker build -t $INTERMEDIATE_IMAGE_NAME -f Dockerfile_intermediate .

read -p "Do you want to continue? (y/n): " answer

case $answer in
    [Yy]* ) echo "You chose yes.";;
    [Nn]* ) echo "You chose no."; exit 1;;
    * ) echo "Please answer yes or no."; exit 1;;
esac

docker kill $INTERMEDIATE_CONTAINER_NAME
docker rm $INTERMEDIATE_CONTAINER_NAME

docker rmi $IMAGE_NAME

# Compile inside a GPU-enabled container, then snapshot it as the final image.
docker run -d --privileged --gpus 1 --name $INTERMEDIATE_CONTAINER_NAME $INTERMEDIATE_IMAGE_NAME tail -f /dev/null
docker exec -w /root/cake $INTERMEDIATE_CONTAINER_NAME cargo build
docker commit $INTERMEDIATE_CONTAINER_NAME $IMAGE_NAME

docker kill $INTERMEDIATE_CONTAINER_NAME
docker rm $INTERMEDIATE_CONTAINER_NAME

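One caveat with build.sh as written: it keeps going even if a critical step such as `docker build` fails. If fail-fast behavior is wanted, a `set -euo pipefail` preamble can be added, with the best-effort cleanup commands explicitly exempted — a suggested variant, not part of the original script:

```shell
#!/bin/bash
# Suggested fail-fast preamble for build.sh (my addition, not in the original):
# abort on the first real failure, but let the cleanup commands fail harmlessly.
set -euo pipefail

CONTAINER_NAME=cake_container

# These fail when no old container exists; "|| true" opts them out of set -e.
docker kill "$CONTAINER_NAME" 2>/dev/null || true
docker rm "$CONTAINER_NAME" 2>/dev/null || true

status=ok   # reached only if nothing above aborted the script
echo "preamble finished: $status"
```

With this in place, a failed `docker build` or `docker exec` stops the script immediately instead of committing a broken image.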
Content of Dockerfile_intermediate:

FROM nvidia/cuda:12.4.0-base-ubuntu22.04

# Keep downloaded .deb packages cached instead of auto-deleting them.
RUN rm /etc/apt/apt.conf.d/docker-clean
RUN apt update
RUN apt install -y build-essential curl

# CUDA toolchain pieces needed to compile cake's GPU backend.
RUN apt install -y cuda-nvcc-12-4 cuda-nvrtc-dev-12-4 libcublas-dev-12-4 libcurand-dev-12-4

RUN apt install -y cargo

COPY cake /root/cake

COPY cargo_config.toml /root/.cargo/config.toml

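The post does not include cargo_config.toml itself; since the tree labels it a "cargo mirror config", it presumably looks something like the following sketch — the mirror URL here is only an example (the TUNA mirror), so substitute whichever registry mirror works best for you:

```toml
# Hypothetical cargo_config.toml: redirect crates.io to a registry mirror.
[source.crates-io]
replace-with = "mirror"

[source.mirror]
registry = "sparse+https://mirrors.tuna.tsinghua.edu.cn/crates.io-index/"
```

If no mirror is needed, the file can simply be empty and the COPY line dropped.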
Content of run.sh:

#!/bin/bash

IMAGE_NAME=cake_llm
CONTAINER_NAME=cake_container

docker kill $CONTAINER_NAME
docker rm $CONTAINER_NAME

# Paths as seen inside the container, under the /root/data bind mount.
MODEL_PATH=/root/data/Meta-Llama-3-8B-Instruct
TOPOFILE=/root/data/topology.yaml

# Replace <source_path> with the host directory holding the model and topology.
docker run -it --rm \
    --mount type=bind,source=<source_path>,target=/root/data,ro \
    -e LD_LIBRARY_PATH=/usr/local/cuda-12.4/targets/x86_64-linux/lib/ \
    --name $CONTAINER_NAME --privileged --gpus 1 $IMAGE_NAME \
    /root/cake/target/debug/cake-cli --model $MODEL_PATH --topology $TOPOFILE
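Once static linking lands this will get simpler, but even today the intermediate-container-and-commit dance could probably be collapsed into a single multi-stage Dockerfile, assuming compilation only needs nvcc and not a live GPU. A hypothetical, untested sketch — the runtime (non `-dev`) package names and the `--release` output path are my guesses:

```dockerfile
# Hypothetical multi-stage alternative (untested): compile in a devel image,
# then ship only the binary plus runtime CUDA libraries.
FROM nvidia/cuda:12.4.0-devel-ubuntu22.04 AS builder
RUN apt update && apt install -y build-essential curl cargo
COPY cake /root/cake
COPY cargo_config.toml /root/.cargo/config.toml
WORKDIR /root/cake
RUN cargo build --release

FROM nvidia/cuda:12.4.0-base-ubuntu22.04
# Runtime library packages, assumed from the -dev packages used above.
RUN apt update && apt install -y cuda-nvrtc-12-4 libcublas-12-4 libcurand-12-4
COPY --from=builder /root/cake/target/release/cake-cli /usr/local/bin/cake-cli
ENTRYPOINT ["cake-cli"]
```

This would also shrink the final image, since build-essential, nvcc, and the source tree stay in the builder stage.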
evilsocket commented 1 month ago

this is great work! I'm wondering if we could do something about the other acceleration backends

phken91 commented 1 month ago

Another suggestion: since not every device supports CUDA, how about splitting the Dockerfile into several variants, e.g. x64, ARM, RDNA, and more?
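Rather than maintaining a full Dockerfile per architecture, one parameterized file could select the base image at build time. A hypothetical sketch (package availability per base image untested):

```dockerfile
# Hypothetical: choose the accelerator at build time instead of per-file.
ARG BASE_IMAGE=ubuntu:22.04   # CPU-only default; override for CUDA etc.
FROM ${BASE_IMAGE}
RUN apt update && apt install -y build-essential curl cargo
COPY cake /root/cake
WORKDIR /root/cake
RUN cargo build --release
```

Usage would then look like `docker build --build-arg BASE_IMAGE=nvidia/cuda:12.4.0-devel-ubuntu22.04 -t cake_llm_cuda .`, with other backends picking their own base.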