This is the way I've been running it on NixOS using podman:
```
nix-shell -p podman fuse-overlayfs --run "podman run --rm -ti --device=/dev/kfd --device=/dev/dri -e DISPLAY=${DISPLAY} -v /tmp/.X11-unix/X0:/tmp/.X11-unix/X0 -v /home:/home -p \"8080:8080\" docker.io/rocm/pytorch bash ~/Downloads/Meta-Llama-3-70B-Instruct.Q8_0.llamafile -ngl 10 --host \"0.0.0.0\""
```
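As a quick sanity check that the GPU is actually visible inside the container, something like this should list the gfx ISA (assuming the rocm/pytorch image ships `rocminfo`, which ROCm images normally do):

```
nix-shell -p podman fuse-overlayfs --run \
  "podman run --rm --device=/dev/kfd --device=/dev/dri docker.io/rocm/pytorch rocminfo" | grep gfx
```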
For Llama 3 70B, each layer takes about 1 GB of VRAM, so a 6900 XT (16 GB) can only offload ~14 layers.
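For reference, a minimal `default.nix` along these lines could look like the sketch below. This is a hedged reconstruction, not necessarily the expression originally posted: the `writeShellScript` wrapper and the `rocmPackages` attribute names (`clr`, `rocblas`, `hipblas`) are my assumptions, and attribute names differ between nixpkgs releases.

```nix
# Sketch of a default.nix -- an assumption-laden reconstruction, not
# necessarily the author's original expression.
{ pkgs ? import <nixpkgs> { } }:

let
  # ROCm runtime libraries that the llamafile's GPU backend needs to
  # find at startup.  Names follow the rocmPackages set in recent
  # nixpkgs and may differ in older releases.
  rocmLibs = with pkgs.rocmPackages; [ clr rocblas hipblas ];
in
pkgs.writeShellScript "llamafile-rocm" ''
  # Expose the ROCm shared libraries to the otherwise self-contained
  # llamafile binary.
  export LD_LIBRARY_PATH=${pkgs.lib.makeLibraryPath rocmLibs}
  # Forward the .llamafile path and its flags (e.g. -ngl) unchanged;
  # assumes the .llamafile itself has been chmod +x'ed.
  exec "$@"
''
```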
This creates a script that allows running a `.llamafile` with ROCm acceleration under Nix. It can be used like:
```
nix-build && ./result ~/Downloads/wizardcoder-python-34b-v1.0.Q5_K_M.llamafile -ngl 9999
```
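(`-ngl 9999` simply requests more layers than the model has, i.e. offload all of them to the GPU.)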
Posting this here in the hope that it will help someone. Thank you for this project!