getumbrel / llama-gpt

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!
https://apps.umbrel.com/app/llama-gpt
MIT License
10.53k stars 666 forks

feat: add amd support #114

Open ParthJadhav opened 9 months ago

AnttiRae commented 8 months ago

Hey! I'm not sure if I did this correctly, but here's my output from running ./run.sh --model 7b --with-rocm. It seems that something went wrong, with Docker not detecting my GPU. Let me know if there's something more specific I should test. I'm running Fedora 38 with an AMD RX 7900 XTX as the GPU.

Edit: I tried with Windows WSL2 (Ubuntu) as well and got the same error.
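
For anyone hitting the same "GPU not detected" symptom, a quick host-side check can help narrow things down. The sketch below assumes the usual ROCm container setup, in which the host's /dev/kfd and /dev/dri device nodes are passed through to the container. Whether missing device nodes are actually the cause of the error reported above is not confirmed in this thread, and the helper name `missing_rocm_devices` is made up for illustration.

```python
import os

# Device nodes that a ROCm container normally needs passed through from the host.
# Assumption: this PR's ROCm image relies on the same nodes; the thread does not say.
ROCM_DEVICE_NODES = ("/dev/kfd", "/dev/dri")


def missing_rocm_devices(exists=os.path.exists):
    """Return the ROCm device nodes that are not present.

    `exists` is injectable so the check can be exercised without real hardware.
    """
    return [path for path in ROCM_DEVICE_NODES if not exists(path)]


if __name__ == "__main__":
    missing = missing_rocm_devices()
    if missing:
        print("Missing device nodes:", ", ".join(missing))
    else:
        print("All expected ROCm device nodes are present on this host.")
```

If the nodes are present on the host but the container still cannot see the GPU, the next place to look would be how the devices are mapped into the container, which this sketch does not inspect.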

cotsuka commented 7 months ago

Finally found time to test this PR. Looks like I'm running into DNS resolution issues hitting several of the repos. I ran the same command as @AnttiRae above. I'll attempt to retest later to see if that clears up.

cweiske commented 5 months ago

Works here.

Speed depends on the GPU; here my GPU is slower than the CPU.

CPU: AMD Ryzen 7 7700, 16 cores with 64GiB RAM
GPU: AMD Radeon RX 7600, 8GB

Sample request timings with 70b:

llama-gpt-api-rocm-ggml-1  | llama_print_timings:       total time = 225636.50 ms
llama-gpt-api-1            | llama_print_timings:       total time = 160374.53 ms
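
To put a number on the difference, the two log lines above can be parsed and compared. This is a minimal sketch: it assumes the `llama-gpt-api-rocm-ggml-1` line is the GPU (ROCm) run and the `llama-gpt-api-1` line is the CPU run, which is inferred from the container names and the comment above rather than stated explicitly. The helper `parse_total_times` is illustrative and not part of llama-gpt.

```python
import re

# Matches lines like:
#   <service>  | llama_print_timings:       total time = 225636.50 ms
TIMING_RE = re.compile(
    r"^(?P<service>\S+)\s*\|\s*llama_print_timings:\s*"
    r"total time\s*=\s*(?P<ms>[\d.]+)\s*ms"
)


def parse_total_times(log_text):
    """Map each service name to its reported total time in milliseconds."""
    times = {}
    for line in log_text.splitlines():
        match = TIMING_RE.match(line.strip())
        if match:
            times[match.group("service")] = float(match.group("ms"))
    return times


LOG = """\
llama-gpt-api-rocm-ggml-1  | llama_print_timings:       total time = 225636.50 ms
llama-gpt-api-1            | llama_print_timings:       total time = 160374.53 ms
"""

times = parse_total_times(LOG)
gpu_ms = times["llama-gpt-api-rocm-ggml-1"]  # assumed to be the ROCm (GPU) run
cpu_ms = times["llama-gpt-api-1"]            # assumed to be the CPU run
ratio = gpu_ms / cpu_ms

print(f"ROCm: {gpu_ms / 1000:.1f} s, CPU: {cpu_ms / 1000:.1f} s, ratio: {ratio:.2f}x")
# prints "ROCm: 225.6 s, CPU: 160.4 s, ratio: 1.41x"
```

On these two samples the ROCm run took roughly 41% longer than the CPU run. A single request per backend is a small sample, so the ratio is indicative at best.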