jasonacox / TinyLLM

Setup and run a local LLM and Chatbot using consumer grade hardware.
MIT License

Ollama GPU support on Apple Silicon #9

Open · bscott opened this issue 2 weeks ago

bscott commented 2 weeks ago

When running Ollama via Docker on Apple Silicon as described in Option 1, the `--gpus=all` flag fails: Apple Silicon Macs do not have NVIDIA GPUs, and Docker Desktop has no access to Apple's own GPU, so users may receive the following error message:

```
docker: Error response from daemon: could not select device driver "" with capabilities: [[GPU]].
```
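For context, the failing invocation is along these lines (adapted from the standard Ollama Docker instructions; the exact flags in the README's Option 1 may differ):

```bash
# Fails on Apple Silicon: --gpus=all requests an NVIDIA device driver
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama
```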

I'd like to submit a PR adding the following guidance to the README:

**Apple Silicon GPU Support**:

Apple Silicon GPUs are programmed through Apple's Metal Performance Shaders API, which is not as widely supported as NVIDIA's CUDA API. Docker Desktop does not expose the Apple Silicon GPU to containers, so containerized workloads cannot use it.

**Docker Limitations**:
When running Ollama in Docker on an Apple Silicon Mac, no GPU is detected and inference falls back to the CPU. Docker's `--gpus` passthrough is built around NVIDIA's container toolkit and GPU libraries, which have no Apple Silicon equivalent. Dropping the flag at least lets the container run CPU-only, as sketched below.
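A minimal CPU-only invocation (assuming the standard `ollama/ollama` image and default port; adjust volume and container names to taste):

```bash
# CPU-only fallback inside Docker on Apple Silicon: simply omit --gpus=all
docker run -d -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama
```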

**Native Execution**:
Running Ollama natively on macOS, without Docker, can enable GPU acceleration. This approach leverages the Metal API directly, allowing better utilization of the Apple Silicon GPU.
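As a sketch, the native setup is only a couple of commands (assuming Homebrew; the macOS installer from ollama.com works just as well):

```bash
# Install and run Ollama natively so its Metal backend can use the GPU
brew install ollama
ollama serve &        # API listens on localhost:11434 by default
ollama run llama2     # pulls the model on first use, then chats
```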

**Model Size and Memory Constraints**:
Large models may not fit within the GPU memory available on Apple Silicon Macs, causing a fallback to the CPU. For efficient performance, choose models that fit within the memory accessible to the GPU (approximately 10.5GB on a 16GB RAM system).
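A quick sanity check of that budget (the roughly two-thirds figure is Metal's typical recommended GPU working-set cap on unified memory; treat it as a rule of thumb, not a guarantee):

```bash
# Total unified memory in bytes (a 16 GB machine reports 17179869184)
sysctl -n hw.memsize
# Metal typically caps a process's GPU working set at ~2/3 of unified
# memory (~10.5 GB on 16 GB), so pick a model whose weights fit under
# that ceiling, e.g. a 4-bit quantized 7B model (~4 GB).
```
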
jasonacox commented 2 weeks ago

Hey @bscott ! Great find! Yes, please do. It would be an honor to have your contribution. 🙏