idea: Options to select CPU and GPU binaries for llama-cpp engine

janhq / cortex.cpp

Run and customize Local LLMs.

Apache License 2.0


We only install one engine variant per command/request. Option 2 is straightforward to support by adding a --cpu flag. Option 3 is more complicated.

CLI: we need to support a flag for each hardware option: --cpu, --gpu, --avx, ... The priority stays the same as it is now:
```
GPU > CPU; AVX512 > AVX2 > AVX > NOAVX
```
For example, to install the CPU AVX2 variant, we run `cortex engines install llama-cpp --cpu --avx2`. By contrast, `cortex engines install llama-cpp --avx2` does not name a device, so cortex itself selects the GPU or CPU variant.
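The flag handling described above could be sketched roughly as follows. This is only an illustration of the stated priority order, not cortex's actual implementation: `select_variant`, its parameters, and the way hardware support is passed in as sets are all hypothetical names chosen for this example.

```python
from typing import Optional, Set, Tuple

# Priority orders taken from the issue text, highest priority first.
DEVICE_PRIORITY = ["gpu", "cpu"]
ISA_PRIORITY = ["avx512", "avx2", "avx", "noavx"]


def select_variant(
    supported_devices: Set[str],
    supported_isas: Set[str],
    device: Optional[str] = None,
    isa: Optional[str] = None,
) -> Tuple[str, str]:
    """Pick a (device, isa) pair (hypothetical helper).

    Explicit flags are honored if the host supports them; anything the
    user left out is filled in by walking the priority lists.
    """
    if device is not None and device not in supported_devices:
        raise ValueError(f"requested device {device!r} is not supported on this host")
    if isa is not None and isa not in supported_isas:
        raise ValueError(f"requested instruction set {isa!r} is not supported on this host")

    # Fall back to the highest-priority supported option when no flag was given.
    chosen_device = device or next(
        (d for d in DEVICE_PRIORITY if d in supported_devices), None
    )
    chosen_isa = isa or next(
        (i for i in ISA_PRIORITY if i in supported_isas), None
    )
    if chosen_device is None or chosen_isa is None:
        raise RuntimeError("no installable variant found for this host")
    return chosen_device, chosen_isa


if __name__ == "__main__":
    host_devices = {"gpu", "cpu"}
    host_isas = {"avx2", "avx", "noavx"}
    # Mirrors: cortex engines install llama-cpp --cpu --avx2
    print(select_variant(host_devices, host_isas, device="cpu", isa="avx2"))
    # Mirrors: cortex engines install llama-cpp --avx2 (device chosen by priority)
    print(select_variant(host_devices, host_isas, isa="avx2"))
```

Rejecting an unsupported explicit flag instead of silently falling back is a design choice made for this sketch; the issue does not say which behavior is intended.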

API: example request body

{
  "cpu": ["noavx", "avx", "avx2", ...], // empty means cortex auto select
  "gpu": {
    "cuda": ["noavx", "avx", "avx2", "avx512"],
    "vulkan": [],
    ...
  }
}
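As a rough illustration of how a server might read a request of this shape, here is a minimal sketch. It assumes a strict JSON body (no comments or `...` placeholders), treats any GPU backend name as acceptable since the list in the example is elided, and applies the "empty means auto select" rule to GPU backends as well as to `cpu`, which the issue does not state explicitly. `normalize_request` and the `"auto"` marker are hypothetical names for this example only.

```python
import json
from typing import Any, Dict, List, Union

# Instruction set names that appear in the example request.
KNOWN_ISAS = {"noavx", "avx", "avx2", "avx512"}
AUTO = "auto"  # hypothetical marker meaning "let cortex choose"


def _normalize_isas(value: Any, where: str) -> Union[List[str], str]:
    """Validate one list of instruction sets; an empty list becomes AUTO."""
    if not isinstance(value, list):
        raise ValueError(f"{where} must be a list")
    unknown = [v for v in value if v not in KNOWN_ISAS]
    if unknown:
        raise ValueError(f"{where} contains unknown instruction sets: {unknown}")
    return list(value) if value else AUTO


def normalize_request(body: str) -> Dict[str, Any]:
    """Parse a request body and normalize its "cpu" and "gpu" sections."""
    data = json.loads(body)
    result: Dict[str, Any] = {}
    if "cpu" in data:
        result["cpu"] = _normalize_isas(data["cpu"], "cpu")
    if "gpu" in data:
        gpu = data["gpu"]
        if not isinstance(gpu, dict):
            raise ValueError("gpu must be an object keyed by backend name")
        # Backend names are passed through unchecked (assumption).
        result["gpu"] = {
            backend: _normalize_isas(isas, f"gpu.{backend}")
            for backend, isas in gpu.items()
        }
    return result


if __name__ == "__main__":
    example = '{"cpu": ["noavx", "avx2"], "gpu": {"cuda": ["avx2"], "vulkan": []}}'
    print(normalize_request(example))
```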

Option 3 can be built by extending the option 2 implementation. cc: @dan-homebrew @namchuai @louis-jan @hiento09


idea: Options to select CPU and GPU binaries for llama-cpp engine #1390

Problem Statement

Feature Idea