janhq / cortex.cpp

Run and customize Local LLMs.
https://cortex.so
Apache License 2.0

idea: Options to select CPU and GPU binaries for llama-cpp engine #1390

Open vansangpfiev opened 2 weeks ago

vansangpfiev commented 2 weeks ago

Problem Statement

cortex selects the best binary based on the user's hardware information. The priority is: GPU > CPU; AVX512 > AVX2 > AVX > NOAVX. Users cannot switch between binaries.
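To make the priority concrete, here is a minimal sketch of how such a selection could work. It is not cortex's actual code: the `Hardware` type, the variant name format (`gpu-avx2`, `cpu-noavx`), and the assumption that the GPU/CPU choice and the instruction-set tier are combined independently are all hypothetical.

```python
from dataclasses import dataclass

# Instruction-set tiers from the issue, best first.
CPU_TIERS = ["avx512", "avx2", "avx", "noavx"]


@dataclass(frozen=True)
class Hardware:
    """Hypothetical container for detected hardware information."""
    has_gpu: bool
    cpu_features: frozenset  # e.g. frozenset({"avx", "avx2"})


def best_cpu_tier(hw: Hardware) -> str:
    """Return the highest-priority tier the CPU supports, else 'noavx'."""
    for tier in CPU_TIERS:
        if tier in hw.cpu_features:
            return tier
    return "noavx"


def select_variant(hw: Hardware) -> str:
    """Apply GPU > CPU first, then AVX512 > AVX2 > AVX > NOAVX.

    The '<backend>-<tier>' naming is an assumption for illustration only.
    """
    backend = "gpu" if hw.has_gpu else "cpu"
    return f"{backend}-{best_cpu_tier(hw)}"
```

For example, a machine with a GPU and a CPU that reports AVX and AVX2 would resolve to `gpu-avx2` under these assumptions, and the user has no way to override that result.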

Feature Idea

@hiento09 suggests three options for engine installation:

Option 1: no flag; install the engine by the current priority: GPU > CPU; AVX512 > AVX2 > AVX > NOAVX.

Option 2: allow forcing CPU or GPU, for example by adding a --cpu or --gpu flag; cortex still detects the CPU instruction set.

Option 3: the user can select a specific binary to install.
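The three options could be resolved in a single function, sketched below. This is an illustration, not the proposed implementation: `resolve_variant`, its parameters, the variant name format, and the choice to raise an error when a GPU is forced on a machine without one are all assumptions. Only the --cpu and --gpu flag names come from the issue.

```python
from typing import Iterable, Optional

# Instruction-set tiers from the issue, best first.
CPU_TIERS = ["avx512", "avx2", "avx", "noavx"]


def resolve_variant(
    has_gpu: bool,
    cpu_features: Iterable[str],
    force_backend: Optional[str] = None,  # "cpu" or "gpu" (Option 2)
    variant: Optional[str] = None,        # explicit binary name (Option 3)
) -> str:
    """Pick an engine variant name (hypothetical '<backend>-<tier>' format)."""
    # Option 3: an explicitly requested binary bypasses detection entirely.
    if variant is not None:
        return variant

    # The instruction set is still auto-detected for Options 1 and 2.
    features = set(cpu_features)
    tier = next((t for t in CPU_TIERS if t in features), "noavx")

    if force_backend is None:
        # Option 1: current behaviour, GPU preferred when present.
        backend = "gpu" if has_gpu else "cpu"
    else:
        # Option 2: the user forces the backend.
        if force_backend not in ("cpu", "gpu"):
            raise ValueError(f"unknown backend: {force_backend!r}")
        if force_backend == "gpu" and not has_gpu:
            # Assumption: forcing GPU without a detected GPU is an error.
            raise ValueError("GPU requested but no GPU detected")
        backend = force_backend

    return f"{backend}-{tier}"
```

In this sketch, Option 2 only changes the backend half of the decision, which is why it is the smaller change; Option 3 needs a way to list and validate the available binary names, which is not shown here.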

What do you think? @dan-homebrew @0xSage @namchuai @louis-jan @nguyenhoangthuan99

vansangpfiev commented 5 days ago

We install only one engine variant per command/request. Option 2 is straightforward to support by adding a --cpu flag. Option 3 is more complicated.