gpustack / llama-box

LLM inference server implementation based on llama.cpp.
MIT License
34 stars 5 forks