c0sogi / llama-api

An OpenAI-like LLaMA inference API
MIT License

Dev update (23.8.9.) #3

Closed c0sogi closed 1 year ago

c0sogi commented 1 year ago

This PR encompasses several enhancements to usability and code refactoring. The primary changes include:

  1. Skip compilation: You can skip compiling the llama.cpp shared library when running the server with --install-pkgs. Just add the --skip-compile option.
  2. Removed auto process kill feature: Killing the process when unloading a model was introduced to prevent memory leaks, but it sometimes made the program exit for no apparent reason, so this feature has been removed.
  3. API key checker: The API key checker is activated if you start the server with the option --api-key YOUR_API_KEY. Clients must then include an Authorization header with the value Bearer YOUR_API_KEY.
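
As a rough illustration of item 3, a client could attach the key as shown in the sketch below. It only builds the request and does not send it. The server address and the `/v1/models` path are assumptions made for the example, not something this PR specifies, and `build_request` is a hypothetical helper name.

```python
import urllib.request


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the API key as a Bearer token."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Assumed address and endpoint; adjust to wherever your server is listening.
req = build_request("http://localhost:8000/v1/models", "YOUR_API_KEY")

# Inspect the header the server-side checker would look at.
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would send it. Any OpenAI-style client that lets you set the API key should produce the same header.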