edgenai / edgen

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
https://docs.edgen.co/
Apache License 2.0
323 stars, 14 forks

feat: edgen needs to handle 1000s of requests #98

Open francis2tm opened 5 months ago

pedro-devv commented 5 months ago

llama.cpp is missing built-in utilities to estimate memory usage: https://github.com/ggerganov/llama.cpp/issues/4315
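Until something like that exists upstream, a rough estimate could be computed by hand. The sketch below is only a back-of-the-envelope approach, not anything provided by llama.cpp or edgen: the function names and the `overhead_bytes` parameter are hypothetical, and it assumes a standard transformer KV cache (one key tensor and one value tensor per layer) while ignoring compute and scratch buffers.

```python
def estimate_kv_cache_bytes(n_layers, n_ctx, n_kv_heads, head_dim, bytes_per_element=2):
    """Approximate KV cache size for one context (hypothetical helper).

    Assumes one key and one value tensor per layer, each holding
    n_ctx * n_kv_heads * head_dim elements. bytes_per_element=2 means f16.
    """
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_element


def estimate_total_bytes(weights_bytes, kv_cache_bytes, overhead_bytes=0):
    """Weights plus KV cache plus a caller-supplied allowance for other buffers.

    overhead_bytes is an assumption left to the caller; this sketch does not
    try to model compute or scratch buffers.
    """
    return weights_bytes + kv_cache_bytes + overhead_bytes


# Example shape (assumed): 32 layers, 32 KV heads, head dimension 128,
# 4096-token context, f16 cache.
kv_bytes = estimate_kv_cache_bytes(n_layers=32, n_ctx=4096, n_kv_heads=32, head_dim=128)
print(kv_bytes / 2**30, "GiB for the KV cache")  # 2.0 GiB for the KV cache
```

The point for this issue is that the KV cache term scales with context length and with the number of contexts held at once, so serving many concurrent requests multiplies it while the weights term stays fixed.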