edgenai / edgen

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
https://docs.edgen.co/
Apache License 2.0
323 stars · 14 forks

feat: support model in request #102

francis2tm closed this issue 5 months ago

francis2tm commented 5 months ago

This already happens with the default model, i.e. if no model is present in an endpoint request, edgen automatically downloads the default model. The same behaviour should apply to the requested model (i.e. the value of the `model` attribute in the request).

The format of `model` should be: `<hf_repo_owner>/<hf_repo>/<model_name>`

Example:

`TheBloke/deepseek-coder-6.7B-instruct-GGUF/deepseek-coder-6.7b-instruct.Q5_K_M.gguf`

If the requested model is not valid, return an error.
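A minimal sketch of the validation described above, splitting the identifier into `<hf_repo_owner>/<hf_repo>/<model_name>` and rejecting anything else. The function name and error type are assumptions for illustration, not edgen's actual API:

```rust
// Hypothetical sketch: parse a model identifier of the form
// <hf_repo_owner>/<hf_repo>/<model_name>, returning an error otherwise.
// Not edgen's real code; names are illustrative assumptions.
fn parse_model_id(model: &str) -> Result<(String, String, String), String> {
    let parts: Vec<&str> = model.split('/').collect();
    match parts.as_slice() {
        // Exactly three non-empty segments are required.
        [owner, repo, file] if !owner.is_empty() && !repo.is_empty() && !file.is_empty() => {
            Ok((owner.to_string(), repo.to_string(), file.to_string()))
        }
        _ => Err(format!("invalid model identifier: {model}")),
    }
}

fn main() {
    let id = "TheBloke/deepseek-coder-6.7B-instruct-GGUF/deepseek-coder-6.7b-instruct.Q5_K_M.gguf";
    let (owner, repo, file) = parse_model_id(id).expect("valid identifier");
    println!("owner={owner} repo={repo} file={file}");
    assert!(parse_model_id("no-slashes-here").is_err());
}
```

With the example from the issue, this yields `TheBloke` as owner, `deepseek-coder-6.7B-instruct-GGUF` as repo, and the `.gguf` file name as the model.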

toschoo commented 5 months ago

The `model` parameter in AI endpoint requests is now considered again. There are four valid cases:

toschoo commented 5 months ago

There is an ambiguity in this approach: if the model is in a subdirectory, e.g. `~/.local/share/edgen/models/chat/completions/my-models/top-models/my-top-model.gguf`, it could be either a file path or a model identifier with `my-models` as owner, `top-models` as repo and `my-top-model.gguf` as model. If the file exists, the Hugging Face API is bypassed. Otherwise, edgen will try to download it from Hugging Face.
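The disambiguation described above can be sketched as follows: a local file under the models directory takes precedence, and only when no such file exists is the string interpreted as a Hugging Face identifier. The type and function names here are illustrative assumptions, not edgen's actual implementation:

```rust
// Hypothetical sketch of the file-first disambiguation described above.
use std::path::Path;

#[derive(Debug)]
enum ModelSource {
    // A file that already exists under the models directory.
    LocalFile(String),
    // An identifier to resolve through the Hugging Face API.
    HuggingFace { owner: String, repo: String, file: String },
}

fn resolve_model(models_dir: &Path, model: &str) -> Option<ModelSource> {
    // 1. If the path exists on disk, the Hugging Face API is bypassed.
    let candidate = models_dir.join(model);
    if candidate.is_file() {
        return Some(ModelSource::LocalFile(candidate.display().to_string()));
    }
    // 2. Otherwise, interpret the string as <owner>/<repo>/<file>.
    let parts: Vec<&str> = model.splitn(3, '/').collect();
    if let [owner, repo, file] = parts.as_slice() {
        return Some(ModelSource::HuggingFace {
            owner: owner.to_string(),
            repo: repo.to_string(),
            file: file.to_string(),
        });
    }
    // Neither a local file nor a valid identifier.
    None
}

fn main() {
    // On a machine without this path, the identifier branch is taken.
    let src = resolve_model(Path::new("/nonexistent"), "my-models/top-models/my-top-model.gguf");
    println!("{src:?}");
}
```

Note that the check is order-dependent: once a matching file appears on disk, the same request string silently stops hitting Hugging Face.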