edgenai / edgen

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
https://docs.edgen.co/
Apache License 2.0
323 stars · 14 forks

feat: support model in request #102

francis2tm closed this issue 5 months ago

francis2tm commented 5 months ago

This already happens with the default model, i.e. if no model is present in an endpoint request, edgen automatically downloads the default model. The same behaviour should apply to the requested model (i.e. the value of the `model` attribute in the request).

The format of `model` should be: `<hf_repo_owner>/<hf_repo>/<model_name>`

Example:

`TheBloke/deepseek-coder-6.7B-instruct-GGUF/deepseek-coder-6.7b-instruct.Q5_K_M.gguf`

If the requested model is not valid, return an error.
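A minimal sketch of the validation described above, splitting the identifier into `<hf_repo_owner>/<hf_repo>/<model_name>` and rejecting anything else. The function name and error type are assumptions for illustration, not edgen's actual API:

```rust
// Hypothetical sketch: parse a model identifier of the form
// <hf_repo_owner>/<hf_repo>/<model_name>, returning an error otherwise.
// Not edgen's real code; names are illustrative assumptions.
fn parse_model_id(model: &str) -> Result<(String, String, String), String> {
    let parts: Vec<&str> = model.split('/').collect();
    match parts.as_slice() {
        // Exactly three non-empty segments are required.
        [owner, repo, file] if !owner.is_empty() && !repo.is_empty() && !file.is_empty() => {
            Ok((owner.to_string(), repo.to_string(), file.to_string()))
        }
        _ => Err(format!("invalid model identifier: {model}")),
    }
}

fn main() {
    let id = "TheBloke/deepseek-coder-6.7B-instruct-GGUF/deepseek-coder-6.7b-instruct.Q5_K_M.gguf";
    let (owner, repo, file) = parse_model_id(id).expect("valid identifier");
    println!("owner={owner} repo={repo} file={file}");
    assert!(parse_model_id("no-slashes-here").is_err());
}
```

With the example from the issue, this yields `TheBloke` as owner, `deepseek-coder-6.7B-instruct-GGUF` as repo, and the `.gguf` file name as the model.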

toschoo commented 5 months ago

The `model` parameter in AI endpoint requests is now considered again. There are four valid cases:

toschoo commented 5 months ago

There is an ambiguity in this approach: if the model is in a subdirectory, e.g. `~/.local/share/edgen/models/chat/completions/my-models/top-models/my-top-model.gguf`, it could be either a file path or a model identifier with `my-models` as owner, `top-models` as repo and `my-top-model.gguf` as model. If the file exists, the Hugging Face API is bypassed. Otherwise, edgen will try to download it from Hugging Face.
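The disambiguation described above can be sketched as follows: a local file under the models directory takes precedence, and only when no such file exists is the string interpreted as a Hugging Face identifier. The type and function names here are illustrative assumptions, not edgen's actual implementation:

```rust
// Hypothetical sketch of the file-first disambiguation described above.
use std::path::Path;

#[derive(Debug)]
enum ModelSource {
    // A file that already exists under the models directory.
    LocalFile(String),
    // An identifier to resolve through the Hugging Face API.
    HuggingFace { owner: String, repo: String, file: String },
}

fn resolve_model(models_dir: &Path, model: &str) -> Option<ModelSource> {
    // 1. If the path exists on disk, the Hugging Face API is bypassed.
    let candidate = models_dir.join(model);
    if candidate.is_file() {
        return Some(ModelSource::LocalFile(candidate.display().to_string()));
    }
    // 2. Otherwise, interpret the string as <owner>/<repo>/<file>.
    let parts: Vec<&str> = model.splitn(3, '/').collect();
    if let [owner, repo, file] = parts.as_slice() {
        return Some(ModelSource::HuggingFace {
            owner: owner.to_string(),
            repo: repo.to_string(),
            file: file.to_string(),
        });
    }
    // Neither a local file nor a valid identifier.
    None
}

fn main() {
    // On a machine without this path, the identifier branch is taken.
    let src = resolve_model(Path::new("/nonexistent"), "my-models/top-models/my-top-model.gguf");
    println!("{src:?}");
}
```

Note that the check is order-dependent: once a matching file appears on disk, the same request string silently stops hitting Hugging Face.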