janhq / cortex.cpp

Local AI API Platform
https://cortex.so
Apache License 2.0
2.03k stars 115 forks

epic: Cortex Model Repo supports Default Model download #1418

Closed dan-homebrew closed 4 weeks ago

dan-homebrew commented 4 weeks ago

Goal

High-level Structure

Decisions

Decision 1: Default Model Download

We need to decide which model version `cortex pull <model>` and `cortex run <model>` will pull.

Option 1: main branch

However, I do not think this is correct long term:

It is also incorrect to compare our approach to Ollama's. Ollama uses a tag-based system similar to Docker, where `latest` is a pointer to a specific tag (e.g. `3b`). It is difficult to replicate this with a straightforward UX in Git (i.e. tags are not very visible from a repo's main page).

Option 2: main branch has metadata.yaml

How it works

```yaml
# metadata.yaml
version: 1
name: mistral
default: 3b-gguf
```

In the future, metadata.yaml can become more complex and allow fine-grained control of the CLI UX, e.g. sections for 3b, 7b, or per engine.
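As a rough illustration of that fine-grained control (the field names below are hypothetical, not a settled schema), a future metadata.yml might look like:

```yaml
# metadata.yml (hypothetical extended schema -- field names are illustrative only)
version: 2
name: mistral
default: 7b-gguf-q4-ks
engines:
  llama-cpp:
    default: 7b-gguf-q4-ks
  tensorrt-llm:
    default: 7b-tensorrt-llm
variants:
  - branch: 7b-gguf-q4-ks
    size: 7b
    format: gguf
```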

Furthermore, we can use metadata.yaml as a data structure to hold information about the different Model versions.
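A minimal sketch of how the CLI could resolve the default branch under Option 2, assuming the simple metadata.yaml shape above. A real implementation would fetch the file from the repo's main branch on Hugging Face and use a proper YAML parser; here the flat `key: value` form is parsed directly, and the function name is an assumption, not the actual cortex.cpp API.

```python
def resolve_default_branch(metadata_text):
    """Return the branch `cortex pull <model>` should use: the `default`
    key if metadata.yml exists and declares one, otherwise fall back to
    the main branch (today's behavior)."""
    if metadata_text is None:  # repo has no metadata.yml
        return "main"
    fields = {}
    for line in metadata_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields.get("default", "main")


metadata = """\
version: 1
name: mistral
default: 3b-gguf
"""
print(resolve_default_branch(metadata))  # -> 3b-gguf
print(resolve_default_branch(None))      # -> main
```

The fallback to `main` keeps repos without metadata.yml (e.g. remote models) working unchanged.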

namchuai commented 4 weeks ago

I agree with Option 2. The current approach can't hold a default model for other engines, and it causes duplication between the main branch and a specific branch (3b, for example).

I think we should go with Option 2 before releasing because it's more future-proof.

gabrielle-ong commented 4 weeks ago

I'll update the 35 cortexso repos with metadata.yml on the main branch.

@nguyenhoangthuan99 Can I get help on the recommended branches for these 35 models? (listed at https://huggingface.co/cortexso)

Local models

- cortexso/llama3.2 (3b-gguf-q4-km)
- cortexso/mistral-nemo (12b-gguf-q4-ks)
- cortexso/llama3 (8b-gguf-q4-ks)
- cortexso/llama3.1 (8b-gguf-q4-ks)
- cortexso/nomic-embed-text-v1 (main) - rarely used - 3 downloads last month

Future updates to default (when CI run to update branches)

- cortexso/tinyllama (1b-gguf) (future 1b-gguf-q4-ks)
- cortexso/mistral (7b-gguf) (future 7b-gguf-q4-ks)
- cortexso/phi3 (mini-gguf) (future mini-gguf-q4-ks)
- cortexso/gemma2 (2b-gguf) (future 2b-gguf-q4-ks)
- cortexso/openhermes-2.5 (7b-gguf) (future 7b-gguf-q4-ks)
- cortexso/mixtral (7x8b-gguf) (future 7x8b-gguf-q4-ks)
- cortexso/yi-1.5 (34b-gguf) (future 34b-gguf-q4-ks)
- cortexso/aya (12.9b-gguf) (future 12.9b-gguf-q4-ks)
- cortexso/codestral (22b-gguf) (future 22b-gguf-q4-ks)
- cortexso/command-r (35b-gguf) (future 35b-gguf-q4-ks)
- cortexso/gemma (7b-gguf) (future 7b-gguf-q4-ks)
- cortexso/qwen2 (7b-gguf) (future 7b-gguf-q4-ks)

Remote models (no need for metadata.yml as no gguf file)

- cortexso/NVIDIA-NIM
- cortexso/gpt-4o-mini
- cortexso/open-router-auto
- cortexso/groq-mixtral-8x7b-32768
- cortexso/groq-gemma-7b-it
- cortexso/groq-llama3-8b-8192
- cortexso/groq-llama3-70b-8192
- cortexso/claude-3-5-sonnet-20240620
- cortexso/gpt-3.5-turbo
- cortexso/gpt-4o
- cortexso/martian-model-router
- cortexso/cohere-command-r-plus
- cortexso/cohere-command-r
- cortexso/mistral-large-latest
- cortexso/mistral-small-latest
- cortexso/claude-3-haiku-20240307
- cortexso/claude-3-sonnet-20240229
- cortexso/claude-3-opus-20240229

gabrielle-ong commented 4 weeks ago

QN: is it metadata.yaml or metadata.yml? We are currently using model.yml, so I think it should be .yml for consistency.

namchuai commented 4 weeks ago

@gabrielle-ong, I agree with metadata.yml

If possible, please start with cortexso/llama3.2. Thank you!

gabrielle-ong commented 4 weeks ago

Thanks @namchuai and @nguyenhoangthuan99! Added to llama3.2 and working down the list.

gabrielle-ong commented 4 weeks ago

- Created all the metadata.yml files in the list
- Categorized the list above
- Noted Alex on future changes to the default branch for some models - let me know when the CI is run to add the new branches for those models

gabrielle-ong commented 3 weeks ago

Thanks @James and @nguyenhoangthuan99!