dan-homebrew opened 1 week ago
Every provider extension registers a list of models, which are then persisted to the /models data folder. On load, the Jan app scans that folder and lists the models in the Model Hub.
Recent update: the Jan app now caches these models after registration for a better UX, then re-scans in the background to keep the list up to date.
All the cortex repositories are fetched during the CI app build to generate model.json files, so accessing the Model Hub does not require connectivity; it relies on the same pre-populated model.json mechanism.
We are implementing a Model Hub API to enable organizations and users to add external model hubs (HuggingFace, NGC Cloud, etc.) to their local hub.
The new Models table will include the following fields:
- model_id (unique identifier for the model)
- organization (e.g., cortexso)
- repo (e.g., llama3.1)
- branch (e.g., tensorrt-llm-linux-ada)
- path_to_model_yml (local path to the model.yml file; can be empty for undownloaded models)
- status (available/unavailable based on download state)

This endpoint can also be reused for remote engine providers such as openai, claude, etc. Should the token be saved in .cortexrc, or should we create a separate .authentication file for all providers' secret tokens?
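The Models table above can be sketched as a SQLite schema. This is a hypothetical illustration; the actual storage engine and column types in cortex.cpp may differ — only the field names and the available/unavailable status values come from the proposal.

```python
import sqlite3

# Hypothetical sketch of the proposed Models table; column names mirror
# the fields listed above, the storage engine is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE models (
        model_id          TEXT PRIMARY KEY,  -- unique identifier for the model
        organization      TEXT NOT NULL,     -- e.g., cortexso
        repo              TEXT NOT NULL,     -- e.g., llama3.1
        branch            TEXT NOT NULL,     -- e.g., tensorrt-llm-linux-ada
        path_to_model_yml TEXT,              -- NULL until the model is downloaded
        status            TEXT NOT NULL DEFAULT 'unavailable'
            CHECK (status IN ('available', 'unavailable'))
    )
""")
conn.execute(
    "INSERT INTO models (model_id, organization, repo, branch) VALUES (?, ?, ?, ?)",
    ("cortexso/llama3.1/tensorrt-llm-linux-ada", "cortexso", "llama3.1",
     "tensorrt-llm-linux-ada"),
)
row = conn.execute("SELECT status FROM models").fetchone()
print(row[0])  # a freshly registered model starts as 'unavailable'
```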
```
POST /v1/auth/token
{
  "provider": "huggingface",
  "token": "your_token_here"
}
```
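A client call to this endpoint might look like the sketch below. The base URL and port are assumptions (cortex.cpp's default may differ); only the path and JSON body come from the proposal above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:39281"  # hypothetical local server address

def build_token_request(provider: str, token: str) -> urllib.request.Request:
    """Build the POST /v1/auth/token request described above."""
    body = json.dumps({"provider": provider, "token": token}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/v1/auth/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_token_request("huggingface", "your_token_here")
print(req.method, req.full_url)
```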
```
# Add organization
POST /v1/models/organizations
{
  "name": "cortexso",
  "provider": "huggingface"
}

# List organizations
GET /v1/models/organizations

# Add repository
POST /v1/models/repositories
{
  "organization": "cortexso",
  "name": "llama3.1",
  "include_all_branches": true
}

# List repositories for an organization
GET /v1/models/organizations/{org_name}/repositories

# List models with filtering
GET /v1/models?organization=cortexso&repo=llama3.1&branch=main&status=available

# Get model by ID
GET /v1/models/{model_id}

# Pull/download model
POST /v1/models/pull
```
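As a small sketch of the filtered listing call above, a client could compose the query string like this. The base URL is an assumption; the query parameter names come from the proposal.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:39281"  # hypothetical local server address

def list_models_url(**filters: str) -> str:
    """Compose GET /v1/models with the optional filters described above."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}/v1/models" + (f"?{query}" if query else "")

url = list_models_url(organization="cortexso", repo="llama3.1",
                      branch="main", status="available")
print(url)
```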
Do we need an API for the tree view, or will the front end handle it?
```
# Get organization tree view
GET /api/v1/views/organization-tree
```
Response:
```
{
  "organizations": [
    {
      "name": "cortexso",
      "repositories": [
        {
          "name": "llama3.1",
          "branches": [
            {
              "name": "tensorrt-llm-linux-ada",
              "models": [model_id, ...]
            }
          ]
        }
      ]
    }
  ]
}
```
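On the "front end handles it" option: the tree response above can be derived client-side from the flat model list, which would make a dedicated tree endpoint optional. A minimal sketch, assuming each model record carries the Models-table fields:

```python
from collections import defaultdict

def build_organization_tree(models: list) -> dict:
    """Group flat model records into the organization → repo → branch tree."""
    grouped = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for m in models:
        grouped[m["organization"]][m["repo"]][m["branch"]].append(m["model_id"])
    return {
        "organizations": [
            {"name": org, "repositories": [
                {"name": repo, "branches": [
                    {"name": br, "models": ids}
                    for br, ids in branches.items()
                ]}
                for repo, branches in repos.items()
            ]}
            for org, repos in grouped.items()
        ]
    }

# Hypothetical record using the field names from the Models table above.
models = [{"model_id": "m1", "organization": "cortexso",
           "repo": "llama3.1", "branch": "tensorrt-llm-linux-ada"}]
tree = build_organization_tree(models)
print(tree["organizations"][0]["repositories"][0]["branches"][0]["models"])
```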
Model Type Detection Logic:
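The detection logic is not spelled out above; one possible heuristic is to infer the engine type from the branch name, since cortexso branches such as "tensorrt-llm-linux-ada" encode the engine. The keyword list and the gguf default below are assumptions, not part of the proposal.

```python
def detect_model_type(branch: str) -> str:
    """Hypothetical heuristic: infer the engine type from the branch name."""
    branch = branch.lower()
    if "tensorrt" in branch:
        return "tensorrt-llm"
    if "onnx" in branch:
        return "onnx"
    return "gguf"  # assumed fallback; actual detection may differ

print(detect_model_type("tensorrt-llm-linux-ada"))  # tensorrt-llm
```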
Download Process: Use existing Download service in cortex.cpp
Error response format:
```
{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human readable error message",
    "details": {}
  }
}
```
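Building that envelope on the server side is straightforward; the error code shown below is a hypothetical example, only the envelope shape comes from the format above.

```python
import json

def error_response(code: str, message: str, details: dict = None) -> str:
    """Serialize an error in the envelope format described above."""
    return json.dumps(
        {"error": {"code": code, "message": message, "details": details or {}}}
    )

payload = json.loads(error_response("MODEL_NOT_FOUND", "Model not found"))
print(payload["error"]["code"])
```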
Goal