janhq / cortex.cpp

Local AI API Platform
https://cortex.so
Apache License 2.0
2.1k stars 122 forks

feat: Model Pull has clear API and CLI to support Huggingface Repos #1242

Closed dan-homebrew closed 1 month ago

dan-homebrew commented 2 months ago

Goal

Tasklist

CLI

# Pulls immediately
cortex model pull <huggingface_url_specific_gguf>

# Lets user select quantization using CLI
cortex model pull <huggingface_url> 

# NOT SURE: Do we need an "info" equivalent?
# Gets repo type (e.g. GGUF; in future ONNX, TensorRT-LLM) and dumps possible versions
# Will power the "select quantization" step
cortex model info <huggingface_url>
cortex model info <cortex_repo_url>    # Dumps tags

API

Key Questions

Linked Issues

Jan's Requirements

  1. User enters Huggingface URL in import box
  2. User clicks deep link from Huggingface

Cortex should expose an API that can support the following UI:

Image

namchuai commented 2 months ago

@dan-homebrew, are you sure about the command cortex model pull? In cortexjs, I think we're using cortex pull, and ollama uses ollama pull.

One thing to note about the API: http://127.0.0.1:1337/v1/models/__MODELID__/pull. The __MODELID__ has to go in the body of the request rather than in the URL path, because Drogon does not play nice if __MODELID__ contains a slash.
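To illustrate the point about slashes, here is a minimal Python sketch that builds (but does not send) a pull request with the model ID in the JSON body. The host and port come from the comment above; the `/models/pull` path follows the QA notes later in this thread; the `"model"` field name and the example model ID are assumptions for illustration, not confirmed by this issue.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:1337/v1"  # host and port taken from the comment above


def build_pull_request(model_id: str) -> urllib.request.Request:
    """Build (but do not send) a pull request with the model ID in the JSON body."""
    # "model" is an assumed field name; the thread does not spell it out.
    payload = json.dumps({"model": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/models/pull",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


model_id = "janhq/some-model-GGUF"  # hypothetical Hugging Face style ID

# Placed in the URL path, the slash in the ID adds an extra path segment,
# which is the kind of route a path-parameter matcher can trip over.
path_style = f"/v1/models/{model_id}/pull"
print(path_style.split("/"))

# Placed in the body, the ID is plain JSON data and the path stays fixed.
req = build_pull_request(model_id)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Keeping the ID in the body also avoids having to percent-encode the slash on the client and decode it again on the server.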

namchuai commented 2 months ago

We already handle both cases above and allow the user to select the quant they want to use. However, for now we only support GGUF, and only repositories with a single GGUF file per model (multiple GGUF files for one model are not supported at the moment).
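As a rough sketch of what "select a quant" could look like, the Python below maps quantization labels to file names from a repository file listing and skips split (multi-part) GGUF files. This is not Cortex's implementation: the file names are hypothetical, and the sketch assumes the common community naming habit of ending a file with the quant label (for example `.Q4_K_M.gguf`) and the `-00001-of-00003.gguf` suffix for split files.

```python
import re

# Assumed naming for split files, e.g. "name-00001-of-00003.gguf".
SPLIT_RE = re.compile(r"-\d{5}-of-\d{5}\.gguf$", re.IGNORECASE)
# Assumed naming for quant labels at the end of the file name, e.g. ".Q4_K_M.gguf".
QUANT_RE = re.compile(
    r"(?:^|[.\-_])((?:IQ|Q)\d[A-Z0-9_]*|F16|F32|BF16)\.gguf$", re.IGNORECASE
)


def list_quants(filenames):
    """Return {quant label: file name} for single-file GGUF entries only."""
    quants = {}
    for name in filenames:
        if not name.lower().endswith(".gguf"):
            continue  # not a GGUF file
        if SPLIT_RE.search(name):
            continue  # multi-part GGUF, not supported per the comment above
        match = QUANT_RE.search(name)
        if match:
            quants[match.group(1).upper()] = name
    return quants


# Hypothetical repository listing, for illustration only.
files = [
    "README.md",
    "tinyllama-1.1b.Q4_K_M.gguf",
    "tinyllama-1.1b.Q8_0.gguf",
    "tinyllama-1.1b-f16.gguf",
    "big-model.Q4_K_M-00001-of-00002.gguf",
]
quants = list_quants(files)
for label, filename in sorted(quants.items()):
    print(label, "->", filename)
```

A CLI could then present the sorted labels as a menu and pull the file that matches the user's choice.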

namchuai commented 2 months ago
dan-homebrew commented 1 month ago

@namchuai Can I check: can I close this issue? It seems we have implemented this already.

namchuai commented 1 month ago

@dan-homebrew, this one still needs my last PR to support stopping a download. The PR is https://github.com/janhq/cortex.cpp/pull/1460; it needs some testing before I mark it Ready.

gabrielle-ong commented 1 month ago

QAing: API for Huggingface repo

1. POST models/pull starts the download
2. POST models/pull halfway through shows the downloaded bytes

image

3. DELETE models/pull stops the download

curl --location --request DELETE 'http://127.0.0.1:39281/models/pull' \
--header 'Content-Type: application/json' \
--data '{
    "taskId": <your_download_task_id>
}'
image

4. WebSocket events are emitted on /events when the model pull starts
5. WebSocket events stop on /events when the model pull stops

image

Pending task: add DELETE models/pull to Swagger and docs (@gabrielle-ong)
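For anyone scripting the stop step instead of using curl, here is a minimal Python sketch that builds the same DELETE request without sending it. The host, port, path, and `taskId` field are taken from the curl command above; the task ID value is a placeholder, and treating it as a string is an assumption.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:39281"  # host and port from the curl command above


def build_stop_request(task_id: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request that stops a download task."""
    payload = json.dumps({"taskId": task_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/models/pull",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )


stop_req = build_stop_request("example-task-id")  # hypothetical task ID
print(stop_req.get_method(), stop_req.full_url, stop_req.data.decode())
```

Passing the request to `urllib.request.urlopen(stop_req)` would send it to a running server.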

gabrielle-ong commented 1 month ago

Closing this huge epic, thanks @namchuai! Minor UX issue of stdout sent to CLI (inactive terminal), tracking in #1519