janhq / cortex.cpp

Local AI API Platform
https://cortex.so
Apache License 2.0
2.1k stars, 121 forks

bug: `cortex` `run` or `pull` redownloads existing model multiple times #1344

Closed: gabrielle-ong closed 1 month ago

gabrielle-ong commented 1 month ago

Cortex version

cortex run redownloads existing model multiple times

Describe the Bug

2 issues (see screenshot)

Steps to Reproduce

No response

Screenshots / Logs

image

What is your OS?

What engine are you running?

gabrielle-ong commented 1 month ago

image

Also seen with `cortex pull`.

gabrielle-ong commented 1 month ago

Still happening on v123.

image

gabrielle-ong commented 1 month ago

Probably also linked to:

continue download? [Y/n] n -> I expected it not to download the model, but it retriggers the download.

Image

nguyenhoangthuan99 commented 1 month ago

Related PR #1361

When the user chooses not to re-download or continue the download, suppress the "downloaded successfully" log.

image

gabrielle-ong commented 1 month ago

@nguyenhoangthuan99 On v-129, answering No to the re-download prompt still triggers the download; I don't get "Cancelled re-download!".

Image

nguyenhoangthuan99 commented 1 month ago

I confirmed this logic with @namchuai. With the continue-download feature, there are 3 options:

gabrielle-ong commented 1 month ago

@dan-homebrew - this is the unexpected behaviour you encountered as well.

My input for consideration: as a user, I would have expected n to stop the download process (e.g. because I don't want to use my limited data).

possibly 3 flags? (these are just semantics): [Y/n/restart]
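To make the three-option idea concrete, here is a minimal sketch of how a reply to a `[Y/n/restart]` prompt could be interpreted. The enum `DownloadChoice` and the function `ParseDownloadChoice` are hypothetical names invented for illustration and are not taken from the cortex.cpp codebase. The only behaviour it assumes is what is requested above: an empty reply takes the capitalised default, and `n` never starts a download.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical outcome of a "continue download? [Y/n/restart]" prompt.
// These names are illustrative and do not come from cortex.cpp.
enum class DownloadChoice { Continue, Cancel, Restart, Invalid };

// Maps raw user input to a choice. An empty reply selects the capitalised
// default (Y). "n" cancels, so the caller must not start any download.
inline DownloadChoice ParseDownloadChoice(std::string reply) {
  auto not_space = [](unsigned char c) { return !std::isspace(c); };

  // Trim leading and trailing whitespace.
  reply.erase(reply.begin(),
              std::find_if(reply.begin(), reply.end(), not_space));
  reply.erase(std::find_if(reply.rbegin(), reply.rend(), not_space).base(),
              reply.end());

  // Compare case-insensitively.
  std::transform(reply.begin(), reply.end(), reply.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });

  if (reply.empty() || reply == "y" || reply == "yes") {
    return DownloadChoice::Continue;
  }
  if (reply == "n" || reply == "no") return DownloadChoice::Cancel;
  if (reply == "restart") return DownloadChoice::Restart;
  return DownloadChoice::Invalid;  // Caller should ask again.
}
```

Returning a separate `Invalid` value lets the caller re-prompt on unrecognised input instead of silently falling back to a download.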

namchuai commented 1 month ago

Usually, when I'm using a CLI and want to stop a foreground process, my go-to is Ctrl+C. However, I can't speak for all users. Please confirm which behaviour feels natural to you, and we will update it accordingly.

dan-homebrew commented 1 month ago

@namchuai @vansangpfiev I am re-opening this issue, as I think this is a Day 0 UX issue that we should resolve:

Current Problem

From the user's perspective, this is annoying:

image

What I was expecting was something like this:

> cortex-nightly run tinyllama
Searching local models... found `tinyllama:gguf`
Running `tinyllama:gguf`...
tinyllama:gguf started successfully

Proposed Solution

My goal is to simplify cortex run to minimize user input for the happy path:

Current

This is the current cortex run logic:

  1. cortex start
  2. models pull (if the model does not exist) <- this logic
  3. engines install (if the engine does not exist)
  4. models start
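The four steps above can be sketched as a small planning function. This is only an illustration under stated assumptions: `LocalState` and `PlanRun` are hypothetical names, the step strings mirror the list rather than real internal calls, and the actual cortex.cpp implementation may be structured quite differently. The point it shows is that a model already present locally should cause the pull step to be skipped entirely.

```cpp
#include <string>
#include <vector>

// Hypothetical snapshot of what is already present on the machine.
// Field names are illustrative and not taken from cortex.cpp.
struct LocalState {
  bool model_downloaded;
  bool engine_installed;
};

// Returns the steps a `cortex run` would still need to perform, in order.
// A conditional step is skipped when its precondition already holds, so an
// existing model is never pulled again.
inline std::vector<std::string> PlanRun(const LocalState& state) {
  std::vector<std::string> steps;
  steps.push_back("cortex start");
  if (!state.model_downloaded) steps.push_back("models pull");
  if (!state.engine_installed) steps.push_back("engines install");
  steps.push_back("models start");
  return steps;
}
```

With both flags set to `true`, the plan contains only `cortex start` and `models start`, which matches the expected happy-path output shown earlier in this comment.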

Improvement

I would like to expand on (2), i.e. model pull:

The "menu" should differentiate "Local" and "Available for Download":

> cortex-nightly run tinyllama
Local Models:
    1. gguf
    2. 1b-gguf

Available to Download:
    3. 7b-gguf
    4. 7b-tensorrt-llm

Select model to download: 1
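One way such a menu could be assembled is sketched below. `ModelVariant` and `BuildMenu` are hypothetical names chosen for this example, and how cortex.cpp actually discovers local and remote variants is not covered here. The sketch only shows the grouping: local variants are listed first, and numbering continues across both groups so that a single number identifies a variant.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one model variant; not a cortex.cpp type.
struct ModelVariant {
  std::string tag;  // e.g. "gguf" or "7b-gguf"
  bool is_local;    // true if the variant is already downloaded
};

// Builds a menu that lists local variants first and downloadable ones
// second. Numbering is continuous across both groups.
inline std::string BuildMenu(const std::vector<ModelVariant>& variants) {
  std::ostringstream out;
  int index = 1;

  out << "Local Models:\n";
  for (const auto& variant : variants) {
    if (variant.is_local) {
      out << "    " << index++ << ". " << variant.tag << "\n";
    }
  }

  out << "\nAvailable to Download:\n";
  for (const auto& variant : variants) {
    if (!variant.is_local) {
      out << "    " << index++ << ". " << variant.tag << "\n";
    }
  }
  return out.str();
}
```

Because the order within each group follows the input order, the caller decides how variants are sorted before building the menu.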

namchuai commented 1 month ago

I think this makes sense. We should apply it to cortexso models first.

gabrielle-ong commented 1 month ago

Solved in #1418, marking as complete