containers / podman-desktop-extension-ai-lab

Work with LLMs on a local environment using containers
https://podman-desktop.io/extensions/ai-lab
Apache License 2.0
170 stars 31 forks

feat: implement pull model for API #1626

Closed: feloy closed this 3 weeks ago

feloy commented 1 month ago

What does this PR do?

Adds a /api/pull endpoint to the API to download models

Screenshot / video of UI

https://github.com/user-attachments/assets/bcc8fb65-cec6-4c79-b85d-6b55875d9f83

What issues does this PR fix or reference?

Fixes #1583

How to test this PR?

Start AI Lab extension

$ OLLAMA_HOST=localhost:10434 ollama pull facebook/detr-resnet-101
// should download model
$ OLLAMA_HOST=localhost:10434 ollama pull facebook/detr-resnet-101
// should indicate success and not reload model
$ OLLAMA_HOST=localhost:10434 ollama pull unknown-model
// should fail with error message
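In streaming mode the endpoint answers with newline-delimited JSON (NDJSON), one status object per line, as the curl examples below in the discussion show. A minimal client-side sketch of parsing such a stream (the `PullStatus` shape and `parsePullStream` helper are illustrative, not taken from the PR):

```typescript
// Hypothetical sketch: parse a newline-delimited JSON (NDJSON)
// pull-progress payload, as returned by /api/pull in streaming mode.
interface PullStatus {
  status?: string;
  error?: string;
}

// Split the raw NDJSON payload into parsed status objects,
// skipping blank lines (e.g. a trailing newline).
function parsePullStream(raw: string): PullStatus[] {
  return raw
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as PullStatus);
}

const payload =
  '{"status":"pulling manifest"}\n' +
  '{"error":"pull model manifest: file does not exist"}\n';
const events = parsePullStream(payload);
console.log(events.length);   // 2
console.log(events[1].error); // pull model manifest: file does not exist
```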
feloy commented 1 month ago

Seems to me that we are always working in streaming mode?

You mean that all responses of the /api/pull are JSON streams in this implementation?

It also seems to be the case for the ollama implementation, for example:

$ curl -X POST localhost:10434/api/pull -d '{"model": "unknown"}'
{"status":"pulling manifest"}
{"error":"pull model manifest: file does not exist"}
jeffmaury commented 1 month ago

Seems to me that we are always working in streaming mode?

You mean that all responses of the /api/pull are JSON streams in this implementation?

It also seems to be the case for the ollama implementation, for example:

$ curl -X POST localhost:10434/api/pull -d '{"model": "unknown"}'
{"status":"pulling manifest"}
{"error":"pull model manifest: file does not exist"}

Yes, but look at the documentation: if the stream parameter is false, then a single JSON object is returned:

curl -X POST localhost:11434/api/pull -d '{"model": "unknown", "stream": false}'
{"error":"pull model manifest: file does not exist"}

https://github.com/ollama/ollama/blob/main/docs/api.md#pull-a-model
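Per that documentation, the two modes can be sketched as follows: with stream=true the server emits one JSON object per line as progress arrives, and with stream=false it collapses the sequence into a single final object (success or error). A hypothetical sketch, not the PR's actual code (the `PullEvent` shape and `respond` helper are illustrative):

```typescript
// Hypothetical sketch of honoring the ollama-style "stream" flag.
interface PullEvent {
  status?: string;
  error?: string;
}

function respond(events: PullEvent[], stream: boolean): string {
  if (stream) {
    // Streaming: NDJSON, one JSON document per line.
    return events.map((e) => JSON.stringify(e)).join('\n') + '\n';
  }
  // Non-streaming: only the final event (success or error) is returned.
  return JSON.stringify(events[events.length - 1]);
}

const events: PullEvent[] = [
  { status: 'pulling manifest' },
  { error: 'pull model manifest: file does not exist' },
];
console.log(respond(events, false));
// {"error":"pull model manifest: file does not exist"}
```

With stream=false the intermediate "pulling manifest" line is dropped and only the error object comes back, matching the single-object curl output quoted above.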

feloy commented 1 month ago

Seems to me that we are always working in streaming mode?

You mean that all responses of the /api/pull are JSON streams in this implementation? It also seems to be the case for the ollama implementation, for example:

$ curl -X POST localhost:10434/api/pull -d '{"model": "unknown"}'
{"status":"pulling manifest"}
{"error":"pull model manifest: file does not exist"}

Yes, but look at the documentation: if the stream parameter is false, then a single JSON object is returned:

curl -X POST localhost:11434/api/pull -d '{"model": "unknown", "stream": false}'
{"error":"pull model manifest: file does not exist"}

https://github.com/ollama/ollama/blob/main/docs/api.md#pull-a-model

Indeed, I added it to the swagger spec but then forgot to implement it. I'll implement it.

feloy commented 3 weeks ago

LGTM

My only remark is that I found it strange that 200 is returned in case of an error in non-streaming mode

Where? I can only see 500 codes when returning errors in non-streaming mode

jeffmaury commented 3 weeks ago

LGTM. My only remark is that I found it strange that 200 is returned in case of an error in non-streaming mode

Where? I can only see 500 codes when returning errors in non-streaming mode

Sorry, that was in streaming mode.
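A 200 alongside an error is expected in streaming mode: the HTTP status line and headers are committed when the first progress chunk is flushed, so a pull that fails mid-stream can only report the failure as an {"error": ...} line in the body. A minimal sketch of the effect using Node's built-in http module (port, path, and messages are illustrative, not from the PR):

```typescript
import * as http from 'http';

// Once the first chunk is written, the 200 status line is already on the
// wire; a later failure can only be reported inside the NDJSON body.
const server = http.createServer((_req, res) => {
  res.writeHead(200, { 'Content-Type': 'application/x-ndjson' });
  res.write(JSON.stringify({ status: 'pulling manifest' }) + '\n');
  // ...the pull fails after the headers have already been sent...
  res.end(
    JSON.stringify({ error: 'pull model manifest: file does not exist' }) + '\n',
  );
});
```

A non-streaming response, by contrast, can wait for the outcome before writing anything and so can choose a 500 status up front, which is why the two modes differ.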