containers / podman-desktop-extension-ai-lab

Work with LLMs on a local environment using containers
https://podman-desktop.io/extensions/ai-lab
Apache License 2.0

`ai-lab.yaml` format #1672

Open axel7083 opened 2 months ago

axel7083 commented 2 months ago

Description

The format of the `ai-lab.yaml` is, in my opinion, problematic: we do not respect the user's declaration for the inference server.

Since https://github.com/containers/podman-desktop-extension-ai-lab/issues/1431 we have tried to formalize this file further; however, with the GPU-support-for-recipes requirement we lost some of that logic, as we now overwrite the user's declaration.

Here is an example from the chatbot recipe^1

  containers:
    - name: llamacpp-server
      contextdir: ../../../model_servers/llamacpp_python
      containerfile: ./base/Containerfile
      model-service: true
      backend:
        - llama
      arch:
        - arm64
        - amd64
      ports:
        - 8001
      image: quay.io/ai-lab/llamacpp_python:latest

Since we have `model-service: true`, the Recipe Manager uses this information and ignores all the other properties; it uses the `backend` property from the catalog to determine which Inference Provider should be used (⚠️ we are not using the `backend` property from the `ai-lab.yaml`).
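Concretely, for the chatbot example above, only something like the following is actually taken into account (a rough illustration of the current behaviour, not the exact code path); everything else is replaced by what the Inference Provider derives from the catalog:

  containers:
    - name: llamacpp-server
      model-service: true
      # contextdir, containerfile, backend, arch, ports and image are all
      # ignored here: the Inference Provider is selected from the catalog's
      # backend property and brings its own default image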

Let's say a user wants to import a custom recipe and writes the following:

  containers:
    - name: llamacpp-server
      model-service: true
      backend:
        - llama-cpp
    - name: whispercpp-server
      model-service: true
      backend:
        - whisper-cpp
    - name: my-app
      image: quay.io/my-app:latest

What output does the user expect? Two inference servers plus their application running. What is the real result?

=> Nothing, because we would need to define the recipe inside the catalog, and only the `backend` property from the catalog would be used.
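For context, the catalog side declares a single backend per recipe, which is also why the two servers above cannot be expressed there. A rough sketch of such an entry (shown as YAML for consistency with the other snippets; apart from `backend`, the field names are illustrative rather than the exact catalog schema):

  recipes:
    - id: my-custom-recipe   # hypothetical entry the user would have to add to the catalog
      name: My custom recipe
      backend: llama-cpp     # a single value; whisper-cpp cannot be declared alongside it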

This is also a problem because, if a user writes

  containers:
    - name: llamacpp-server
      model-service: true
      contextdir: ../../../model_servers/llamacpp_python
      containerfile: ./base/Containerfile
      backend:
        - llama-cpp

We would just ignore the `contextdir` and `containerfile` and use the default image from the Inference Provider instead.


How to improve this format?

Open to discussion
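Purely to seed the discussion, one possible direction (an illustrative sketch, not something the code supports today) would be to treat the user declaration as authoritative: use the declared backend of each model-service container to pick the Inference Provider, and only fall back to the provider's default image when no image or containerfile is given:

  containers:
    - name: llamacpp-server
      model-service: true
      backend:
        - llama-cpp                         # would drive the Inference Provider selection
      containerfile: ./base/Containerfile   # would be honoured instead of replaced
    - name: whispercpp-server
      model-service: true
      backend:
        - whisper-cpp                       # a second provider in the same recipe
    - name: my-app
      image: quay.io/my-app:latest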