jozu-ai / kitops

Tools for easing the handoff between AI/ML and App/SRE teams.
https://KitOps.ml
Apache License 2.0

Add ability to reference other ModelKits in a Kitfile #260

Closed amisevsk closed 2 months ago

amisevsk commented 2 months ago

Description

This PR makes a number of changes:

  1. We're adding a field to the Kitfile, under model:

    model:
      name: <model-name>
      path: <path>
      parts: # New
        - name: readme
          path: my-readme.md
          description: My Readme file
        - <another part>

    The parts field is currently a list of structs that look similar to the model entry itself (name, path, description). It can be used, e.g., to store a license or readme alongside binary model data.

  2. Kitfiles now allow you to refer to other modelkits in the model's path field:

    model:
      name: <model-name>
      path: ghcr.io/jozu-ai/llama-2:7b-text-q4_0
      parts:
        - name: additional-file
          path: ./my-file.bin

    For compatibility, I've kept the field name path even though it can now hold a ModelKit reference instead of a filesystem path.
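To illustrate the ambiguity this creates, here is a minimal sketch of how a loader could tell a local path from a ModelKit reference. It is not the check implemented in this PR: `looksLikeModelKitRef` is a hypothetical helper, and the heuristic (no leading `.` or `/`, a first segment that looks like a registry host, and a tag or digest on the last segment) is an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeModelKitRef is a hypothetical heuristic, not the check used in this PR.
// It reports whether a Kitfile model path resembles a registry reference such as
// "ghcr.io/jozu-ai/llama-2:7b-text-q4_0" rather than a local filesystem path.
func looksLikeModelKitRef(path string) bool {
	// Explicitly relative or absolute paths are always treated as local.
	if strings.HasPrefix(path, ".") || strings.HasPrefix(path, "/") {
		return false
	}
	// A reference needs at least "<registry>/<repository>".
	host, rest, found := strings.Cut(path, "/")
	if !found || rest == "" {
		return false
	}
	// Assumption: a registry host contains a dot or a port, or is "localhost".
	if host != "localhost" && !strings.ContainsAny(host, ".:") {
		return false
	}
	// Require a tag (":tag") or digest ("@sha256:...") on the last segment.
	last := rest[strings.LastIndex(rest, "/")+1:]
	return strings.ContainsAny(last, ":@")
}

func main() {
	for _, p := range []string{
		"ghcr.io/jozu-ai/llama-2:7b-text-q4_0",
		"./my-file.bin",
		"models/weights.bin",
	} {
		fmt.Printf("%-40s modelkit reference: %v\n", p, looksLikeModelKitRef(p))
	}
}
```

A heuristic like this can misclassify unusual local paths (for example a directory whose name contains a dot), which is one argument for eventually using a dedicated field instead of overloading path.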

Testing

For testing purposes, I've packed and pushed two images: docker.io/amisevsk/artifact-test:base and docker.io/amisevsk/artifact-test:sub. The sub image references the base image in its Kitfile (see kit info --remote docker.io/amisevsk/artifact-test:sub).

I intend to add tests for packing+unpacking modelkits with references as above, hopefully soon.

Linked issues

Closes #85

amisevsk commented 2 months ago

I've rebased on main and added a few tests for the ModelKit references. For reviewers, commits c2f2652..b1145df are unchanged except for one squashed-in commit that fixes a bug I found while testing: https://github.com/amisevsk/kitops/commit/4541148b5e056efcc8b0e2df6977ba94a4b2bfc3

amisevsk commented 2 months ago

Cobra is giving me a lot of trouble with printing messages (hence the ugly workaround in 1db15f8). It seems like the only two "easy" options in cobra are:

  1. Always print error/usage (default). This means that if, e.g., we fail to contact a remote registry, we'll still print the usage message, effectively hiding the actual error:

    ❯ kit pack -t test:test ./scratch/test-not-exists
    Configuration already exists in storage: sha256:4bad644891ff9a2c4f95e4a97202e71ed3762874f1bde0853a27f7ab50873019
    Failed to pack model kit: model path not-exist.txt does not exist  <-- this is the relevant message
    Error: failed to run
    
    Usage:
      kit pack [flags] DIRECTORY
    <dozens of lines of help text>...

    This is useless for us: when a command fails with an error at this point, the user has already invoked the CLI correctly and the failure happened somewhere else. Telling them how to use the CLI is just annoying.

  2. Never print error/usage (SilenceUsage: true, SilenceErrors: true). This means that you can run kit asdf and it will output nothing. Obviously not good.

Since we have to use RunE functions (calling os.Exit(1) in tests would just exit the test suite), our options are:

  1. Set SilenceUsage and SilenceErrors inside the command after it starts running, or
  2. Wrap commands so that they don't use RunE:
    Run: func(cmd *cobra.Command, args []string) {
      if err := runCommand(opts)(cmd, args); err != nil {
        os.Exit(1)
      }
    }

I've chosen the first option here.