aai-institute / jobq

https://aai-institute.github.io/jobq/latest
Apache License 2.0

Server-side container image builds #43

Open AdrianoKF opened 3 months ago

AdrianoKF commented 3 months ago

As a data scientist / ML engineer, I do not want to go through the overhead of downloading all sorts of dependencies and uploading them back to the internet as a container image before I can submit my job for execution (since I am on a limited-bandwidth connection).

Instead, I want these container images to be built on a build server, according to the image specification given as part of my job metadata.

This approach yields additional benefits:


High-level Design

Overview

Server-side

Client-side

Additional Considerations

Security

Correctness

Open Questions

Client-based Orchestration

sequenceDiagram
    participant Job API
    actor CLI as Jobq User
    participant Build API
    participant Background Tasks
    participant Builder

    CLI->>Build API: POST /build
    activate CLI
    activate Build API

    Build API -) Background Tasks: Trigger image build
    Background Tasks ->> Builder: Build image
    activate Builder
    Build API-->>CLI: HTTP 202, build_id, image_ref
    deactivate Build API

    par
        loop status != "completed"
            CLI ->> Build API: GET /builds/<build_id>
            activate Build API
            Build API -->> CLI: Build status
            deactivate Build API
        end
    and
        Builder -->> Background Tasks: image_ref
        deactivate Builder
        Note right of Background Tasks: Publish image here?
    end

    CLI->>Job API: POST /jobs (image_ref)
    activate Job API
    Job API -->> CLI: HTTP 200, job_id
    deactivate Job API

    deactivate CLI
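
The client side of the flow above could be sketched roughly as follows. This is only an illustration under assumptions, not the jobq implementation: the `post` and `get` callables are hypothetical stand-ins for an HTTP client, the `status` field name and the `"failed"` state are guesses (the diagram only names `"completed"`), and the polling limits are arbitrary.

```python
import time
from typing import Any, Callable, Dict, Tuple

# (HTTP status code, decoded JSON body); hypothetical transport contract
Response = Tuple[int, Dict[str, Any]]


def build_then_submit(
    post: Callable[[str, Dict[str, Any]], Response],
    get: Callable[[str], Response],
    build_spec: Dict[str, Any],
    poll_interval: float = 2.0,
    max_polls: int = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Request an image build, poll until it completes, then submit the job.

    Returns the job ID reported by the Job API.
    """
    # Step 1: trigger the build; the server answers immediately with HTTP 202.
    status, body = post("/build", build_spec)
    if status != 202:
        raise RuntimeError(f"build request rejected: HTTP {status}")
    build_id, image_ref = body["build_id"], body["image_ref"]

    # Step 2: poll the build status until it is "completed".
    for _ in range(max_polls):
        _, body = get(f"/builds/{build_id}")
        if body["status"] == "completed":
            break
        if body["status"] == "failed":  # assumed failure state, not in the diagram
            raise RuntimeError(f"build {build_id} failed")
        sleep(poll_interval)
    else:
        raise TimeoutError(f"build {build_id} did not complete in time")

    # Step 3: submit the job, referencing the freshly built image.
    status, body = post("/jobs", {"image_ref": image_ref})
    if status != 200:
        raise RuntimeError(f"job submission rejected: HTTP {status}")
    return body["job_id"]
```

Passing the transport and `sleep` in as parameters keeps the sketch independent of any particular HTTP library and makes the polling loop easy to exercise in tests.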

AdrianoKF commented 2 months ago

I've hacked together a trie-based path-matching class that can be used to match a file path against gitignore (or dockerignore) patterns: https://gist.github.com/AdrianoKF/d5bf77f200592c2cab2b8633b85f8a97

This can serve as the starting point when building an archive of the build context locally for upload to the server.
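
As a rough illustration of that archiving step, the sketch below packs a build context into an in-memory `tar.gz` while skipping ignored files. It does not use the trie-based class from the gist: `is_ignored` is a hypothetical, much simpler stand-in built on `fnmatch`, and it does not implement full gitignore semantics (no `!` negation, no `**`, no anchoring). The function names are assumptions made for this example.

```python
import fnmatch
import io
import tarfile
from pathlib import Path
from typing import Iterable


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Simplified stand-in for a real gitignore matcher.

    A pattern hits if it matches the whole relative path or any single
    path component (so ".venv/" excludes everything below ".venv").
    """
    parts = rel_path.split("/")
    for pattern in patterns:
        pattern = pattern.rstrip("/")
        if fnmatch.fnmatchcase(rel_path, pattern):
            return True
        if any(fnmatch.fnmatchcase(part, pattern) for part in parts):
            return True
    return False


def build_context_archive(root: Path, patterns: Iterable[str]) -> bytes:
    """Return a gzip-compressed tar archive of all non-ignored files under root."""
    patterns = list(patterns)
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        # Sort for a deterministic member order.
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            rel = path.relative_to(root).as_posix()
            if not is_ignored(rel, patterns):
                archive.add(path, arcname=rel)
    return buffer.getvalue()
```

The resulting bytes could then be uploaded as the body of the build request; swapping `is_ignored` for the trie-based matcher would give proper gitignore behavior without changing the archiving loop.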