sky-uk / kfp-operator

https://sky-uk.github.io/kfp-operator
BSD 3-Clause "New" or "Revised" License

Implement NATs event Trigger (Step 3) #358

Closed grahamia closed 2 weeks ago

grahamia commented 2 months ago

Overview

Currently, the webhook call from the KFP Operator is an HTTP call to an Argo Events webhook.

This issue is to introduce a new component, "NATs event trigger", with a gRPC interface. This service will process the request and publish the run completion message onto the eventbus.

Technical Details

The deployment needs to be added to Helm and Kustomize. The gRPC interface will need to contain all of the information that is currently in the HTTP payload. The first task will be to define the protobuf file for the service, which should include the following:

    {
        "specversion": "1.0",
        "id": "123",
        "source": "vai",
        "type": "org.kubeflow.pipelines.run-completion",
        "datacontenttype": "application/json",
        "data": {
            "pipelineName": "pipelinename",
            "provider": "vai",
            "runConfigurationName": "runcomfigname",
            "runId": "runIdFromVAI",
            "runName": "runname",
            "servingModelArtifacts": [
                {
                    "location": "some_location",
                    "name": "pushed_model"
                }
            ],
            "status": "succeeded"
        }
    }

The message put onto the NATs eventbus should be the same as the current message. The location of the NATs eventbus should be configurable.
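One possible way to make the eventbus location configurable is to read it from the environment with a fallback default. The sketch below uses only the standard library and does not connect to NATS. The variable names `NATS_URL` and `NATS_SUBJECT`, the `Config` type, the `loadConfig` function, and the default subject are all assumptions made for illustration, not settings that exist in kfp-operator today. Port 4222 is the standard NATS client port.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// Config holds hypothetical settings for the event trigger service.
type Config struct {
	NATSURL string // where the eventbus is reachable
	Subject string // subject to publish run completion events on
}

const (
	defaultNATSURL = "nats://localhost:4222" // standard NATS client port
	defaultSubject = "events"                // assumed default, not from the issue
)

// loadConfig reads settings through getenv (os.Getenv in production,
// a stub in tests) and validates the eventbus URL.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{NATSURL: defaultNATSURL, Subject: defaultSubject}
	if v := getenv("NATS_URL"); v != "" {
		cfg.NATSURL = v
	}
	if v := getenv("NATS_SUBJECT"); v != "" {
		cfg.Subject = v
	}

	u, err := url.Parse(cfg.NATSURL)
	if err != nil {
		return Config{}, fmt.Errorf("invalid NATS_URL %q: %w", cfg.NATSURL, err)
	}
	if u.Scheme != "nats" && u.Scheme != "tls" {
		return Config{}, fmt.Errorf("NATS_URL %q must use the nats:// or tls:// scheme", cfg.NATSURL)
	}
	if u.Host == "" {
		return Config{}, fmt.Errorf("NATS_URL %q has no host", cfg.NATSURL)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("publishing to %s on subject %q\n", cfg.NATSURL, cfg.Subject)
}
```

In a Helm or Kustomize deployment, these values would typically be set as container environment variables, which keeps the eventbus location out of the image.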

Current Architecture

image

After Architecture

image

Acceptance Criteria

Eventing feedback should function as it does currently, with no change. Testing should be carried out to ensure that the same retry logic on failure exists and that at-least-once consistency is maintained.