project-codeflare / multi-cluster-app-dispatcher

Holistic job manager on Kubernetes
Apache License 2.0

More generic completion status reconciliation #637

Open astefanutti opened 1 year ago

astefanutti commented 1 year ago

Name of Feature or Improvement

Improve the current mechanism that reconciles the completion status of an AppWrapper, so that it supports more kinds of workloads.

Description of Problem the Feature Should Solve

While the user can specify how the AppWrapper completion status is reconciled from the underlying workload, this is limited to the workload APIs that advertise a fixed completion condition in their status.

For example, it is currently not possible to reconcile the status of a RayJob, as KubeRay advertises its status in the .status.jobStatus field rather than as a status condition.
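To illustrate the limitation, here is a minimal Python sketch of a condition-name lookup. It is not the dispatcher's actual code: the helper name `has_condition` is hypothetical, and it assumes the usual Kubernetes convention of a `.status.conditions` list whose entries carry `type` and `status` fields.

```python
def has_condition(workload: dict, condition_type: str) -> bool:
    """Return True if the workload reports the named condition with status "True".

    Assumption: conditions live under .status.conditions, each entry
    having "type" and "status" keys (the common Kubernetes layout).
    """
    conditions = workload.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == condition_type and c.get("status") == "True"
        for c in conditions
    )


# A batch Job-like object advertises completion as a condition.
batch_job = {"status": {"conditions": [{"type": "Complete", "status": "True"}]}}

# A RayJob-like object only exposes a plain field, so there is nothing to match.
ray_job = {"status": {"jobStatus": "SUCCEEDED"}}

job_done = has_condition(batch_job, "Complete")
ray_done = has_condition(ray_job, "Complete")
```

With these sample objects, the lookup matches the batch Job but has no way to see that the RayJob has finished.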

Describe the Solution You Would Like to See

The .spec.completionStatus field currently expects the name of the completion condition to look for in the underlying workload.

The behaviour of that field could be changed so that it takes a JSONPath expression, e.g., .status.jobStatus == 'STOPPED' || .status.jobStatus == 'FAILED' || .status.jobStatus == 'SUCCEEDED', or a CEL-based expression.
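As a rough sketch of what such a path-based check could look like, the Python below resolves a dotted field path and compares the result against a set of terminal values. It is a simplified stand-in for a real JSONPath or CEL evaluator, not a proposal for the actual implementation, and the helper names `lookup` and `is_complete` are hypothetical.

```python
from typing import Any, Iterable


def lookup(obj: dict, path: str) -> Any:
    """Resolve a dotted path such as ".status.jobStatus".

    Returns None when any segment of the path is missing.
    """
    current: Any = obj
    for key in path.strip(".").split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def is_complete(workload: dict, path: str, terminal_values: Iterable[str]) -> bool:
    """Return True if the field at `path` holds one of the terminal values."""
    return lookup(workload, path) in list(terminal_values)


# Terminal states taken from the expression in the proposal above.
RAY_TERMINAL_STATES = ("STOPPED", "FAILED", "SUCCEEDED")

finished_ray_job = {"status": {"jobStatus": "SUCCEEDED"}}
running_ray_job = {"status": {"jobStatus": "RUNNING"}}

finished = is_complete(finished_ray_job, ".status.jobStatus", RAY_TERMINAL_STATES)
running = is_complete(running_ray_job, ".status.jobStatus", RAY_TERMINAL_STATES)
```

A full JSONPath or CEL expression would additionally let the user combine several fields or operators in one rule, which this sketch does not attempt.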

butler54 commented 1 year ago

+1 on this one. The description in the AppWrapper schema is also inconsistent with the actual behaviour / type information:

The completionstatus field contains a list of conditions that make the associate item considered completed