kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0
3.62k stars 1.63k forks source link

feat(SDK): Add SemaphoreKey and MutexName fields to DSL #11340

Open DharmitD opened 3 weeks ago

DharmitD commented 3 weeks ago

Description of your changes: This PR introduces semaphoreKey and mutexName fields in PipelineConfig to support pipeline-level concurrency controls in KFP SDK.

This PR should be merged only after https://github.com/kubeflow/pipelines/pull/11384 gets merged.

Testing instructions

Create a Python virtualenv and install the SDK and IR YAML API packages locally:

$ python -m venv .venv
$ source .venv/bin/activate
$ pip install wheel setuptools protobuf grpcio grpcio-tools
$ pip install -r sdk/python/requirements-dev.txt
$ pip install -e api/v2alpha1/python
$ pip install -e sdk/python

Use the example code to compile

$ kfp dsl compile --py main.py --output main.yaml

You should be able to compile and find the following snippet in the main.yaml file:

---
platforms:
  kubernetes:
    pipelineConfig:
      mutexName: mutex
      semaphoreKey: semaphore

Checklist:

gregsheremeta commented 3 weeks ago

couple nitpicks but overall lgtm.

I don't want to merge it until the backend PR is posted and close to shipping.

HumairAK commented 3 weeks ago

can we add a test where we verify the semaphore setting compiles to the expected IR?

DharmitD commented 1 week ago

Edit: Updated to have the DSL generated SemaphoreKey instead of SemaphoreName field, since having a fixed, hardcoded semaphore name and letting users define the semaphore key is the format prescribed in Argo Workflows' synchronization docs here.

gregsheremeta commented 1 week ago

can we add a test where we verify the semaphore setting compiles to the expected IR?

still need to do this. Here's an example: https://github.com/kubeflow/pipelines/pull/10913/files#diff-cac76397fcf6a375f3865a864a3f6725b023c69b5216fa1ed71a86b334641399

DharmitD commented 1 week ago

can we add a test where we verify the semaphore setting compiles to the expected IR?

still need to do this. Here's an example: https://github.com/kubeflow/pipelines/pull/10913/files#diff-cac76397fcf6a375f3865a864a3f6725b023c69b5216fa1ed71a86b334641399

Yes I'm working on this ATM. I'm adding a unit test to the compiler_test.py similar to the platform tests here: https://github.com/kubeflow/pipelines/blob/8a7b0e5bae5338de7430f4c7d407d594ebc3bd95/sdk/python/kfp/compiler/compiler_test.py#L3489

Have a draft ready, testing locally and pushing it in a bit.

DharmitD commented 1 week ago

can we add a test where we verify the semaphore setting compiles to the expected IR?

still need to do this. Here's an example: https://github.com/kubeflow/pipelines/pull/10913/files#diff-cac76397fcf6a375f3865a864a3f6725b023c69b5216fa1ed71a86b334641399

Yes I'm working on this ATM. I'm adding a unit test to the compiler_test.py similar to the platform tests here:

https://github.com/kubeflow/pipelines/blob/8a7b0e5bae5338de7430f4c7d407d594ebc3bd95/sdk/python/kfp/compiler/compiler_test.py#L3489

Have a draft ready, testing locally and pushing it in a bit.

Added a unit test to verify semaphore and mutex setting, the test works as expected and passes on my local venv, but fails in this PR's CI.

Wondering if it's because the .proto changes haven't been merged into the code base yet. @gregsheremeta could you take a look and confirm? And if it is so, then should the test be added in a follow up PR?

gregsheremeta commented 1 week ago

ml_pipelines.PipelineConfig" has no field named "semaphoreKey" at "SinglePlatformSpec.pipelineConfig".

So yes. The test can stay here. Tests just won't pass until the proto is merged and this is rebased. It's the normal/usual/expected flow.

google-oss-prow[bot] commented 1 week ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please assign connor-mccarthy for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: - **[sdk/OWNERS](https://github.com/kubeflow/pipelines/blob/master/sdk/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
DharmitD commented 1 week ago

/hold until https://github.com/kubeflow/pipelines/pull/11384 gets merged.

DharmitD commented 1 week ago

ml_pipelines.PipelineConfig" has no field named "semaphoreKey" at "SinglePlatformSpec.pipelineConfig".

So yes. The test can stay here. Tests just won't pass until the proto is merged and this is rebased. It's the normal/usual/expected flow.

Edit: Removed proto/API changes and created a separate PR for those here: https://github.com/kubeflow/pipelines/pull/11384 So as @gregsheremeta mentioned, once proto changes are merged and this PR is rebased, the test will work.