GoogleCloudPlatform / kubeflow-distribution

Blueprints for Deploying Kubeflow on Google Cloud Platform and Anthos
Apache License 2.0

Kubeflow Node Selector #439

Closed by kellibelcher 1 year ago

kellibelcher commented 1 year ago

Hello, I am trying to use the node selector from the kfp-kubernetes API in my Pipeline. However, when I compile the Pipeline with the Python script below and upload it to Kubeflow, the node selector portion (everything after sdkVersion: kfp-2.0.1 in the YAML below) is missing from the uploaded Pipeline. Do you know what is causing this? Is there another way to select the nodes for the Pipeline tasks with the KFP SDK v2?

sdkVersion: kfp-2.0.1
---
platforms:
  kubernetes:
    deploymentSpec:
      executors:
        exec-load-data:
          nodeSelector:
            labels:
              cloud.google.com/gke-nodepool: default-pool

Here is the Python code that I used to generate the Pipeline:

from kfp import dsl
from kfp import compiler
from kfp import kubernetes
from kfp.dsl import Output, Dataset

@dsl.component(
        base_image="python:3.10", 
        packages_to_install=["pandas", "loguru"])
def load_data(
    data_url: str, 
    credit_risk_dataset: Output[Dataset]):

    import pandas as pd
    from loguru import logger

    logger.info("Loading csv from {}", data_url)
    data = pd.read_csv(data_url)
    data.to_csv(credit_risk_dataset.path, index=False)

@dsl.pipeline
def intel_xgboost_daal4py_pipeline(
    data_url: str):

    load_data_op = load_data(data_url = data_url)

    kubernetes.add_node_selector(
        task = load_data_op, label_key = 'cloud.google.com/gke-nodepool', label_value = 'default-pool')

if __name__ == '__main__':
    compiler.Compiler().compile(
        pipeline_func = intel_xgboost_daal4py_pipeline, 
        package_path = 'intel-xgboost-daal4py-pipeline-gcp.yaml')

Linchin commented 1 year ago

Hi @kellibelcher, thank you for reporting this issue! Could you let me know which version of the Kubeflow GCP distribution you are using? The latest release, v1.7.1, uses KFP v2.0.0-alpha.7, which does not support the node selector.
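
One way to confirm that the SDK side is working, and that the section is only dropped after upload, is to check the locally compiled package for the platform spec before uploading it. The following is a minimal sketch, not part of KFP: the helper names split_yaml_documents and has_node_selector are hypothetical, and it uses a plain text scan rather than a real YAML parser, assuming the compiled file has the layout shown in the issue (a `---` separator followed by a top-level `platforms:` document).

```python
# Hypothetical helpers (not part of the KFP SDK): a plain-text check that a
# compiled pipeline package still carries a Kubernetes nodeSelector section.

SAMPLE_PACKAGE = """\
sdkVersion: kfp-2.0.1
---
platforms:
  kubernetes:
    deploymentSpec:
      executors:
        exec-load-data:
          nodeSelector:
            labels:
              cloud.google.com/gke-nodepool: default-pool
"""


def split_yaml_documents(text):
    """Split multi-document YAML text into lists of lines, one list per document."""
    documents, current = [], []
    for line in text.splitlines():
        if line.strip() == "---":  # YAML document separator
            documents.append(current)
            current = []
        else:
            current.append(line)
    documents.append(current)
    return documents


def has_node_selector(text):
    """Return True if one document has a top-level `platforms:` key and a nodeSelector."""
    for doc in split_yaml_documents(text):
        # Top-level key: no leading indentation on the line.
        has_platforms = any(line.rstrip() == "platforms:" for line in doc)
        has_selector = any(line.strip() == "nodeSelector:" for line in doc)
        if has_platforms and has_selector:
            return True
    return False


if __name__ == "__main__":
    print(has_node_selector(SAMPLE_PACKAGE))
```

To run the same check against the file produced by the script in the issue, read intel-xgboost-daal4py-pipeline-gcp.yaml and pass its contents to has_node_selector. If the section is present locally but absent after upload, the cause lies with the backend version rather than the compiler.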

kellibelcher commented 1 year ago

Hi @Linchin, ah, that is the version I was using. Do you know if the node selector will be supported in the future?

chensun commented 1 year ago

@kellibelcher, we are working on releasing Kubeflow 1.8 (ETA: end of Sep) which would support node selector.

kellibelcher commented 1 year ago

Okay, great. Thank you @chensun