kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[sdk] KFP v2 pipeline with error #8682

Closed TrevorM15 closed 1 year ago

TrevorM15 commented 1 year ago

Environment

Steps to reproduce

I tried upgrading from KFP 1.8 to 2.0 to get support for a newer version of the Kubernetes SDK. I copied the example from here:

import kfp
from kfp import compiler
from kfp import dsl

@dsl.component
def addition_component(num1: int, num2: int) -> int:
  return num1 + num2

@dsl.pipeline(name='addition-pipeline')
def my_pipeline(a: int=1, b: int=2, c: int = 10):
  add_task_1 = addition_component(num1=a, num2=b)
  add_task_2 = addition_component(num1=add_task_1.output, num2=c)

cmplr = compiler.Compiler()
cmplr.compile(my_pipeline, package_path='my_pipeline.yaml')

client = kfp.Client()
client.create_run_from_pipeline_package('my_pipeline.yaml', arguments={"a": 1, "b": 2})

When I give it arguments I get the error ApiException: (400) Reason: Bad Request HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'date': 'Mon, 16 Jan 2023 18:20:24 GMT', 'content-length': '190', 'x-envoy-upstream-service-time': '0', 'server': 'envoy'}) HTTP response body: {"error":"json: cannot unmarshal number into Go value of type map[string]json.RawMessage","code":3,"message":"json: cannot unmarshal number into Go value of type map[string]json.RawMessage"}

And without arguments I get the error ApiException: (400) Reason: Bad Request HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'date': 'Mon, 16 Jan 2023 18:30:38 GMT', 'content-length': '548', 'x-envoy-upstream-service-time': '1', 'server': 'envoy'}) HTTP response body: {"error":"Validate create run request failed.: InvalidInputError: Invalid IR spec format.: invalid character 'c' looking for beginning of value","code":3,"message":"Validate create run request failed.: InvalidInputError: Invalid IR spec format.: invalid character 'c' looking for beginning of value","details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid IR spec format.","error_details":"Validate create run request failed.: InvalidInputError: Invalid IR spec format.: invalid character 'c' looking for beginning of value"}]}
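A note on reading the second message: "invalid character 'c' looking for beginning of value" is the wording Go's JSON decoder uses when the first byte it sees cannot start a JSON value. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that the backend tried to parse the submitted pipeline spec as JSON and received YAML text instead (a compiled spec can begin with the key `components:`). The sketch below only illustrates that kind of failure with Python's own JSON parser; `yaml_text` is a hypothetical stand-in for the start of a compiled file, and `first_json_error` is a helper made up for this example.

```python
import json

# Hypothetical stand-in for the first lines of a compiled pipeline YAML file.
yaml_text = (
    "components:\n"
    "  comp-addition-component:\n"
    "    executorLabel: exec-addition-component\n"
)


def first_json_error(text: str):
    """Return (position, offending character) for the first JSON parse error.

    Returns None when the text is valid JSON.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # Slice instead of index so an error at end-of-input does not raise.
        return err.pos, text[err.pos:err.pos + 1]
    return None


print(first_json_error(yaml_text))  # (0, 'c')
```

The parser gives up at the very first character, which matches the shape of the server's complaint. It does not by itself say which side (SDK or backend) is at fault.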

Expected result

Pipeline compiles and runs.

Impacted by this bug? Give it a 👍.

kimkihoon0515 commented 1 year ago

The error occurs because your code failed to connect to the KFP client. Try something like this:

import kfp
import requests

USERNAME = "user@example.com"
PASSWORD = "12341234"
NAMESPACE = "kubeflow-user-example-com"
HOST = "http://127.0.0.1:8080"  # your istio-ingressgateway pod ip:8080

# Open a session; response.url is the page we end up on after any redirects
# (expected to be the login form).
session = requests.Session()
response = session.get(HOST)

headers = {
    "Content-Type": "application/x-www-form-urlencoded",
}

# Submit the credentials and read the session cookie that comes back.
data = {"login": USERNAME, "password": PASSWORD}
session.post(response.url, headers=headers, data=data)
session_cookie = session.cookies.get_dict()["authservice_session"]

# Pass the session cookie to the KFP client.
client = kfp.Client(
    host=f"{HOST}/pipeline",
    namespace=NAMESPACE,
    cookies=f"authservice_session={session_cookie}",
)

TrevorM15 commented 1 year ago

We have our namespace set up to not require any arguments in the Client method. My pipeline worked in kfp v1.8, but I needed a newer version of the SDK to get a newer Kubernetes client version.

kimkihoon0515 commented 1 year ago

Oh, I see. Did you solve the error by upgrading the SDK version?

TrevorM15 commented 1 year ago

No, upgrading to kfp v2.0.0 is what's causing the issue. It appears from the diffs between 1.8.18 and 2.0.0b10 that all the changes were in the SDK, not in the kfp backend, so not sure what's causing these issues.

kimkihoon0515 commented 1 year ago

Well, it works fine for me.

[image]

connor-mccarthy commented 1 year ago

@kimkihoon0515 @TrevorM15, if you pip install kfp==2.0.0b9 kfp-pipeline-spec==0.1.16 (and pin these versions in your SDK environment) is the error resolved?

kimkihoon0515 commented 1 year ago

@connor-mccarthy yes. There's no error on kfp 2.0 beta.

TrevorM15 commented 1 year ago

@connor-mccarthy please ignore @kimkihoon0515, he does not speak for me. I did pip install kfp==2.0.0b9 kfp-pipeline-spec==0.1.16, but still got the error Reason: Bad Request HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'date': 'Fri, 20 Jan 2023 15:46:05 GMT', 'content-length': '548', 'x-envoy-upstream-service-time': '1', 'server': 'envoy'}) HTTP response body: {"error":"Validate create run request failed.: InvalidInputError: Invalid IR spec format.: invalid character 'c' looking for beginning of value","code":3,"message":"Validate create run request failed.: InvalidInputError: Invalid IR spec format.: invalid character 'c' looking for beginning of value","details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid IR spec format.","error_details":"Validate create run request failed.: InvalidInputError: Invalid IR spec format.: invalid character 'c' looking for beginning of value"}]} after compiling and attempting to create a run from the pipeline.

connor-mccarthy commented 1 year ago

@TrevorM15, I was able to reproduce this with kfp>=2.0.0b10, but it resolved when I downgraded to kfp==2.0.0b9. This is because kfp==2.0.0b10 (this release was yanked today, replaced by b11) began writing the isOptional field with https://github.com/kubeflow/pipelines/pull/8612 and https://github.com/kubeflow/pipelines/pull/8623 (release notes).

Can you double check that this error is not resolved when you use 2.0.0b9? This isOptional field should not be present in your compiled YAML. If you still have the error, can you please run pip freeze and share the output here?
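A quick way to check for the field without reading the whole file is a plain text scan. This is a minimal sketch using only the standard library; the helper name `lines_with_is_optional` is made up for this example, and the file name `my_pipeline.yaml` is the one used in the original report.

```python
from pathlib import Path


def lines_with_is_optional(yaml_text: str) -> list:
    """Return the 1-based numbers of lines whose key is `isOptional`."""
    hits = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        # Compiled specs put each key on its own line, so a prefix match
        # on the stripped line is enough for this check.
        if line.strip().startswith("isOptional:"):
            hits.append(number)
    return hits


if __name__ == "__main__":
    path = Path("my_pipeline.yaml")  # file name from the original report
    if path.exists():
        found = lines_with_is_optional(path.read_text())
        print(found if found else "no isOptional field found")
```

An empty result means the compiled file does not contain the field.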

connor-mccarthy commented 1 year ago

Edit: This is the solution for #8734, not this bug.

In the short term, this is resolved by downgrading to kfp==2.0.0b9 (and recompiling your pipeline with this version). In the long term, this will be resolved for kfp>=2.0.0b10 by the upcoming KFP BE beta release.

cc @chensun @gkcalat @Linchin

TrevorM15 commented 1 year ago

@connor-mccarthy the issue still persists for me when downgrading to 2.0.0b9

connor-mccarthy commented 1 year ago

@TrevorM15, can you confirm that the isOptional field is not present in your compiled YAML? Can you run pip freeze and share the output?
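If the full pip freeze output is too noisy, the versions of the KFP-related packages can be read directly from inside the same Python environment (for example, the notebook kernel). This is a sketch that assumes Python 3.8 or later for `importlib.metadata`; the package list and the helper name `installed_versions` are choices made for this example.

```python
from importlib import metadata

# Packages relevant to this issue; adjust as needed.
PACKAGES = ("kfp", "kfp-pipeline-spec", "kfp-server-api")


def installed_versions(names=PACKAGES) -> dict:
    """Map each package name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


for name, version in installed_versions().items():
    print(f"{name}=={version}" if version else f"{name} is not installed")
```

Running this in the environment that compiles and submits the pipeline avoids mixing up versions from a different interpreter.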

TrevorM15 commented 1 year ago

pip freeze output:

absl-py==0.11.0
adal==1.2.7
anyio==3.1.0
argon2-cffi==20.1.0
async-generator==1.10
attrs==21.2.0
avro==1.11.0
azure-common==1.1.28
azure-storage-blob==2.1.0
azure-storage-common==2.1.0
Babel==2.9.1
backcall==0.2.0
bleach==3.3.0
blis==0.7.6
bokeh==2.3.2
brotlipy==0.7.0
cachetools==4.2.4
catalogue==2.0.6
certifi==2021.5.30
cffi @ file:///home/conda/feedstock_root/build_artifacts/cffi_1613413861439/work
chardet @ file:///home/conda/feedstock_root/build_artifacts/chardet_1610093490430/work
click==7.1.2
cloudevents==1.2.0
cloudpickle==2.2.1
colorama==0.4.4
conda==4.10.1
conda-package-handling @ file:///home/conda/feedstock_root/build_artifacts/conda-package-handling_1618231394280/work
configparser==5.2.0
cryptography @ file:///home/conda/feedstock_root/build_artifacts/cryptography_1616851476134/work
cycler==0.11.0
cymem==2.0.6
decorator==5.0.9
defusedxml==0.7.1
Deprecated==1.2.13
deprecation==2.1.0
dill==0.3.4
docstring-parser==0.13
entrypoints==0.3
fastai==2.4
fastcore==1.3.29
fastprogress==1.0.2
fire==0.4.0
gitdb==4.0.9
GitPython==3.1.27
google-api-core==2.7.1
google-api-python-client==1.12.10
google-auth==1.35.0
google-auth-httplib2==0.1.0
google-cloud-core==2.3.2
google-cloud-storage==2.7.0
google-crc32c==1.3.0
google-resumable-media==2.3.2
googleapis-common-protos==1.55.0
httplib2==0.20.4
idna @ file:///home/conda/feedstock_root/build_artifacts/idna_1593328102638/work
imageio==2.16.1
ipykernel==5.5.5
ipympl==0.7.0
ipython==7.24.1
ipython-genutils==0.2.0
ipywidgets==7.6.3
jedi==0.18.0
Jinja2==3.0.1
joblib==1.1.0
json5==0.9.5
jsonschema==3.2.0
jupyter-client==6.1.12
jupyter-core==4.7.1
jupyter-server==1.8.0
jupyter-server-mathjax==0.2.5
jupyterlab==3.0.16
jupyterlab-git==0.30.1
jupyterlab-pygments==0.1.2
jupyterlab-server==2.6.0
jupyterlab-widgets==1.0.2
kfp==2.0.0b9
kfp-pipeline-spec==0.1.16
kfp-server-api==2.0.0a6
kfserving==0.5.1
kiwisolver==1.3.2
kubernetes==12.0.1
langcodes==3.3.0
MarkupSafe==2.0.1
matplotlib==3.4.2
matplotlib-inline==0.1.2
minio==6.0.2
mistune==0.8.4
murmurhash==1.0.6
nbclassic==0.3.1
nbclient==0.5.3
nbconvert==6.0.7
nbdime==3.1.1
nbformat==5.1.3
nest-asyncio==1.5.1
networkx==2.7.1
notebook==6.4.0
numpy==1.20.3
oauthlib==3.2.0
packaging==20.9
pandas==1.2.4
pandocfilters==1.4.3
parso==0.8.2
pathy==0.6.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.0.1
preshed==3.0.6
prometheus-client==0.11.0
prompt-toolkit==3.0.18
protobuf==3.19.4
ptyprocess==0.7.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycosat @ file:///home/conda/feedstock_root/build_artifacts/pycosat_1610094800877/work
pycparser @ file:///home/conda/feedstock_root/build_artifacts/pycparser_1593275161868/work
pydantic==1.8.2
Pygments==2.9.0
PyJWT==2.3.0
pyOpenSSL @ file:///home/conda/feedstock_root/build_artifacts/pyopenssl_1608055815057/work
pyparsing==2.4.7
pyrsistent==0.17.3
PySocks @ file:///home/conda/feedstock_root/build_artifacts/pysocks_1610291447907/work
python-dateutil==2.8.1
pytz==2021.1
PyWavelets==1.2.0
PyYAML==5.4.1
pyzmq==22.1.0
requests @ file:///home/conda/feedstock_root/build_artifacts/requests_1608156231189/work
requests-oauthlib==1.3.1
requests-toolbelt==0.9.1
rsa==4.8
ruamel-yaml-conda @ file:///home/conda/feedstock_root/build_artifacts/ruamel_yaml_1611943339799/work
scikit-image==0.18.1
scikit-learn==0.24.2
scipy==1.7.0
seaborn==0.11.1
Send2Trash==1.5.0
six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
smart-open==5.2.1
smmap==5.0.0
sniffio==1.2.0
spacy==3.2.3
spacy-legacy==3.0.9
spacy-loggers==1.0.1
srsly==2.4.2
strip-hints==0.1.10
table-logger==0.3.6
tabulate==0.8.9
termcolor==1.1.0
terminado==0.10.0
testpath==0.5.0
thinc==8.0.13
threadpoolctl==3.1.0
tifffile==2022.2.9
torch==1.8.1+cpu
torchaudio==0.8.1
torchvision==0.9.1+cpu
tornado==6.1
tqdm @ file:///home/conda/feedstock_root/build_artifacts/tqdm_1621890532941/work
traitlets==5.0.5
typer==0.4.0
typing-extensions==3.10.0.0
uritemplate==3.0.1
urllib3 @ file:///home/conda/feedstock_root/build_artifacts/urllib3_1622056799390/work
wasabi==0.9.0
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.0.1
widgetsnbextension==3.5.2
wrapt==1.13.3
xgboost==1.4.2

The yaml:

# PIPELINE DEFINITION
# Name: addition-pipeline
# Inputs:
#    a: int [Default: 1.0]
#    b: int [Default: 2.0]
#    c: int [Default: 10.0]
components:
  comp-addition-component:
    executorLabel: exec-addition-component
    inputDefinitions:
      parameters:
        num1:
          parameterType: NUMBER_INTEGER
        num2:
          parameterType: NUMBER_INTEGER
    outputDefinitions:
      parameters:
        Output:
          parameterType: NUMBER_INTEGER
  comp-addition-component-2:
    executorLabel: exec-addition-component-2
    inputDefinitions:
      parameters:
        num1:
          parameterType: NUMBER_INTEGER
        num2:
          parameterType: NUMBER_INTEGER
    outputDefinitions:
      parameters:
        Output:
          parameterType: NUMBER_INTEGER
deploymentSpec:
  executors:
    exec-addition-component:
      container:
        args:
        - --executor_input
        - '{{$}}'
        - --function_to_execute
        - addition_component
        command:
        - sh
        - -c
        - "\nif ! [ -x \"$(command -v pip)\" ]; then\n    python3 -m ensurepip ||\
          \ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
          \ python3 -m pip install --quiet     --no-warn-script-location 'kfp==2.0.0-beta.9'\
          \ && \"$0\" \"$@\"\n"
        - sh
        - -ec
        - 'program_path=$(mktemp -d)

          printf "%s" "$0" > "$program_path/ephemeral_component.py"

          python3 -m kfp.components.executor_main                         --component_module_path                         "$program_path/ephemeral_component.py"                         "$@"

          '
        - "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
          \ *\n\ndef addition_component(num1: int, num2: int) -> int:\n  return num1\
          \ + num2\n\n"
        image: python:3.7
    exec-addition-component-2:
      container:
        args:
        - --executor_input
        - '{{$}}'
        - --function_to_execute
        - addition_component
        command:
        - sh
        - -c
        - "\nif ! [ -x \"$(command -v pip)\" ]; then\n    python3 -m ensurepip ||\
          \ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
          \ python3 -m pip install --quiet     --no-warn-script-location 'kfp==2.0.0-beta.9'\
          \ && \"$0\" \"$@\"\n"
        - sh
        - -ec
        - 'program_path=$(mktemp -d)

          printf "%s" "$0" > "$program_path/ephemeral_component.py"

          python3 -m kfp.components.executor_main                         --component_module_path                         "$program_path/ephemeral_component.py"                         "$@"

          '
        - "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
          \ *\n\ndef addition_component(num1: int, num2: int) -> int:\n  return num1\
          \ + num2\n\n"
        image: python:3.7
pipelineInfo:
  name: addition-pipeline
root:
  dag:
    tasks:
      addition-component:
        cachingOptions:
          enableCache: true
        componentRef:
          name: comp-addition-component
        inputs:
          parameters:
            num1:
              componentInputParameter: a
            num2:
              componentInputParameter: b
        taskInfo:
          name: addition-component
      addition-component-2:
        cachingOptions:
          enableCache: true
        componentRef:
          name: comp-addition-component-2
        dependentTasks:
        - addition-component
        inputs:
          parameters:
            num1:
              taskOutputParameter:
                outputParameterKey: Output
                producerTask: addition-component
            num2:
              componentInputParameter: c
        taskInfo:
          name: addition-component-2
  inputDefinitions:
    parameters:
      a:
        defaultValue: 1.0
        parameterType: NUMBER_INTEGER
      b:
        defaultValue: 2.0
        parameterType: NUMBER_INTEGER
      c:
        defaultValue: 10.0
        parameterType: NUMBER_INTEGER
schemaVersion: 2.1.0
sdkVersion: kfp-2.0.0-beta.9

connor-mccarthy commented 1 year ago

I am unable to reproduce this on later versions of the KFP backend (BE). Version 1.5 is a fairly old version of the KFP BE. I suggest you upgrade the BE and see if the issue continues.

You can find some upgrade instructions here (specific to Google Cloud).

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.