kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

issue with google_cloud_pipeline_components 0.1.9 ModelUploadOp ("invalid value") #6848

Closed amygdala closed 4 months ago

amygdala commented 3 years ago

With v0.1.9 of ModelUploadOp I'm seeing the following error. I confirmed that things work fine in 0.1.7.

PORT is set as follows:

PORT = 8080

Then, in the pipeline definition, ModelUploadOp is configured as follows:

    model_upload_op = gcc_aip.ModelUploadOp(
        project=project,
        display_name=model_display_name,
        serving_container_image_uri=build_image_task.outputs['serving_container_uri'],
        serving_container_predict_route="/predictions/{}".format(MAR_MODEL_NAME),
        serving_container_health_route="/ping",
        serving_container_ports=[PORT],
    )

This is the error I see (again, I do NOT see this problem with v0.1.7):

2021-11-01 13:01:17.976 PDT RuntimeError: Failed to create the resource. Error: {'code': 400, 'message': "Invalid value at 'model.container_spec.ports[0]' (type.googleapis.com/google.cloud.aiplatform.v1.Port), 8080", 'status': 'INVALID_ARGUMENT', 'details': [{'@type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'model.container_spec.ports[0]', 'description': "Invalid value at 'model.container_spec.ports[0]' (type.googleapis.com/google.cloud.aiplatform.v1.Port), 8080"}]}]}

I wondered if something has changed about the expected format of this arg, but from the documentation (serving_container_ports (Optional[Sequence[int]]=None)) it doesn't seem so. Again, this code runs fine with 0.1.7.

amygdala commented 3 years ago

Here's what the set of input args looks like in the console: https://screenshot.googleplex.com/HdYMJLiz8oVhxjB /cc @SinaChavoshi @IronPan

jagadeeshi2i commented 3 years ago

@amygdala try configuring the port as follows: serving_container_ports=[{"containerPort": PORT}]. The component follows the k8s v1/core spec. It worked for me with the above config.
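
For anyone migrating existing code, here is a minimal sketch of wrapping plain integer ports in the dict form described above. The helper name to_container_ports is hypothetical and not part of google_cloud_pipeline_components; it only assumes the component accepts a list of {"containerPort": <int>} entries, as reported in this thread.

```python
from typing import Dict, List, Sequence


def to_container_ports(ports: Sequence[int]) -> List[Dict[str, int]]:
    """Wrap each integer port in a {"containerPort": port} dict.

    Hypothetical helper for illustration; the target shape is the one
    reported to work in this thread, not a documented guarantee.
    """
    return [{"containerPort": int(port)} for port in ports]


PORT = 8080
serving_container_ports = to_container_ports([PORT])
print(serving_container_ports)  # [{'containerPort': 8080}]
```

The result can then be passed as serving_container_ports=serving_container_ports in the ModelUploadOp call.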

amygdala commented 3 years ago

Interesting -- the api spec says: serving_container_ports (Optional[Sequence[int]]=None): but maybe the docs need to be updated.

LeonardoEssence commented 2 years ago

We spent hours trying to find an answer to this. We need better alignment between the documentation here https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-0.2.0/google_cloud_pipeline_components.aiplatform.html and here https://cloud.google.com/vertex-ai/docs/predictions/use-custom-container

andodet commented 2 years ago

Stumbled upon this issue after more time than I'd like to publicly admit. Passing the port argument as @jagadeeshi2i mentioned did the trick for me.

I am now struggling to set serving_container_environment_variables, as I think it might be suffering from the same problem. I am currently passing env vars to the container in the following way (the docs list it as an Optional[Dict[str, str]]):

model_upload_op = gcc_aip.ModelUploadOp(
    project="and-reporting",
    location="us-west1",
    display_name="session_model",
    serving_container_image_uri="gcr.io/and-reporting/pred:latest",
    # The following is causing trouble...
    serving_container_environment_variables={"MODEL_BUCKET": "ml_session_model/model"},
    serving_container_ports=[{"containerPort": 5000}],
    serving_container_predict_route="/predict",
    serving_container_health_route="/health",
)

Which produces the following error:

RuntimeError: Failed to create the resource. Error: {'code': 400, 'message': 'Invalid JSON payload received. Unknown name "MODEL_BUCKET" at \'model.container_spec.env[0]\': Cannot find field.', 'status': 'INVALID_ARGUMENT', 'details': [{'@type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'model.container_spec.env[0]', 'description': 'Invalid JSON payload received. Unknown name "MODEL_BUCKET" at \'model.container_spec.env[0]\': Cannot find field.'}]}]}

I think ModelUploadOp might have been left out from the refresh in #5481.

Solution

Can confirm the serving_container_environment_variables documentation is outdated. Setting it according to the Kubernetes docs solved it for me:

serving_container_environment_variables=[
    {"name": "MODEL_BUCKET", "value": "ml_session_model/model"}
],
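
If the variables already live in a plain dict (the shape the docs describe), a small conversion avoids rewriting them by hand. This is only a sketch: to_container_env is a hypothetical helper, not part of google_cloud_pipeline_components, and it assumes the list-of-{"name", "value"} shape that worked above.

```python
from typing import Dict, List, Mapping


def to_container_env(env: Mapping[str, str]) -> List[Dict[str, str]]:
    """Convert {"KEY": "value"} into [{"name": "KEY", "value": "value"}].

    Hypothetical helper for illustration; the target shape follows the
    Kubernetes-style env entries that worked in this thread.
    """
    return [{"name": name, "value": value} for name, value in env.items()]


env_vars = {"MODEL_BUCKET": "ml_session_model/model"}
serving_container_environment_variables = to_container_env(env_vars)
print(serving_container_environment_variables)
# [{'name': 'MODEL_BUCKET', 'value': 'ml_session_model/model'}]
```

The converted list is what gets passed as serving_container_environment_variables in the ModelUploadOp call.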

github-actions[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 4 months ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.