googleapis / python-aiplatform

A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
Apache License 2.0

`grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC` #2177

Open gomrinal opened 1 year ago

gomrinal commented 1 year ago

I am trying to use this code to submit a training job to Vertex AI. However, I am seeing the error mentioned in the title when there are 200 pipelines sending a message every second.

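from google.cloud.aiplatform import pipeline_jobs

# pipeline_name, TIMESTAMP and service_account are defined elsewhere in the function.
job = pipeline_jobs.PipelineJob(
    display_name="some display",
    template_path="pipeline.json",
    job_id="{}-{}".format(pipeline_name, TIMESTAMP),
)

job.submit(service_account=service_account)
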
matthew29tang commented 1 year ago

Could you provide your pipeline.json file? The error is fairly generic and can occur for many reasons. If you have any additional context, it would also be helpful for debugging this.

gomrinal commented 1 year ago

pipeline.json is a compiled pipeline for a simple hello world. I hit this issue when I load tested the system, in which a Cloud Function (v2) consumes messages from a Pub/Sub topic.

The Cloud Function triggers the Vertex AI pipeline training job.
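
The thread has not established why the call fails under load, so the following is only a sketch under the assumption that the failures are transient. It wraps an arbitrary call in exponential backoff with jitter. `call_with_backoff`, `is_retryable` and the demo function `flaky_submit` are hypothetical names used for illustration, not part of the SDK.

```python
import random
import time


def call_with_backoff(fn, is_retryable, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call fn(); on a retryable exception wait and try again, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors that should not be retried.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


# Demo with a hypothetical stand-in for job.submit(...) that fails twice.
calls = {"count": 0}
sleeps = []


def flaky_submit():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure (hypothetical)")
    return "submitted"


result = call_with_backoff(
    flaky_submit,
    is_retryable=lambda exc: isinstance(exc, ConnectionError),
    sleep=sleeps.append,  # record delays instead of actually sleeping
)
print(result, calls["count"], len(sleeps))
```

In the Cloud Function, `fn` could be `lambda: job.submit(service_account=service_account)`. Which exception types are safe to retry is an assumption to verify against the real error; `google.api_core.exceptions.ServiceUnavailable` and `ResourceExhausted` are plausible candidates.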

matthew29tang commented 1 year ago

The grpc._channel._InactiveRpcError is often just a wrapper around a more descriptive error (e.g. in this issue it is caused by a 500 internal error). Can you paste the exception that accompanies the grpc._channel._InactiveRpcError?
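
One way to capture the full error is to log every exception in the chain rather than only the outermost one. This is a minimal sketch that assumes the more descriptive error and the gRPC error are linked through Python's standard `__cause__`/`__context__` attributes. `exception_chain`, `describe` and `fake_submit` are hypothetical helpers, and the error messages are made up for the demo.

```python
def exception_chain(exc):
    """Return the exception and everything it was raised from, outermost first."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        # Prefer the explicit cause ("raise ... from ..."), else the implicit context.
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return chain


def describe(exc):
    """Format each link of the chain as 'TypeName: message', one per line."""
    return "\n".join(f"{type(e).__name__}: {e}" for e in exception_chain(exc))


# Hypothetical stand-in for job.submit(...): a failure with a chained cause.
def fake_submit():
    try:
        raise ConnectionError("status = INTERNAL (hypothetical detail)")
    except ConnectionError as inner:
        raise RuntimeError("500 Internal error (hypothetical)") from inner


try:
    fake_submit()
except Exception as err:
    report = describe(err)
    print(report)
```

Wrapping the real `job.submit(...)` call in the same `try`/`except` and logging `describe(err)` should surface both the remapped error and the underlying gRPC status, if they are chained as assumed.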

gomrinal commented 1 year ago

Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/google/api_core/grpc_helpers.py", line 72, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/grpc/_channel.py", line 1030, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/grpc/_channel.py", line 910, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:

matthew29tang commented 1 year ago

The error that you pasted seems to have been truncated. Can you paste the rest of the error, i.e. everything that follows the line starting with grpc._channel._InactiveRpcError:?