triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

failed to start Vertex AI service Invalid argument #6767

Closed · niraj06 closed 8 months ago

niraj06 commented 8 months ago

**Description**

I am trying to deploy a model to a Vertex AI endpoint. It works well when tested locally, but fails on deployment; it's unclear whether the issue is with Vertex AI or Triton. Any help is much appreciated!

**Triton Information**

23.12-pyt-python-py3; we faced the same issue with the 23.10 and later `-pyt-python-py3` images.

Are you using the Triton container or did you build it yourself? Using the Triton container.

**To Reproduce**

Deploy the Triton Inference Server to a Vertex AI endpoint.

Here is the command run locally:

```
docker run -t -p 8000:8000 --rm --name=our_model_triton_server \
  -e AIP_MODE=True \
  artifact-registry/triton-2312/tritonserver:latest \
  --model-repository=gs://project-id/bucket/path/to/model_artifacts_repository \
  --log-verbose 1
```
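For context, the server log below shows the repository holds three models (an ensemble entry point plus its two steps), so the layout is roughly the following sketch. The model names come from the log; the per-model files are assumptions based on the pytorch and python backends listed there:

```
model_artifacts_repository/
├── 0_transformworkflowtriton/   # preprocessing step (backend assumed from log)
│   ├── config.pbtxt
│   └── 1/
├── 1_predictpytorchtriton/      # pytorch inference step (assumed)
│   ├── config.pbtxt
│   └── 1/
└── our_model/                   # entry-point model (assumed)
    ├── config.pbtxt
    └── 1/
```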

Error logs while deploying to the Vertex AI endpoint:

```
I0104 21:30:38.681034 1 python_be.cc:2384] TRITONBACKEND_ModelInstanceInitialize: instance initialization successful nba_model_0_0 (device 0)
I0104 21:30:38.681204 1 backend_model_instance.cc:772] Starting backend thread for our_model_0_0 at nice 0 on device 0...
I0104 21:30:38.681680 1 model_lifecycle.cc:818] successfully loaded 'our_model'
I0104 21:30:38.681905 1 server.cc:606]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0104 21:30:38.682178 1 server.cc:633]
+---------+---------------------------------------------------------+--------+
| Backend | Path                                                    | Config |
+---------+---------------------------------------------------------+--------+
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
| python  | /opt/tritonserver/backends/python/libtriton_python.so   | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
+---------+---------------------------------------------------------+--------+

I0104 21:30:38.682272 1 server.cc:676]
+---------------------------+---------+--------+
| Model                     | Version | Status |
+---------------------------+---------+--------+
| 0_transformworkflowtriton | 1       | READY  |
| 1_predictpytorchtriton    | 1       | READY  |
| our_model                 | 1       | READY  |
+---------------------------+---------+--------+

I0104 21:30:38.682520 1 metrics.cc:710] Collecting CPU metrics
I0104 21:30:38.682761 1 tritonserver.cc:2483]
+----------------------------------+-------+
| Option                           | Value |
+----------------------------------+-------+
| server_id                        | triton |
| server_version                   | 2.41.0 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging |
| model_repository_path[0]         | gs://ingka-ce-pco-ml-dev/models/next-best-action-markets-rand/297762669741/next-best-action-pipeline-20240104114638/train_9150680574363435008/ensemble_artifacts_repository |
| model_control_mode               | MODE_NONE |
| strict_model_config              | 0 |
| rate_limit                       | OFF |
| pinned_memory_pool_byte_size     | 268435456 |
| min_supported_compute_capability | 6.0 |
| strict_readiness                 | 1 |
| exit_timeout                     | 30 |
| cache_enabled                    | 0 |
+----------------------------------+-------+

ERROR 2024-01-04T22:12:28.091251850Z {"levelname":"ERROR", "logTag":"F"}
ERROR 2024-01-04T22:12:28.091282367Z I0104 22:12:28.091031 1 api.cc:381] Using credential for path gs://ingka-ce-pco-ml-dev/models/next-best-action-markets-rand/297762669741/next-best-action-pipeline-20240104114638/train_9150680574363435008/ensemble_artifacts_repository
ERROR 2024-01-04T22:12:28.348005294Z I0104 22:12:28.347735 1 model_lifecycle.cc:265] ModelStates()
ERROR 2024-01-04T22:12:28.405848979Z E0104 22:12:28.347854 1 main.cc:282] failed to start Vertex AI service: Invalid argument - Expect the model repository contains only a single model if default model is not specified
```
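The last line is the key one: Triton's Vertex AI frontend serves a single prediction route, so when the repository contains more than one model it needs to be told which one is the default. One way to do that is Triton's `--vertex-ai-default-model` flag; the sketch below assumes `our_model` is the intended entry point (a guess based on the log, and check `tritonserver --help` in your image for the exact option set):

```
tritonserver \
  --model-repository=gs://.../ensemble_artifacts_repository \  # full gs:// path from the log above
  --allow-vertex-ai=true \
  --vertex-ai-default-model=our_model \
  --log-verbose 1
```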

niraj06 commented 8 months ago

Fixed it by passing these arguments: `--allow-http=True --allow-vertex-ai=True`
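For anyone hitting the same error, the local command from above with these arguments applied would look roughly like this (an untested sketch; when deploying to Vertex AI itself, the same arguments go into the serving container's args rather than a docker run):

```
docker run -t -p 8000:8000 --rm --name=our_model_triton_server \
  -e AIP_MODE=True \
  artifact-registry/triton-2312/tritonserver:latest \
  --model-repository=gs://project-id/bucket/path/to/model_artifacts_repository \
  --allow-http=True --allow-vertex-ai=True \
  --log-verbose 1
```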