triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Poll failed for model directory 'diabetes_model': Invalid model name: Could not determine backend for model 'diabetes_model' with no backend in model configuration. Expected model name of the form 'model.<backend_name>' #7336

Closed Manishthakur2503 closed 4 months ago

Manishthakur2503 commented 4 months ago

Description: I have created an ONNX model and am deploying it on the Triton server.

Triton Information: 24.05-py3

Are you using the Triton container or did you build it yourself? Using the Triton container.

To Reproduce
This is my config.pbtxt, located in the directory "C:\Projects\ML\NvidiaTriton\model_repository\diabetes_model":

```
name: "diabetes_model"
backend: "onnxruntime"
max_batch_size: 0
input [
  {
    name: "float_input"
    data_type: TYPE_FP32
    dims: [ 8 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1 ]
  }
]
```

and my model.onnx is inside this directory "C:\Projects\ML\NvidiaTriton\model_repository\diabetes_model\1"
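Put together, the layout described above looks like this on disk:

```
C:\Projects\ML\NvidiaTriton\model_repository\
└── diabetes_model\
    ├── config.pbtxt
    └── 1\
        └── model.onnx
```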

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

The curl command I am using to train the data is:

```
curl --location 'http://127.0.0.1:5001/train' \
  --form 'train_file=@"/C:/Projects/ML/diabetes.csv"' \
  --form 'tol="0.98"' \
  --form 'iterations="50"' \
  --form 'exp_name="Diabetes_Experiment"'
```

and the CSV file headers are: Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome

The command I am using is:

```
docker run --rm --name tritonserver -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /mnt/host/c/Projects/ML/NvidiaTriton/model_repository:/models \
  nvcr.io/nvidia/tritonserver:23.01-py3 tritonserver --model-repository=/models
```

Full error:

```
docker run --rm --name tritonserver -p 8000:8000 -p 8001:8001 -p 8002:8002 -v /mnt/host/c/Projects/ML/NvidiaTriton/model_repository:/models nvcr.io/nvidia/tritonserver:23.01-py3 tritonserver --model-repository=/models

=============================
== Triton Inference Server ==
=============================

NVIDIA Release 23.01 (build 52277748)
Triton Server Version 2.30.0

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
   Use the NVIDIA Container Toolkit to start this container with GPU support; see
   https://docs.nvidia.com/datacenter/cloud-native/ .

W0610 10:10:12.935475 1 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version
I0610 10:10:12.937832 1 cuda_memory_manager.cc:115] CUDA memory pool disabled
E0610 10:10:12.942047 1 model_repository_manager.cc:1004] Poll failed for model directory '1': Invalid model name: Could not determine backend for model '1' with no backend in model configuration. Expected model name of the form 'model.<backend_name>'.
E0610 10:10:12.942107 1 model_repository_manager.cc:1004] Poll failed for model directory 'diabetes_model': Invalid model name: Could not determine backend for model 'diabetes_model' with no backend in model configuration. Expected model name of the form 'model.<backend_name>'.
I0610 10:10:12.942142 1 server.cc:563]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0610 10:10:12.942153 1 server.cc:590]
+---------+------+--------+
| Backend | Path | Config |
+---------+------+--------+
+---------+------+--------+

I0610 10:10:12.942164 1 server.cc:633]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I0610 10:10:12.942399 1 metrics.cc:757] Collecting CPU metrics
I0610 10:10:12.943057 1 tritonserver.cc:2264]
+----------------------------------+------------------------------+
| Option                           | Value                        |
+----------------------------------+------------------------------+
| server_id                        | triton                       |
| server_version                   | 2.30.0                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0]         | /models                      |
| model_control_mode               | MODE_NONE                    |
| strict_model_config              | 0                            |
| rate_limit                       | OFF                          |
| pinned_memory_pool_byte_size     | 268435456                    |
| response_cache_byte_size         | 0                            |
| min_supported_compute_capability | 6.0                          |
| strict_readiness                 | 1                            |
| exit_timeout                     | 30                           |
+----------------------------------+------------------------------+

I0610 10:10:12.943089 1 server.cc:264] Waiting for in-flight requests to complete.
I0610 10:10:12.943094 1 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences
I0610 10:10:12.943097 1 server.cc:295] All models are stopped, unloading models
I0610 10:10:12.943101 1 server.cc:302] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
```

Manishthakur2503 commented 4 months ago

I tried to print basic information about the model using `print(onnx.helper.printable_graph(onnx_model.graph))` and got this:

```
graph model-onnx (
  %float_input[FLOAT, ?x8]
) {
  %label, %probabilities = LinearClassifier<classlabels_ints = [0, 1], coefficients = [-1.15231204032898, -3.61196088790894, 0.323476761579514, -0.234493508934975, -0.0497973449528217, -2.38965225219727, -1.12923967838287, -0.840260863304138, 1.15231204032898, 3.61196088790894, -0.323476761579514, 0.234493508934975, 0.0497973449528217, 2.38965225219727, 1.12923967838287, 0.840260863304138], intercepts = [4.81376791000366, -4.81376791000366], multi_class = 0, post_transform = 'LOGISTIC'>
  %output_label = Cast<to = 7>
  %output_probability = ZipMap<classlabels_int64s = [0, 1]>
  return %output_label, %output_probability
}
```
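For reference, a self-contained version of that inspection snippet (using the model path from the repository layout described above):

```python
import onnx

# Load the exported model from the version directory and print a readable
# summary of its graph, inputs, and outputs.
onnx_model = onnx.load(r"C:\Projects\ML\NvidiaTriton\model_repository\diabetes_model\1\model.onnx")
print(onnx.helper.printable_graph(onnx_model.graph))
```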

So is there anything I am missing in the config file?

indrajit96 commented 4 months ago

Hello @Manishthakur2503, thanks for reaching out. This looks like a model repository error. Is your model repository accessible inside the container? Is the structure of your model repository as follows?

```
<model-repository-path>/
  <model-name>/
    config.pbtxt
    <version>/
      model.onnx
```

Before you run

```
docker run --rm --name tritonserver -p 8000:8000 -p 8001:8001 -p 8002:8002 -v /mnt/host/c/Projects/ML/NvidiaTriton/model_repository:/models nvcr.io/nvidia/tritonserver:23.01-py3 tritonserver --model-repository=/models
```

can you check that /models is as expected in the container? Have a look at this tutorial, which also loads an ONNX model the way you are trying to: https://github.com/triton-inference-server/server/blob/main/docs/getting_started/quickstart.md

Thanks,
Indrajit
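One minimal way to perform that check is to reuse the same bind mount but override the entrypoint so the container simply lists the mounted directory instead of starting Triton (a sketch using standard Docker flags, not a command from the thread):

```
docker run --rm --entrypoint ls \
  -v /mnt/host/c/Projects/ML/NvidiaTriton/model_repository:/models \
  nvcr.io/nvidia/tritonserver:23.01-py3 -R /models
```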
Manishthakur2503 commented 4 months ago

Hi @indrajit96

Thanks for your response.

I checked the model repository structure, and it was correct. However, the issue was with the input dimensions in the config file, which was causing the error. I have fixed it now, and everything is working fine.

Thanks, Manish Thakur
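For anyone hitting the same error: the exact config change is not posted above, but given the `[FLOAT, ?x8]` input shown in the graph dump, an input block describing the full two-dimensional shape would look like the following sketch (with max_batch_size set to 0, dims give the complete tensor shape and -1 marks a variable dimension):

```
input [
  {
    name: "float_input"
    data_type: TYPE_FP32
    dims: [ -1, 8 ]   # full shape when max_batch_size is 0; -1 matches the '?' dimension
  }
]
```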