triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

README.md steps not working for me #4829

Closed jesuino closed 1 year ago

jesuino commented 1 year ago

Description In the README's "Serve a Model in 3 Easy Steps" section, step 3 fails for me with the following output:

root@localhost:/workspace# /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
expecting input to have 3 dimensions, model 'densenet_onnx' input has 4

From Wireshark I can see that the inference request is never actually sent; it looks like the client first validates the request against the model configuration, which it fetches with a GET request.
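
As a side note, the configuration the server reports for a model can also be queried over HTTP directly, which makes it easy to compare against what the client expects (a rough sketch, assuming the server is running locally on the default HTTP port 8000):

# Ask the running server for the configuration it is serving for densenet_onnx
# (adjust host/port if you changed them from the defaults)
curl localhost:8000/v2/models/densenet_onnx/config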

I also tried the other sample model, but it looks like the image is not sent either. This is the result I get:

root@localhost:/workspace# /workspace/install/bin/image_client -m inception_graphdef -c 3 -s INCEPTION /workspace/images/mug.jpg
expecting 1 input, got 0

Using Wireshark I captured the JSON that was exchanged, see [1] below.

Triton Information

The same version used in the README.

To Reproduce

Follow the steps from the README.

Expected behavior The client should return the classification results for the mug image.

[1]

{
   "name":"inception_graphdef",
   "platform":"tensorflow_graphdef",
   "backend":"tensorflow",
   "version_policy":{
      "latest":{
         "num_versions":1
      }
   },
   "max_batch_size":0,
   "input":[

   ],
   "output":[

   ],
   "batch_input":[

   ],
   "batch_output":[

   ],
   "optimization":{
      "priority":"PRIORITY_DEFAULT",
      "input_pinned_memory":{
         "enable":true
      },
      "output_pinned_memory":{
         "enable":true
      },
      "gather_kernel_buffer_threshold":0,
      "eager_batching":false
   },
   "instance_group":[
      {
         "name":"inception_graphdef",
         "kind":"KIND_CPU",
         "count":2,
         "gpus":[

         ],
         "secondary_devices":[

         ],
         "profile":[

         ],
         "passive":false,
         "host_policy":""
      }
   ],
   "default_model_filename":"model.graphdef",
   "cc_model_filenames":{

   },
   "metric_tags":{

   },
   "parameters":{

   },
   "model_warmup":[

   ]
}
krishung5 commented 1 year ago

Hi @jesuino, I went through all the steps described in the README "Serve a Model in 3 Easy Steps" section and was not able to reproduce this issue. From the JSON you captured with Wireshark, it seems like the model configuration is different from the one we provide here. Could you confirm that the config files for the models are correct? Could you also provide the full server log by running the server with the flag --log-verbose=1?
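
For example, something along these lines (illustrative only; the paths assume the model repository created in step 1 of the README, so adjust them to your setup):

# Inspect the config files the server is loading; they should match the ones
# under docs/examples/model_repository in the server repo
cat model_repository/densenet_onnx/config.pbtxt
cat model_repository/inception_graphdef/config.pbtxt

# Relaunch the server (step 2 command) with verbose logging appended,
# so each incoming request shows up in the log
tritonserver --model-repository=/models --log-verbose=1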

rnwang04 commented 1 year ago

Hi @krishung5, I am hitting the same problem here. Step 3 fails for me with the same output (see the attached screenshot), and the Triton server seems to be working but receives no request (see the second screenshot). How can I fix this? Thanks!

krishung5 commented 1 year ago

Hi @rnwang04, what model are you using and how are you sending the request? Are you using our client with the same command /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg as in the example?

Also, could you confirm that the models loaded successfully? If so, you should see something like this in the server log:

I0927 18:18:49.411099 1 server.cc:629] 
+----------------------+---------+--------+
| Model                | Version | Status |
+----------------------+---------+--------+
| densenet_onnx        | 1       | READY  |
| inception_graphdef   | 1       | READY  |
| simple               | 1       | READY  |
| simple_dyna_sequence | 1       | READY  |
| simple_identity      | 1       | READY  |
| simple_int8          | 1       | READY  |
| simple_sequence      | 1       | READY  |
| simple_string        | 1       | READY  |
+----------------------+---------+--------+
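
You can also check an individual model's readiness over HTTP (a quick sketch, assuming the default HTTP port 8000 on the same machine):

# Should return HTTP 200 once the model is loaded and ready to serve requests
curl -v localhost:8000/v2/models/densenet_onnx/ready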
rnwang04 commented 1 year ago

Hi @krishung5, I just followed "Serve a Model in 3 Easy Steps" and ran the following commands:

# Step 1: Create the example model repository 
git clone -b r22.07 https://github.com/triton-inference-server/server.git
cd server/docs/examples
./fetch_models.sh

# Step 2: Launch triton from the NGC Triton container
sudo docker run --rm --net=host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:22.07-py3 tritonserver --model-repository=/models

# Step 3: Sending an Inference Request 
# In a separate console, launch the image_client example from the NGC Triton SDK container
sudo docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:22.07-py3-sdk
/workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg

I can confirm that the models loaded successfully, as I get the same READY table in the server log (see screenshot). I have also verified that Triton is running correctly using the readiness check from the quickstart, shown below (second screenshot), but I still get the error above.
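
The check I used is roughly the readiness probe from the quickstart (assuming the default HTTP port 8000), which returns an HTTP 200 for me:

curl -v localhost:8000/v2/health/ready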

krishung5 commented 1 year ago

Thanks for confirming, @rnwang04. I followed the exact same commands you shared but still could not reproduce this issue. I've filed a ticket for the team to investigate this further.

dyastremsky commented 1 year ago

Closing this issue and related ticket due to inactivity. The README has also been updated a few times. Please let us know if you are still seeing this issue.