State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
scripts/docker/launch_client.sh hangs in TritonASRClient constructor after outputting "Opening GRPC contextes..." #1023
Related to Integrating NVIDIA Triton Inference Server with Kaldi ASR
Describe the bug
Following the instructions outlined in https://developer.nvidia.com/blog/integrating-nvidia-triton-inference-server-with-kaldi-asr/, we are able to successfully launch the Triton server with DeepLearningExamples/Kaldi/SpeechRecognition/scripts/docker/launch_server.sh. However, when launching the client via DeepLearningExamples/Kaldi/SpeechRecognition/scripts/docker/launch_client.sh, the client prints "Opening GRPC contextes..." and then hangs, without printing "done" or "Streaming utterances...".
The server starts up without a problem, loads the kaldi online model, and outputs "Starting Metrics Service at 0.0.0.0:8002".
However, the client hangs after printing the line "Opening GRPC contextes...". We never see the "done" for that step, nor the next expected output, "Streaming utterances...".
The client appears to be hanging at kaldi-asr-client/kaldi_asr_parallel_client.cc line 273, in the call to the TritonASRClient constructor.
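Before digging into the client itself, it may be worth ruling out basic network reachability, i.e. checking from inside the client container whether the server's ports accept TCP connections at all. This is a generic sketch using only the Python standard library; the host name and the default Triton port numbers are assumptions to adjust for your setup:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Triton's default ports: 8000 (HTTP), 8001 (gRPC), 8002 (metrics).
for name, port in [("HTTP", 8000), ("gRPC", 8001), ("metrics", 8002)]:
    status = "reachable" if port_open("localhost", port) else "unreachable"
    print(f"{name} ({port}): {status}")
```

If the gRPC port (8001 by default) is unreachable from where the client runs, the hang is a connectivity problem rather than a bug in the client code.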
Expected behavior
The client sends 1000 parallel streams to the server, printing the inferred text sent back from the server.
Environment
Container version:
nvcr.io/nvidia/tritonserver:21.05-py3
nvcr.io/nvidia/tritonserver:21.05-py3-sdk
triton_kaldi_client:latest
Variables set in Dockerfile.client:
ARG TRITONSERVER_IMAGE=nvcr.io/nvidia/tritonserver:21.05-py3
ARG KALDI_IMAGE=nvcr.io/nvidia/kaldi:21.08-py3
ARG PYTHON_VER=3.8
GPUs in the system: 1x NVIDIA A10 Tensor Core GPU
CUDA driver version: NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4
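One detail worth double-checking in the environment above: the Kaldi image tag (21.08) does not match the Triton server and SDK tags (21.05). Whether that mismatch is related to the hang is not established here; the snippet below merely illustrates extracting and comparing the NGC release numbers from the tags as listed:

```python
import re

# Container tags as listed in this report.
tags = [
    "nvcr.io/nvidia/tritonserver:21.05-py3",
    "nvcr.io/nvidia/tritonserver:21.05-py3-sdk",
    "nvcr.io/nvidia/kaldi:21.08-py3",
]

def release(tag):
    """Extract the NGC release (e.g. '21.05') from an image tag."""
    m = re.search(r":(\d{2}\.\d{2})", tag)
    return m.group(1) if m else None

releases = {t: release(t) for t in tags}
print(releases)

# NGC containers are generally published in matched monthly releases,
# so mixing 21.05 and 21.08 images is at least worth flagging.
if len(set(releases.values())) > 1:
    print("Warning: mixed NGC releases:", sorted(set(releases.values())))
```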
To Reproduce
Steps to reproduce the behavior: follow the steps outlined in https://developer.nvidia.com/blog/integrating-nvidia-triton-inference-server-with-kaldi-asr/, launching the server and then the client scripts as described above.