Closed murphypei closed 2 years ago
Do all the test audio files cause the same error?
Could you check the server side and share its log or screen output? I'm not sure whether the decoder has been set up correctly.
Please also provide a test audio file if that is possible.
- Yes.
- I also think that is quite possible. Could you take a look at my server log? Thanks. https://drive.google.com/file/d/1_arhDIkYAM3ZjND0ClhWuFRjWjMZFh9n/view?usp=sharing
Could you trace the error back as early as possible? For example, print the inputs to the self.batch_rescoring call and check whether score_hyps is empty.
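One way to surface the failure early is to check each input just before the `self.batch_rescoring` call and stop with a descriptive message if one of them is empty. The sketch below is only illustrative: `is_empty` and `require_non_empty` are hypothetical helpers, not part of WeNet or Triton, and they assume each input is either a sized Python container or an array-like object exposing a `shape` tuple.

```python
def is_empty(value):
    """Return True if value is None, has a zero-sized dimension, or has length 0."""
    if value is None:
        return True
    shape = getattr(value, "shape", None)
    if shape is not None:
        # Array-like object: empty when any dimension is 0.
        return any(dim == 0 for dim in shape)
    try:
        return len(value) == 0
    except TypeError:
        # Scalars such as max_length have no length; treat them as non-empty.
        return False


def require_non_empty(**named_inputs):
    """Raise ValueError naming every empty input; do nothing otherwise."""
    empty = [name for name, value in named_inputs.items() if is_empty(value)]
    if empty:
        raise ValueError("empty input(s): " + ", ".join(empty))
```

Calling `require_non_empty(rescore_hyps=rescore_hyps, rescore_encoder_hist=rescore_encoder_hist)` right before the rescoring call would, under these assumptions, report the offending input by name instead of failing later in the pipeline.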
Thanks for your reply. score_hyps has data. Print code:

```python
print("=> rescore_hyps:\n", rescore_hyps)
print("=> rescore_encoder_hist:\n", rescore_encoder_hist)
print("=> rescore_encoder_lens:\n", rescore_encoder_lens)
print("=> max_length:\n", max_length)
best_index = self.batch_rescoring(rescore_hyps, rescore_encoder_hist,
                                  rescore_encoder_lens, max_length)
```
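Dumping full tensors can bury the relevant detail in a long server log. A compact alternative is to log only the type, shape or length, and a short preview of each input. This is a minimal sketch: `summarize` is a hypothetical helper, not a WeNet or Triton function, and it assumes array-like inputs expose a `shape` attribute.

```python
import itertools


def summarize(name, value, max_items=3):
    """Return a one-line description: type, shape or length, and a short preview."""
    type_name = type(value).__name__
    shape = getattr(value, "shape", None)
    if shape is not None:
        # Array-like input: the shape is usually enough to spot an empty tensor.
        return f"{name}: {type_name} shape={tuple(shape)}"
    try:
        length = len(value)
    except TypeError:
        # Scalar input such as max_length.
        return f"{name}: {type_name} value={value!r}"
    preview = list(itertools.islice(value, max_items))
    return f"{name}: {type_name} len={length} first={preview!r}"
```

For example, `print(summarize("rescore_hyps", rescore_hyps))` emits a single line per input, which keeps the log readable when the same check runs for every request.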
Thanks. Could you give me the minimal essentials needed to reproduce the issue? For example, your Dockerfile and model_repo configs (if you modified any of them). Including one test audio file would be even better.
OK, I will prepare these data today, thank you.
@yuekaizhang I reproduced this problem with these steps:

- Build and run the Dockerfile (both server and client in one container):

```dockerfile
FROM nvcr.io/nvidia/tritonserver:22.03-py3
LABEL maintainer="NVIDIA"
LABEL repository="tritonserver"

# fix public key error
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
RUN dpkg -i cuda-keyring_1.0-1_all.deb

RUN apt-get update
RUN apt-get install -fy cmake make swig
RUN pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
RUN pip3 install -v kaldifeat
RUN pip3 install pyyaml onnx onnxruntime-gpu typeguard

# for client
RUN apt-get install -y libsndfile1
RUN pip3 install soundfile grpcio-tools tritonclient

WORKDIR /workspace
RUN git clone https://github.com/Slyne/ctc_decoder.git && cd ctc_decoder/swig && bash setup.sh
COPY ./scripts scripts
```

Run the Docker container:

```sh
docker run --name wenet-triton --gpus '"device=3"' -it --shm-size=1g --ulimit memlock=-1 -v <wenet-dir>:/ws/wenet -v <20210618_u2pp_conformer_exp dir>:/ws/model wenet-tritonserver:22.03-py3 /bin/bash
```

The 20210618_u2pp_conformer_exp model was downloaded from the wenet GitHub repo.

- Inside the container:

```sh
export model_dir=/ws/model
export PYTHONPATH=$PYTHONPATH:/ws/wenet
```

Export the ONNX model:

```sh
cd /ws/wenet/
python3 wenet/bin/export_onnx_gpu.py --config=$model_dir/train.yaml --checkpoint=$model_dir/final.pt --cmvn_file=$model_dir/global_cmvn --ctc_weight=0.3 --reverse_weight=0.3 --output_onnx_dir=$model_dir/onnx_gpu --streaming
```

Convert the config files:

```sh
cd /ws/wenet/runtime/GPU
python3 scripts/convert.py --config=$model_dir/train.yaml --vocab=$model_dir/words.txt --model_repo=/ws/wenet/runtime/GPU/model_repo_stateful/ --onnx_model_dir=$model_dir/onnx_gpu
```

Start the server:

```sh
cd /ws/wenet/runtime/GPU
tritonserver --model-repository=/ws/wenet/runtime/GPU/model_repo_stateful/ --pinned-memory-pool-byte-size=1024000000 --cuda-memory-pool-byte-size=0:1024000000
```

Run the client test in another terminal:

```sh
cd /ws/wenet/runtime/GPU
python3 client.py --audio_file=chinese_test.wav --url=localhost:8001 --model_name=streaming_wenet --streaming
```

Audio file: https://drive.google.com/file/d/1xsodix-CW9dDHZn0pFEnpVMSfyngFbh1/view?usp=sharing

If you encounter any problems during these steps, please contact me. Thank you.
Reproduced it, thanks. I will debug it tomorrow. In the meantime, I recommend using --fp16 when exporting to ONNX, which works for me.
It should work now. #1314
Describe the bug: The decoder ONNX model's output tensor is empty, which causes a DLPack error. @Slyne
To Reproduce: Run the streaming Triton server as described in the runtime/GPU README; no error occurs at this step.
Client test:
python3 client.py --audio_file=test.wav --url=localhost:8001 --model_name=streaming_wenet --streaming
Expected behavior: Correct output.