Can you try to use the gRPC client's streaming API to send the request? Otherwise you need to make sure the requests arrive at the server in order.
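For illustration, here is a minimal sketch of that streaming path, assuming the standard Triton C++ client library (grpc_client.h). The function name SendSequenceOverStream, the model name "kaldi_online", and the input tensor "WAV_DATA" are placeholders, not taken from the issue. All requests written to one stream with AsyncStreamInfer reach the server in the order they were written:

#include <memory>
#include <string>
#include <vector>

#include "grpc_client.h"  // Triton C++ client library

namespace tc = triton::client;

// Send one sequence (START ... END) over a single gRPC stream; error
// checks are elided for brevity.
void SendSequenceOverStream(
    tc::InferenceServerGrpcClient* client, uint64_t corr_id,
    const std::vector<std::vector<uint8_t>>& chunks)
{
  // Register the result callback once; results arrive asynchronously.
  client->StartStream([](tc::InferResult* result) {
    std::unique_ptr<tc::InferResult> r(result);
    // handle r / check r->RequestStatus() ...
  });

  for (size_t i = 0; i < chunks.size(); ++i) {
    tc::InferInput* raw = nullptr;
    tc::InferInput::Create(
        &raw, "WAV_DATA", {1, static_cast<int64_t>(chunks[i].size())},
        "UINT8");
    std::unique_ptr<tc::InferInput> input(raw);
    input->AppendRaw(chunks[i].data(), chunks[i].size());

    tc::InferOptions options("kaldi_online");
    options.sequence_id_ = corr_id;
    options.sequence_start_ = (i == 0);                // START on first chunk
    options.sequence_end_ = (i + 1 == chunks.size());  // END on last chunk

    client->AsyncStreamInfer(options, {input.get()});
  }

  client->StopStream();  // flush outstanding requests and close the stream
}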
Sorry for the late reply; I was out of office last week.
My client uses the gRPC API.
Here is the code fragment used to create/get the gRPC client:
std::vector<std::unique_ptr<tc::InferenceServerGrpcClient>> contextes_;

// Create a new client and add it to the pool.
contextes_.emplace_back();
std::unique_ptr<tc::InferenceServerGrpcClient>& client = contextes_.back();
FAIL_IF_ERR(
    tc::InferenceServerGrpcClient::Create(&client, url_, false),
    "unable to create grpc client");

// Select the client for a sequence: every request with the same
// correlation ID goes through the same gRPC client.
std::unique_ptr<tc::InferenceServerGrpcClient>& context =
    contextes_.at(corr_id % ncontextes_);
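For contrast, here is a hypothetical sketch (not from the issue) of how one chunk might go through the context selected above with the non-streaming API, assuming the same headers and placeholder names as the streaming sketch. Each AsyncInfer call travels as an independent gRPC request, so the library does not guarantee that the chunk carrying the START flag reaches the server's sequence batcher before later chunks:

// Hypothetical helper; corr_id, chunk, is_first, and is_last come from the
// caller, and error checks are elided for brevity.
void SendChunkAsync(
    tc::InferenceServerGrpcClient* context, uint64_t corr_id,
    const std::vector<uint8_t>& chunk, bool is_first, bool is_last)
{
  tc::InferInput* raw = nullptr;
  tc::InferInput::Create(
      &raw, "WAV_DATA", {1, static_cast<int64_t>(chunk.size())}, "UINT8");
  std::unique_ptr<tc::InferInput> input(raw);
  input->AppendRaw(chunk.data(), chunk.size());

  tc::InferOptions options("kaldi_online");
  options.sequence_id_ = corr_id;      // routes the chunk to its sequence
  options.sequence_start_ = is_first;  // START flag
  options.sequence_end_ = is_last;     // END flag

  // Independent request: ordering relative to other AsyncInfer calls is
  // not guaranteed end to end.
  context->AsyncInfer(
      [](tc::InferResult* result) {
        std::unique_ptr<tc::InferResult> r(result);
        // handle r ...
      },
      options, {input.get()});
}

This lack of an ordering guarantee is consistent with the START-after-END reordering described in the issue body below.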
@warlock135 How are you sending the inference request? Are you using AsyncStreamInfer as Guan suggested? The example is here.
@tanmayv25 After moving from AsyncInfer to AsyncStreamInfer, the code works well. Thank you for the support. I will close the issue.
Description
I am trying to develop a Kaldi-ASR backend/client for the new Triton version (21.07), based on the old one (here).
Sometimes, requests fail with the following error:
Turning on the server's verbose log, I figured out that the request with the START flag was processed AFTER the one with the END flag.
Client requests are sent in the right order:
Log printed from the code above:
Triton Information
What version of Triton are you using? r21.07
Are you using the Triton container or did you build it yourself? container
To Reproduce
Steps to reproduce the behavior.
Triton server launch command:
Expected behavior
Requests are processed in the order they were sent.