triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

About automatic Batch #7738

Closed CallmeZhangChenchen closed 4 weeks ago

CallmeZhangChenchen commented 1 month ago

For the TensorRT backend, my model's input dimensions are [-1, -1, 7] and the maximum batch size is 2. I start a service with tritonserver.
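For context, a minimal `config.pbtxt` matching that description might look like the sketch below. The model and tensor names are assumptions; note that with `max_batch_size` set, Triton treats the leading batch dimension as implicit, so `dims` lists only the remaining [-1, 7]:

```
name: "my_trt_model"          # assumed model name
platform: "tensorrt_plan"
max_batch_size: 2
input [
  {
    name: "INPUT__0"          # assumed tensor name
    data_type: TYPE_FP32      # assumed dtype
    dims: [ -1, 7 ]           # full shape incl. batch: [-1, -1, 7]
  }
]
output [
  {
    name: "OUTPUT__0"         # assumed tensor name
    data_type: TYPE_FP32
    dims: [ -1, 7 ]           # assumed output shape
  }
]
dynamic_batching { }
```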

When I send two requests, one with shape [1, 4, 7] and the other with shape [1, 6, 7], they cannot be batched together and are handled one by one. A batch is formed automatically only when the non-batch dimensions are the same.

For now I work around this by padding the variable dimensions to the same shape in preprocessing, so that the requests can be batched for inference (see the sketch below).
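A padding step along those lines might look like this; the function name and the pad value are assumptions, and it simply right-pads the variable dimension to a common length:

```python
import numpy as np

def pad_to_common_length(batch_inputs, pad_value=0.0):
    """Right-pad the variable (second) dimension of each [1, T, 7] array
    to the maximum T in the group, so all requests share one shape."""
    max_len = max(x.shape[1] for x in batch_inputs)
    padded = []
    for x in batch_inputs:
        pad_width = ((0, 0), (0, max_len - x.shape[1]), (0, 0))
        padded.append(np.pad(x, pad_width, constant_values=pad_value))
    return padded

# Example: [1, 4, 7] and [1, 6, 7] both become [1, 6, 7] and can now batch.
a = np.ones((1, 4, 7), dtype=np.float32)
b = np.ones((1, 6, 7), dtype=np.float32)
print([p.shape for p in pad_to_common_length([a, b])])  # [(1, 6, 7), (1, 6, 7)]
```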

CallmeZhangChenchen commented 4 weeks ago

For the record: I verified that requests with different shapes can in fact be batched automatically.
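For anyone who lands here: Triton's dynamic batcher can combine requests whose non-batch dimensions differ when the input is marked as ragged and the backend supports it, which the TensorRT backend does. A sketch of the relevant config fragment (input name assumed; the backend typically also needs the per-request shapes supplied, e.g. via a `batch_input` entry):

```
input [
  {
    name: "INPUT__0"            # assumed tensor name
    data_type: TYPE_FP32
    dims: [ -1, 7 ]
    allow_ragged_batch: true    # lets the dynamic batcher mix shapes
  }
]
dynamic_batching { }
```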