sumitbinnani opened this issue 3 years ago
You can set the Python model's first input dim to -1, e.g. [-1, 224, 224, 3], and then it can use batch prediction.
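For illustration, a minimal config.pbtxt along those lines might look like the sketch below; the model name, tensor names, and data types are placeholders, not taken from this thread:

```
name: "my_python_model"
backend: "python"
max_batch_size: 0
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ -1, 224, 224, 3 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ -1, 1 ]
  }
]
```

With max_batch_size set to 0, the -1 in the first dimension lets a client send an already-batched tensor, but Triton's dynamic batcher is not involved.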
@sumitbinnani I have filed a ticket for this enhancement.
@CPFelix The current batch prediction snippet for the Python backend with dynamic batching looks like this:
responses = []
for request in requests:
    ...
    response = ...
    responses.append(response)
return responses
This does not use parallelization or vector operations.
With the approach you suggested, it will not be possible to use dynamic batching.
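For what it's worth, a rough sketch of how the dynamically batched requests could be stacked manually inside execute today is shown below; the tensor names INPUT0/OUTPUT0 and the run_batched_inference call are hypothetical placeholders, not part of the Python backend API:

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        # Collect every request's input and remember each request's batch size.
        arrays, sizes = [], []
        for request in requests:
            arr = pb_utils.get_input_tensor_by_name(request, "INPUT0").as_numpy()
            arrays.append(arr)
            sizes.append(arr.shape[0])

        # Run inference once on the concatenated batch instead of once per request.
        batched_out = run_batched_inference(np.concatenate(arrays, axis=0))  # hypothetical helper

        # Split the batched output back into one InferenceResponse per request.
        responses, offset = [], 0
        for size in sizes:
            out_tensor = pb_utils.Tensor("OUTPUT0", batched_out[offset:offset + size])
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor]))
            offset += size
        return responses
```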
@Tabrizian Is there any progress on this issue?
I also faced this problem. I have a Python backend for a PyTorch model. If I use dynamic batching, several requests arrive at the execute function. Why aren't they stacked into one batch?
@TsykunovDmitriy There is no update on this yet, but we have a ticket for this feature request.
Is your feature request related to a problem? Please describe. I have a Python function that can process multiple requests in parallel (i.e., it supports batched prediction). However, I have to perform inference per request in a for loop, leading to a massive slowdown.
Describe the solution you'd like Parse the inference requests in the Python backend's execute as a single batched numpy array. E.g., if the config has an input of dim 3 with dtype int, and I receive 8 requests in execute, I need some way to parse these requests as an 8x3 array. Similarly, I need a way to return an 8x1 array as a single output from the backend.
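As a concrete, made-up illustration of that request in plain numpy (the shapes and the placeholder computation are examples, not a proposed API):

```python
import numpy as np

# 8 incoming requests, each carrying one input vector of dim 3.
per_request_inputs = [np.array([i, i + 1, i + 2], dtype=np.int32) for i in range(8)]

# Desired: the backend presents them to execute as a single 8x3 batch ...
batched_input = np.stack(per_request_inputs, axis=0)        # shape (8, 3)

# ... and lets execute return a single 8x1 batch, which would then be
# split back into one output per request.
batched_output = batched_input.sum(axis=1, keepdims=True)   # placeholder computation, shape (8, 1)
per_request_outputs = np.split(batched_output, 8, axis=0)    # 8 arrays of shape (1, 1)
```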
Describe alternatives you've considered Couldn't think of anything else.