Toolkit for inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch Containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0
Batch Inference does not work when using the default handler #121
In batch inference, the model server (in this case, TorchServe) returns a 'batch', i.e., a list of requests, to the handler. The handler is expected to process them and send back a list of 'batch-size' responses, one per request, as in the sketch below.
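A minimal sketch of that batching contract, assuming the usual TorchServe handler shape (`initialize`/`handle`); the class name `EchoHandler` and the identity "model" are hypothetical placeholders, not the toolkit's actual handler:

```python
class EchoHandler:
    """Hypothetical handler illustrating the TorchServe batching contract."""

    def initialize(self, context):
        # A real handler would load a PyTorch model here; an identity
        # function stands in for it in this sketch.
        self.model = lambda x: x

    def handle(self, data, context):
        # `data` is a list with one entry per request in the batch.
        responses = []
        for request in data:
            payload = request.get("body") or request.get("data")
            responses.append(self.model(payload))
        # The contract: return one response per request, preserving order.
        return responses
```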
Currently, the PyTorch toolkit uses the transform() function from the base inference toolkit to receive requests from the model server and process them by calling _transform_fn(), which in turn calls _input_fn, _predict_fn, and _output_fn (see the sketch below).
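A simplified sketch of that flow, not the toolkit's exact code: the JSON deserialization and the single-request chain are assumptions made for illustration. Because each step handles exactly one request and one response, a batch (a list of requests from the model server) is never fanned out into a list of responses, which is the behavior reported here:

```python
import json


def input_fn(input_data, content_type):
    # Deserialize ONE request body (JSON assumed for this sketch).
    return json.loads(input_data)


def predict_fn(data, model):
    # Run inference on the single deserialized input.
    return model(data)


def output_fn(prediction, accept):
    # Serialize ONE response.
    return json.dumps(prediction)


def transform_fn(model, input_data, content_type, accept):
    # The chain assumes a single request/response pair, so a batched
    # payload is not split into per-request responses.
    data = input_fn(input_data, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, accept)
```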