Closed: parano closed this issue 2 years ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Beep boop! 🤖 This issue hasn't had any activity in a while. I'll close it if I don't hear back soon.
We will revisit this issue after the 1.0 release, which will make it possible to support micro-batching in SageMaker deployments
@parano @jjmachan Not sure if this is the right place to ask:
I want to confirm my understanding: do you mean the current AWS SageMaker deployment through BentoML can NOT support the adaptive micro-batching mechanism yet (i.e., the batch size always equals 1)?
Also, does AWS Lambda deployment have the same limitation?
Thanks!
hey @cliu0507 sadly you are right, the current versions of the SageMaker and Lambda deployments do not support the micro-batching mechanism. But we are looking to add support for this with the BentoML 1.0 release 🤞🏽
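For anyone landing here later, here is a minimal sketch (an illustration, not the SageMaker deployment itself) of what 1.0-style micro-batching looks like in a plain BentoML 1.x service: the model is saved with a batchable signature, and the runner's adaptive batching layer fuses concurrent requests into a single model call. The model tag `iris_clf` and the sklearn framework are placeholders.

```python
import bentoml
import numpy as np
from bentoml.io import NumpyNdarray

# Assumes the model was saved once (e.g. in a training script) with a
# batchable signature, which is what enables adaptive micro-batching:
#
#   bentoml.sklearn.save_model(
#       "iris_clf",                      # placeholder model tag
#       model,
#       signatures={"predict": {"batchable": True, "batch_dim": 0}},
#   )

runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
svc = bentoml.Service("iris_service", runners=[runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
async def predict(arr: np.ndarray) -> np.ndarray:
    # Each HTTP request arrives individually; the runner groups
    # concurrent calls along batch_dim=0 before invoking the model.
    return await runner.predict.async_run(arr)
```

Whether that batching layer actually sits behind a SageMaker endpoint depends on how the container serves the bento, which is exactly what this issue tracks.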
implemented
Is your feature request related to a problem? Please describe.
Currently, BentoML deployment on SageMaker does not utilize the micro-batching capability: https://github.com/bentoml/BentoML/blob/master/bentoml/deployment/sagemaker/sagemaker_serve.py
Describe the solution you'd like
Needs investigation
Describe alternatives you've considered
SageMaker expects a WSGI app, so we may need to add a WSGI wrapper around the BentoML Marshal Server (a rough sketch of one possible wrapper is below).
Additional context
n/a
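Here is a rough, hypothetical sketch of the WSGI wrapper idea above: a shim that forwards SageMaker's `/ping` and `/invocations` requests to a Marshal Server assumed to be listening on `127.0.0.1:5000`. The port, the use of `requests`, and the header handling are all assumptions for illustration, not the actual implementation.

```python
# Hypothetical WSGI shim: SageMaker's serving stack calls this app
# (e.g. via gunicorn), and it proxies to a local Marshal Server so
# requests pass through the micro-batching layer first.
import requests

MARSHAL_URL = "http://127.0.0.1:5000"  # assumed Marshal Server address


def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    if environ["REQUEST_METHOD"] == "POST":
        # Forward the request body (e.g. /invocations payloads) as-is.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        resp = requests.post(
            MARSHAL_URL + path,
            data=body,
            headers={"Content-Type": environ.get("CONTENT_TYPE", "application/json")},
        )
    else:
        # GET requests cover SageMaker's /ping health checks.
        resp = requests.get(MARSHAL_URL + path)
    start_response(
        f"{resp.status_code} {resp.reason}",
        [("Content-Type", resp.headers.get("Content-Type", "application/json"))],
    )
    return [resp.content]
```

Run inside the container with something like `gunicorn -b 0.0.0.0:8080 wsgi_shim:app`, this would keep SageMaker's WSGI expectation satisfied while the Marshal Server handles batching behind it.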