Hi @rishin27 - BentoML internally uses async everywhere for better performance, so this might be an internal issue with BentoML on Yatai that is not related to your user code. May I ask which BentoML and Yatai versions you are using?
Note that when a Bento is deployed on Yatai, Runners are by default scheduled as their own separate pods, in order to scale separately from the service code. The async code path is used for converting the runner.run function call into an async RPC call. This might be an issue with how the deployment and runner communication was set up.
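Conceptually, the remote runner call looks something like the sketch below. This is illustrative only, not BentoML's actual implementation; the class name, endpoint, and payload shape are all assumptions made for the example:

```python
import aiohttp

class RemoteRunnerClient:
    """Illustrative sketch: forwards a runner.run(...) call to a runner pod over HTTP."""

    def __init__(self, runner_url: str):
        # Address of the runner pod's service; the real wire format differs.
        self._runner_url = runner_url

    async def async_run(self, *args, **kwargs):
        async with aiohttp.ClientSession() as session:
            # The sync-looking runner.run(...) becomes an async HTTP request
            # to the separately scheduled runner pod.
            async with session.post(
                self._runner_url, json={"args": args, "kwargs": kwargs}
            ) as resp:
                # A ServerDisconnectedError would surface here if the runner
                # pod drops the connection mid-request.
                resp.raise_for_status()
                return await resp.json()
```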
Hi @parano, thanks for the info. I get that Yatai is built for autoscaling and Kubernetes workload magic. But if the user has nowhere marked their code as async, is it right to send it down the async code path by default? The general workflow for a data scientist is to build the model, save it with BentoML, check that the service works using 'bentoml serve', and, if everything works out, push it to the Yatai server for deployment. But if Yatai then adds async magic that was never intended and things start breaking, that's not a good UX.
Yatai - v0.3.1-c3dab74
BentoML - v1.0.0a7
Thanks @rishin27, we will look into this issue more. Ideally, by design, if 'bentoml serve' works locally, it should definitely work on Yatai. Note that Yatai is still in its alpha release, so definitely expect some rough edges at the moment.
> But if the user has nowhere marked their code as async, is it right to send it down the async code path by default?
This is actually fairly common in Python web frameworks such as Sanic or FastAPI: both sync and async handlers can be defined by the user, while the framework uses async internally. The BentoML server uses async even without Yatai. I think the root cause of the issue is likely not async itself, but some settings in the distributed runner setup of the Yatai deployment.
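For instance, with FastAPI both handler styles below are valid, and the framework still runs an async event loop under the hood (a minimal illustrative example):

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/sync")
def sync_handler():
    # Plain `def`: FastAPI runs this in a worker thread pool,
    # so the event loop is never blocked.
    return {"style": "sync"}

@app.get("/async")
async def async_handler():
    # `async def`: awaited directly on the event loop.
    return {"style": "async"}
```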
Thanks @parano for your detailed answer. Do let me know if I can help; happy to contribute.
Hi @rishin27, could you try it again with the latest version of BentoML and Yatai? The issue should be resolved by now.
Hi Team,
I'm trying to serve my models using the Yatai server, but when I try to do inference on the endpoint, it errors out with the following:
API response - "An error has occurred in BentoML user code when handling this request, find the error details in server logs"
Container Logs -
/home/bentoml/bento/src/service.py:23 in predict │
As you can see, I have nowhere used any async functionality in the bento service code, but the bento built on the Yatai server still fails with the above "async - ServerDisconnectedError: Server disconnected" errors.
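For context, the service is defined with a plain synchronous handler, roughly like this (a simplified sketch using the BentoML 1.0-style API; the model name and framework are placeholders, not the actual code):

```python
# service.py (simplified sketch)
import bentoml
from bentoml.io import NumpyNdarray

# sklearn is a placeholder framework here; the real service loads its own model.
runner = bentoml.sklearn.get("my_model:latest").to_runner()
svc = bentoml.Service("my_service", runners=[runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_array):
    # Plain `def` -- no async/await anywhere in the user code.
    return runner.run(input_array)
```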
If I simply run the 'bentoml containerize' command and then do a docker run, the service works without any errors.
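That is, something like this works end-to-end locally (the image tag below is a placeholder):

```bash
bentoml containerize my_service:latest
docker run -p 3000:3000 my_service:<generated-tag>   # inference works fine here
```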
Am I missing something? Please impart some wisdom 🙏
Thanks