Open lswjkllc opened 7 months ago
Hi @lswjkllc, please try this example:
https://github.com/pytorch/serve/blob/master/kubernetes/kserve/examples/mnist/MNIST.md
Hi @lswjkllc,
This might not be directly related to your 503 error, but since you mention KServe 0.11: is it 0.11.0 or 0.11.1? If the former, you should add:
    env:
      - name: PROTOCOL_VERSION
        value: v2
to your predictor definition to ensure v2 is used to serve your model.
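As a rough illustration of where that variable would end up, the sketch below patches an InferenceService manifest held as a Python dict. It assumes the new-schema layout, where the runtime sits under `spec.predictor.model`; the helper name `with_protocol_env` and the metadata name are hypothetical, and the manifest is a minimal stand-in rather than the reporter's actual definition.

```python
import copy


def with_protocol_env(isvc: dict, version: str = "v2") -> dict:
    """Return a copy of an InferenceService manifest with PROTOCOL_VERSION set."""
    patched = copy.deepcopy(isvc)
    predictor = patched.setdefault("spec", {}).setdefault("predictor", {})
    # Assumption: new-schema manifests describe the runtime under `model`.
    model = predictor.setdefault("model", {})
    # Drop any existing PROTOCOL_VERSION entry so the variable is not duplicated.
    env = [e for e in model.get("env", []) if e.get("name") != "PROTOCOL_VERSION"]
    env.append({"name": "PROTOCOL_VERSION", "value": version})
    model["env"] = env
    return patched


# Minimal stand-in manifest; the metadata name is hypothetical.
example_isvc = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "torchserve-mnist-v2"},
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "pytorch"},
                "protocolVersion": "v2",
                "storageUri": "gs://kfserving-examples/models/torchserve/image_classifier/v2",
            }
        }
    },
}

patched_isvc = with_protocol_env(example_isvc)
```

The helper works on a deep copy, so the original manifest is left untouched and the patched one can be diffed against it before applying.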
🐛 Describe the bug
This case is not working: https://kserve.github.io/website/0.11/modelserving/v1beta1/torchserve/#deploy-pytorch-model-with-v2-rest-protocol. The isvc object becomes ready when using the v2 protocol with the new schema:
However, the response always returns an error when sending an HTTP request for inference:
The request:
Before sending the request, I use 'kubectl port-forward' to expose the service:
mnist_v2_bytes.json:
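For readers unfamiliar with the v2 REST protocol, here is a minimal sketch of how such an inference request could be assembled in Python. The `/v2/models/<name>/infer` path and the `inputs` layout follow the Open Inference Protocol; the base URL, the input name `input-0`, the `[-1]` shape, and the sample bytes are assumptions for illustration, not the reporter's actual values.

```python
import base64
import json
import urllib.request


def build_v2_infer_request(base_url, model_name, raw_bytes, host_header=None):
    """Build (but do not send) a v2 REST inference request carrying one BYTES input."""
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"
    payload = {
        "inputs": [
            {
                "name": "input-0",  # hypothetical input name
                "shape": [-1],      # assumed shape for a single bytes blob
                "datatype": "BYTES",
                # Binary content is base64-encoded so it can travel inside JSON.
                "data": [base64.b64encode(raw_bytes).decode("ascii")],
            }
        ]
    }
    headers = {"Content-Type": "application/json"}
    if host_header:
        # Often needed when port-forwarding to an ingress gateway.
        headers["Host"] = host_header
    return urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), headers=headers, method="POST"
    )


# Hypothetical local endpoint exposed via port-forward; sample bytes are placeholders.
req = build_v2_infer_request("http://localhost:8080", "mnist", b"\x00\x01")
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch can be inspected without a running cluster.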
Error logs
2024-02-27 14:45:39.492 37650 root INFO [timing():48] kserve.io.kserve.protocol.rest.v1_endpoints.predict 18.291365146636963, ['http_status:500', 'http_method:POST', 'time:wall']
2024-02-27 14:45:39.493 37650 root INFO [timing():48] kserve.io.kserve.protocol.rest.v1_endpoints.predict 0.35211900000000007, ['http_status:500', 'http_method:POST', 'time:cpu']
2024-02-27 14:45:39.494 37650 uvicorn.error ERROR [run_asgi():376] Exception in ASGI application
Traceback (most recent call last):
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 373, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/fastapi/applications.py", line 270, in __call__
    await super().__call__(scope, receive, send)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/timing_asgi/middleware.py", line 70, in __call__
    await self.app(scope, receive, send_wrapper)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/starlette/routing.py", line 706, in __call__
    await route.handle(scope, receive, send)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/fastapi/routing.py", line 235, in app
    raw_response = await run_endpoint_function(
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/fastapi/routing.py", line 161, in run_endpoint_function
    return await dependant.call(**values)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/kserve/protocol/rest/v1_endpoints.py", line 69, in predict
    response, response_headers = await self.dataplane.infer(model_name=model_name, body=body, headers=headers)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/kserve/protocol/dataplane.py", line 276, in infer
    response = await model(body, headers=headers)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/kserve/model.py", line 116, in __call__
    response = (await self.predict(payload, headers)) if inspect.iscoroutinefunction(self.predict) \
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/kserve/model.py", line 319, in predict
    return await self._http_predict(payload, headers)
  File "/Users/kust/Workspace/projects/bocloud/torchserve/.py38/lib/python3.8/site-packages/kserve/model.py", line 286, in _http_predict
    raise HTTPStatusError(message, request=response.request, response=response)
httpx.HTTPStatusError: {'code': 503, 'type': 'InternalServerException', 'message': 'Prediction failed'}, '503 Service Unavailable' for url 'http://0.0.0.0:8085/v1/models/mnist:predict'
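The final line of the traceback reports a failing call to a `/v1/models/mnist:predict` URL, which suggests the v1 endpoint was in use even though v2 was intended. A small, hypothetical helper like the one below could pull the protocol version out of such an error line when scanning logs; the function name and the regular expression are assumptions for illustration, not part of KServe.

```python
import re
from urllib.parse import urlparse


def protocol_from_error(message: str):
    """Return 'v1' or 'v2' based on the URL quoted in an httpx error line, else None."""
    match = re.search(r"for url '([^']+)'", message)
    if not match:
        return None
    path = urlparse(match.group(1)).path
    # The first path segment carries the protocol version, e.g. /v1/models/...
    first_segment = path.lstrip("/").split("/", 1)[0]
    return first_segment if first_segment in ("v1", "v2") else None


# Last line of the traceback from the error logs.
error_line = (
    "httpx.HTTPStatusError: {'code': 503, 'type': 'InternalServerException', "
    "'message': 'Prediction failed'}, '503 Service Unavailable' for url "
    "'http://0.0.0.0:8085/v1/models/mnist:predict'"
)
detected = protocol_from_error(error_line)
```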
Installation instructions
KServe Version: 0.11
Kubernetes version: 1.23.0
OS (e.g. from /etc/os-release): centos 7.9
Model Packaging
gs://kfserving-examples/models/torchserve/image_classifier/v2
config.properties
No response
Versions
unknown
Repro instructions
unknown
Possible Solution
Expected Output: