Closed NaxAlpha closed 3 years ago
Firstly, really sorry for the late reply. The issue seems to be a configuration error in the API Gateway. This only happens when you pass data as multipart/form-data;
if you pass it in a binary format, it will work.
I'm working on adding support for multipart/form-data
too, but in the meantime you can refer to https://stackoverflow.com/a/56132015 or https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-console.html to get an idea of how to solve it on your end.
Again, really sorry for the late reply, but I hope this helps :)
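To see why the gateway configuration matters, here is a small self-contained sketch (not BentoML- or AWS-specific) of what can happen to a binary body: without a binary media type configured, an API Gateway REST API handles the payload as UTF-8 text, which mangles raw bytes, while with binaryMediaTypes configured the body is base64-encoded and survives intact.

```python
import base64

# First bytes of a PNG file -- not valid UTF-8, like most binary payloads.
png_header = b"\x89PNG\r\n\x1a\n"

# Treating arbitrary bytes as UTF-8 text (with replacement characters,
# as a text pipeline would) corrupts them.
mangled = png_header.decode("utf-8", errors="replace").encode("utf-8")
assert mangled != png_header

# A base64 round trip, which is what a binary media type configuration
# gives you, preserves the original bytes exactly.
assert base64.b64decode(base64.b64encode(png_header)) == png_header
```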
Thanks a lot for the update. Any timeline for the fix? And is it related to this repo or bentoml/aws-sagemaker-deploy?
BTW I also did try the binary method like this:
curl -i \
-X POST \
--header "Content-Type: application/octet-stream" \
--data-binary @data/mobile-sample.png \
https://123.execute-api.region.amazonaws.com/prod/predict
But I get this error:
Which input handler are you using? Also, can you post your BentoService too?
Here is the sample service I am using:
import bentoml
import numpy as np
from bentoml.adapters import ImageInput

class Model(bentoml.BentoService):
    @bentoml.api(input=ImageInput(), batch=False)
    def predict(self, image):
        img = np.array(image)
        ...
I have met a similar problem. My understanding is that you will need to use FileInput() as the input adapter.
But but but...
I tried to use FileInput() with code like below, but unfortunately I still can't get it working on AWS SageMaker + API Gateway (running the API locally with bentoml serve Model:latest works well without any problems).
from typing import BinaryIO, List
from PIL import Image
import numpy as np

def predict(self, file_streams: List[BinaryIO]) -> List[str]:
    print('start to process')
    for fs in file_streams:
        image_pil = Image.open(fs)
        image_numpy = np.array(image_pil)
        ...
I did change the settings on the API Gateway to pass through all binary input, but I am still getting an error like this:
[2021-09-12 20:19:27,814] ERROR - Error caught in API function:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/bentoml/service/inference_api.py", line 176, in wrapped_func
    return self._user_func(*args, **kwargs)
  File "/bento/ImageClassifier/Image_classifier.py", line 83, in predict
    image_pil = Image.open(fs)
  File "/opt/conda/lib/python3.8/site-packages/PIL/Image.py", line 2944, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/opt/conda/lib/python3.8/site-packages/PIL/Image.py", line 2930, in _open_core
    im = factory(fp, filename)
  File "/opt/conda/lib/python3.8/site-packages/PIL/ImageFile.py", line 121, in __init__
    self._open()
  File "/opt/conda/lib/python3.8/site-packages/PIL/ImImagePlugin.py", line 153, in _open
    s = s + self.fp.readline()
AttributeError: 'FileLike' object has no attribute 'readline'
I suspect the AWS API Gateway does something weird to the binary data (maybe re-encodes it) before passing it to SageMaker, because the line image_pil = Image.open(fs) clearly fails with fs not being a valid BinaryIO handle.
@jjmachan Please advise, thank you!
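A workaround worth trying here (a sketch, not a confirmed fix): PIL's format plugins may call methods like readline() and seek() that BentoML's FileLike wrapper apparently does not expose, so copying the stream's bytes into an io.BytesIO first gives PIL a full file object. The MinimalStream class below is a hypothetical stand-in for FileLike, included only to make the sketch runnable.

```python
import io

class MinimalStream:
    """Hypothetical stand-in for a wrapper that only exposes read()."""
    def __init__(self, data: bytes):
        self._data = data

    def read(self) -> bytes:
        return self._data

fs = MinimalStream(b"\x89PNG\r\n\x1a\n")

# Copy the raw bytes into BytesIO so downstream code also gets
# readline()/seek(), which bare read()-only wrappers lack.
buf = io.BytesIO(fs.read())
assert hasattr(buf, "readline") and hasattr(buf, "seek")
assert buf.readline().startswith(b"\x89PNG")
```

In the service above, that would mean calling Image.open(io.BytesIO(fs.read())) instead of Image.open(fs).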
Thanks @jjmachan, after following this I was able to get it working. For folks who still have problems, please make sure to click 'Deploy' in the API Gateway console manually to make all the changes effective. See the screenshot below.
Thanks @cliu0507 for adding this step too 🙌🏽
Hey @NaxAlpha @cliu0507, we have added another method of dealing with the ImageInput handler and form-data, and also added support for multiple endpoints. It would be really awesome if you could take a look and see whether it solves these issues; your feedback is appreciated too.
Just tested it for form/multipart, and it seems to be working overall except for one thing: after redeploying, the first call gives this error (whether I call the API 1 minute or 10 minutes after deployment):
{
"message": "Internal Server Error"
}
But right after this first call, if I call again, it works perfectly. Also, I still could not test ImageInput because of the GPU issue.
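Until the root cause is found, a possible client-side workaround is to retry the first call once, since only the first request after a redeploy fails. A minimal sketch follows; the flaky() function is a stand-in that simulates an endpoint failing on its first call, not a real API client.

```python
import time

def call_with_retry(fn, attempts=2, delay=1.0):
    """Call fn, retrying on failure; the first call after a redeploy may fail."""
    for i in range(attempts):
        try:
            return fn()
        except RuntimeError:
            if i == attempts - 1:
                raise
            time.sleep(delay)

# Simulated endpoint: fails once with "Internal Server Error", then succeeds.
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] == 1:
        raise RuntimeError("Internal Server Error")
    return "ok"

assert call_with_retry(flaky, delay=0.0) == "ok"
assert state["calls"] == 2
```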
That is strange. Can you get the logs from the API Gateway so we can get a better idea of what is happening? You can refer to this article, https://docs.aws.amazon.com/apigateway/latest/developerguide/http-api-troubleshooting-lambda.html, since API Gateway logs are not set up by default.
Also, I guess there are no entries in the SageMaker CloudWatch logs, and this is a problem with the API Gateway?
Yeah, when the error message comes, no logs are visible in either Lambda or the endpoint.
Then it is some issue with the API Gateway. It would be really helpful if we could get the API Gateway logs, since I'm not able to reproduce the issue locally.
This is what I got for the first two requests, where I got the internal server error, after enabling the gateway logs:
Furthermore, it looks like the gateway IS calling SageMaker, and there are corresponding logs for every failed request in both Lambda and SageMaker; the request seems to be failing after 3 seconds in Lambda, maybe due to bootstrapping:
Thanks a lot for getting the logs. The issue was the 3-second Lambda timeout, and I've now patched it to match the timeout config option we have (so a healthy 60 seconds by default). Thanks so much for trying it out and bringing up the issue, I wouldn't have figured it out otherwise 😄
OK, just tried the multipart and ImageInput services after this update. Both are working 👍
Describe the bug
After deploying the dev endpoint (ref: bentoml/aws-sagemaker-deploy#13), I cannot get the response with this:
Looking at CloudWatch, the logs show this error:
To Reproduce
Expected behavior
Running the API locally with
bentoml serve Model:latest
works, but this behavior is not reproducible for the deployed API on SageMaker.
Screenshots/Logs
Environment:
bentoml/model-server:0.13.1-py38-gpu
Additional context