aws-neuron / aws-neuron-sdk

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
https://aws.amazon.com/machine-learning/neuron/

Deploying a compiled model via endpoint. #276

Closed minhtcai closed 3 years ago

minhtcai commented 3 years ago

Hi Neuron team.

  1. I tried to compile a YOLOv4 PyTorch model for inf1.2xlarge using Neo, but it failed. Judging by this page, it seems that Neo doesn't support object detection models, is that right? https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-cloud.html

  2. I was able to compile YOLOv4 PyTorch on an inf1.2xlarge instance. So instead of using the compiled weights from Neo, I added the model compiled on the Inferentia instance to a tar.gz file, created a model, and deployed it to an endpoint using SageMaker. But I got this error: botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "Content type application/x-image is not supported by this framework.

Questions:

  a. Are the weights (pt) compiled via Neo different from the weights compiled on an Inferentia instance (assuming both target inf1 and only the environment differs)?

  b. Is it possible to deploy weights compiled on an Inferentia instance as an endpoint in AWS SageMaker?

  c. Do you have any idea what causes the error in (2)? This is the request I used: https://docs.aws.amazon.com/sagemaker/latest/dg/neo-requests-cli.html

jeffhataws commented 3 years ago

Let us take a look and get back to you. Thanks.

jeffhataws commented 3 years ago

Hi minhtcai,

Apologies for the belated response. We have worked with the Neo team to provide you with the following answers:

Question a) The weights compiled via Neo are the same as the weights compiled on an Inferentia instance (using the Neuron SDK), provided the source model and input shapes are the same and both compilations target inf1.

Question b) It should be possible, but it is not the intended use case.

Question c) This type of error generally means that the container isn't expecting that type of input metadata, so either the code passed to the container should be changed or the type of the input data passed to the container should be updated.
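As a rough illustration of the first option in answer c) (changing the code passed to the container), here is a minimal sketch. It assumes the endpoint runs a SageMaker framework serving container that lets the inference script override `input_fn(request_body, content_type)`; the thread does not confirm which container was used. The `IMAGE_TYPES` set and the choice to return raw bytes are hypothetical, and decoding the image into a tensor with the shape used at compile time is left out.

```python
import json

# Hypothetical set of content types this handler accepts as encoded images.
IMAGE_TYPES = {"application/x-image", "image/jpeg", "image/png"}


def input_fn(request_body, content_type):
    """Turn the raw request payload into something the model code can consume."""
    if content_type in IMAGE_TYPES:
        # Raw encoded image bytes. Decoding and resizing to the compiled
        # input shape would happen here or in the prediction step.
        return bytes(request_body)
    if content_type == "application/json":
        # JSON payloads are parsed into Python objects.
        return json.loads(request_body)
    # Anything else is rejected with an explicit error.
    raise ValueError(f"Unsupported content type: {content_type}")
```

The second option in answer c) is the client-side counterpart: keep the container as it is and send the request with a content type it already accepts.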

mrnikwaws commented 3 years ago

Hi minhtcai,

Since there has been no activity on this issue for 15 days and we believe the question is answered, I am closing it. Please re-open the issue if you have more information or questions.