shashankprasanna / torchserve-examples

Repository with torchserve examples

ModelError when calling the InvokeEndpoint #9

Open Mak-24 opened 3 years ago

Mak-24 commented 3 years ago

While running the deploy_torchserve.ipynb without editing anything, I encountered an error at block 11.

Error:

ModelError                                Traceback (most recent call last)
<ipython-input-...> in <module>
      5 payload = payload
      6 
----> 7 response = predictor.predict(data=payload)
      8 print(*json.loads(response), sep = '\n')

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
    134             data, initial_args, target_model, target_variant, inference_id
    135         )
--> 136         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    137         return self._handle_response(response)
    138 

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    355                     "%s() only accepts keyword arguments." % py_operation_name)
    356                 # The "self" in this scope is referring to the BaseClient.
--> 357                 return self._make_api_call(operation_name, kwargs)
    358 
    359             _api_call.__name__ = str(py_operation_name)

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    674             error_code = parsed_response.get("Error", {}).get("Code")
    675             error_class = self.exceptions.from_code(error_code)
--> 676             raise error_class(parsed_response, operation_name)
    677         else:
    678             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from model with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again.". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/torchserve-endpoint-2021-03-22-10-34-32 in account xxxxxxxxxxx for more information.
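The CloudWatch link in that message is where the real failure shows up: if the container crashed on startup, its traceback lands in the endpoint's log group, not in the notebook. A minimal sketch for pulling the latest events with boto3 (the log group name is copied from the error above; credentials for that account are assumed):

import boto3

# Fetch the most recent events from the endpoint's log group so the actual
# container error (e.g. a crashed model worker) is visible without opening
# the console.
logs = boto3.client("logs", region_name="us-east-1")
group = "/aws/sagemaker/Endpoints/torchserve-endpoint-2021-03-22-10-34-32"

# Find the most recently active log stream in the group.
streams = logs.describe_log_streams(
    logGroupName=group, orderBy="LastEventTime", descending=True, limit=1
)
for stream in streams["logStreams"]:
    # Read the last ~50 events from that stream and print their messages.
    events = logs.get_log_events(
        logGroupName=group,
        logStreamName=stream["logStreamName"],
        limit=50,
        startFromHead=False,
    )
    for event in events["events"]:
        print(event["message"])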
nahidalam commented 3 years ago

I am getting the exact same error. Basically, if the model container does not respond within 60 seconds, you get this error: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html#your-algorithms-inference-code-container-response

A customer's model containers must respond to requests within 60 seconds. The model itself can have a maximum processing time of 60 seconds before responding to the /invocations. If your model is going to take 50-60 seconds of processing time, the SDK socket timeout should be set to be 70 seconds.
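For reference, that client-side socket timeout is set when the runtime client is created; a minimal sketch using botocore's Config (this only helps if the container does eventually answer, since the 60-second limit on the SageMaker side itself is fixed):

import boto3
from botocore.config import Config

# Raise the client-side read timeout above SageMaker's 60 s container limit
# so the SDK doesn't give up first; retries are disabled so a slow request
# isn't re-sent while the first attempt is still in flight.
config = Config(read_timeout=70, retries={"max_attempts": 0})
runtime = boto3.client("sagemaker-runtime", config=config)

response = runtime.invoke_endpoint(
    EndpointName="torchserve-endpoint-...",  # your endpoint name here
    ContentType="application/x-image",       # whatever payload type you send
    Body=payload,                            # same payload as in the notebook
)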

Not sure how to solve it yet, though.

axelsparr commented 3 years ago

In case you didn't solve it yet: have you tried pip installing captum in the Dockerfile? It seems that dependency is missing. In the Dockerfile, change:

RUN pip install --no-cache-dir psutil \
                --no-cache-dir torch \
                --no-cache-dir torchvision

To:

RUN pip install --no-cache-dir psutil \
                --no-cache-dir torch \
                --no-cache-dir torchvision \
                --no-cache-dir captum
You can also add other libraries there if you need to import them in your model.py or custom_handler.py.
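It looks like this also explains why a missing library surfaces as a timeout rather than an import error in the notebook: TorchServe's handlers import captum, so without it the model workers die on startup, the container never answers /invocations, and SageMaker reports the 60-second timeout instead. After editing the Dockerfile you still need to rebuild and push the image and create a fresh endpoint, since a running endpoint keeps serving the old image. A minimal redeploy sketch with the SageMaker Python SDK, reusing the notebook's image_uri, model_artifact, and role variables (the names and instance type here are illustrative):

from sagemaker.model import Model
from sagemaker.predictor import Predictor

# Point a new Model at the image rebuilt with captum installed.
model = Model(
    image_uri=image_uri,        # ECR URI of the rebuilt container
    model_data=model_artifact,  # s3://.../model.tar.gz from the notebook
    role=role,
    predictor_cls=Predictor,
)

# Deploy to a fresh endpoint; any existing endpoint keeps the old image.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.xlarge",
)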