I am trying to deploy a YOLOv5 model on SageMaker following this notebook. The endpoint deploys successfully, but when I test it with `predictor.predict(payload)` it raises this error:
```
ModelError                                Traceback (most recent call last)
Cell In[196], line 1
----> 1 result = predictor.predict(payload)

File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/botocore/client.py:565, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
    561 raise TypeError(
    562     f"{py_operation_name}() only accepts keyword arguments."
    563 )
    564 # The "self" in this scope is referring to the BaseClient.
--> 565 return self._make_api_call(operation_name, kwargs)

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from primary with message "Your invocation timed out while waiting for a response from container primary. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again.". See https://us-east-1.console.aws.amazon.com/cloudwatch
```
When I looked into the CloudWatch logs, the container shows:

```
stdout MODEL_LOG - FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/model/code/best.pt'
```
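To check what actually gets extracted into the container, the contents of `/opt/ml/model` can be logged from `inference.py` before the weights are loaded (a debugging sketch; the helper name is mine, not from the notebook):

```python
import os

def list_model_dir(model_dir="/opt/ml/model"):
    """Return every file path under model_dir, so the CloudWatch
    logs show where best.pt actually ended up after extraction."""
    found = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)

# Called at the top of model_fn, the result can simply be printed:
# print("MODEL_DIR CONTENTS:", list_model_dir(model_dir))
```

Whatever this prints ends up in the endpoint's CloudWatch log stream, which shows whether `best.pt` landed at `/opt/ml/model/code/best.pt`, at `/opt/ml/model/best.pt`, or nowhere at all.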
The file structure while creating the tar file is:
```
model.tar.gz
├─ code/
│  ├── inference.py
│  ├── requirements.txt
│  └── best.pt
```
I have even tried it with `best.pt` outside the `code` folder, following this article: https://aws.amazon.com/blogs/machine-learning/hosting-yolov8-pytorch-model-on-amazon-sagemaker-endpoints/

```
model.tar.gz
├─ code/
│  ├── inference.py
│  └── requirements.txt
└── best.pt
```

but I still faced the same issue.
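To rule out a packing mistake, here is a minimal sketch of how the archive can be built and its member paths verified locally before uploading (stub files stand in for the real ones; SageMaker extracts the archive directly into `/opt/ml/model`, so each member name becomes a path under that directory):

```python
import os
import tarfile
import tempfile

def pack_and_list(layout):
    """Create model.tar.gz from the given relative paths and return
    the sorted member names SageMaker will see under /opt/ml/model."""
    with tempfile.TemporaryDirectory() as tmp:
        # Create stub files matching the desired layout.
        for rel in layout:
            path = os.path.join(tmp, rel)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(b"stub")
        archive = os.path.join(tmp, "model.tar.gz")
        # arcname must be the relative path: an absolute arcname would
        # put the file somewhere other than /opt/ml/model/<rel>.
        with tarfile.open(archive, "w:gz") as tar:
            for rel in layout:
                tar.add(os.path.join(tmp, rel), arcname=rel)
        with tarfile.open(archive, "r:gz") as tar:
            return sorted(tar.getnames())

# First layout: everything under code/, so the weights should appear
# at /opt/ml/model/code/best.pt inside the container.
print(pack_and_list(["code/inference.py", "code/requirements.txt", "code/best.pt"]))
```

If the printed member names do not start with `code/` (for example because the archive was created from inside the `code` directory, or with absolute paths), the extracted files will not be where `inference.py` expects them, which would explain the `FileNotFoundError`.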