tnc-ca-geo / animl-ml

Machine Learning resources for camera trap data processing

Deploying serverless MIRA endpoint with TensorFlowModel class results in permissions error #85

Closed rbavery closed 1 year ago

rbavery commented 2 years ago
UnexpectedStatusException: Error hosting endpoint mira-large-keras-endpoint-serverless-tfmodel2022-07-06-22-49-53: Failed. Reason: Received server error (0) from model with message "An error occurred while handling request as the model process exited.". See https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/sagemaker/Endpoints/mira-large-keras-endpoint-serverless-tfmodel2022-07-06-22-49-53 in account 830244800171 for more information..

https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/sagemaker/Endpoints/mira-large-keras-endpoint-serverless-tfmodel2022-07-06-22-49-53
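For anyone who wants to pull these logs without clicking through the console, here is a minimal sketch. The log group name follows the pattern visible in the URL above; the helper names are hypothetical, and the fetch function assumes boto3 is installed and AWS credentials are configured. It reads only the first page of results.

```python
def endpoint_log_group(endpoint_name: str) -> str:
    """Build the CloudWatch log group name, following the URL pattern above."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def fetch_error_lines(endpoint_name: str, region: str = "us-west-2") -> list:
    """Return log messages containing the term ERROR (first page only).

    Assumes boto3 is installed and credentials are available.
    """
    import boto3  # imported lazily so endpoint_log_group works without it

    logs = boto3.client("logs", region_name=region)
    response = logs.filter_log_events(
        logGroupName=endpoint_log_group(endpoint_name),
        filterPattern="ERROR",
    )
    return [event["message"] for event in response["events"]]
```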

Two errors in the logs jump out:

ERROR: Could not install packages due to an EnvironmentError: [Errno 13]

Errno 13 is a permission-denied error. This could be because SageMaker serverless inference doesn't support pip-installing the requirements (the requirements file sits in the export folder next to inference.py). It probably only supports spinning up a container and executing inference.py. If so, I think this requires building a custom container for the MIRA models rather than using the stock SageMaker containers.
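One way to test that guess is a small diagnostic like the sketch below. It only confirms what errno 13 means and checks whether the interpreter's site-packages directory is writable by the current process. The function names are hypothetical, and whether the serverless container really mounts that directory read-only is an assumption the logs here do not confirm.

```python
import errno
import os
import sysconfig


def describe_errno(code: int) -> str:
    """Map a numeric errno to its symbolic name and the OS message."""
    return f"{errno.errorcode.get(code, 'UNKNOWN')}: {os.strerror(code)}"


def site_packages_writable() -> bool:
    """Report whether the default pip install target is writable."""
    target = sysconfig.get_paths()["purelib"]
    return os.access(target, os.W_OK)


if __name__ == "__main__":
    print(describe_errno(13))
    print("site-packages writable:", site_packages_writable())
```

Logging these two values at the top of inference.py would show whether a pip install into site-packages could succeed inside the container at all.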

location / {return 404 '{"error": "Not Found"}';

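I'm not sure what the second line is about; it looks like a fragment of an nginx location block rather than an error message in itself. The only thing that changed was adding a serverless config when deploying the TensorFlowModel: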

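from sagemaker.serverless import ServerlessInferenceConfig
# Deploy the existing model object; the serverless config is the only change.
predictor = sagemaker_model.deploy(
    endpoint_name=endpoint_name,
    serverless_inference_config=ServerlessInferenceConfig(memory_size_in_mb=2048, max_concurrency=5),
)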
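
To rule out the serverless settings themselves, a small pre-flight check can help. This is a sketch, not part of the SageMaker SDK: serverless_settings is a hypothetical helper, and the allowed memory sizes are the 1 GB increments listed in the SageMaker documentation as I understand it.

```python
# Memory sizes SageMaker serverless inference accepts, in MB (1 GB steps).
VALID_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_settings(memory_size_in_mb: int = 2048, max_concurrency: int = 5) -> dict:
    """Validate serverless settings and return them as keyword arguments."""
    if memory_size_in_mb not in VALID_MEMORY_MB:
        raise ValueError(
            f"memory_size_in_mb must be one of {VALID_MEMORY_MB}, got {memory_size_in_mb}"
        )
    if max_concurrency < 1:
        raise ValueError(f"max_concurrency must be at least 1, got {max_concurrency}")
    return {
        "memory_size_in_mb": memory_size_in_mb,
        "max_concurrency": max_concurrency,
    }
```

The returned dict can be unpacked straight into the config, for example ServerlessInferenceConfig(**serverless_settings(2048, 5)). The values used in the deploy call above pass this check, so they are unlikely to be the cause of the failure.
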
nathanielrindlaub commented 1 year ago

Closing this out as we're moving towards MIRAv2, which is PyTorch-based.