aws / sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker
https://sagemaker.readthedocs.io/
Apache License 2.0

PyTorchModel serverless deploy raises ClientError #3059

Open ManuConcepBrito opened 2 years ago

ManuConcepBrito commented 2 years ago

Describe the bug: PyTorchModel does not deploy in serverless mode; deploy() raises a ClientError.

To reproduce

import sagemaker
from sagemaker.pytorch import PyTorchModel
from sagemaker.serverless import ServerlessInferenceConfig

image_uri = sagemaker.image_uris.retrieve(
    framework="pytorch",
    region="eu-central-1",
    py_version="py38",
    version="1.10",
    instance_type="ml.t2.xlarge",
    image_scope="inference",
)

# not being used at the moment due to AWS bug
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)

# path_to_s3, role, and SERVERLESS_ENDPOINT_NAME are defined elsewhere
pytorch_model = PyTorchModel(
    model_data=path_to_s3,
    role=role,
    image_uri=image_uri,
    source_dir="src",
    entry_point="inference.py",
    framework_version="1.10",
    py_version="py38",
)
predictor = pytorch_model.deploy(
    initial_instance_count=1,
    serverless_inference_config=serverless_config,
    endpoint_name=SERVERLESS_ENDPOINT_NAME,
)

Expected behavior: The model deploys in serverless mode.

Screenshots or logs

ClientError                               Traceback (most recent call last)
<ipython-input-54-593adb15b1aa> in <module>
     38                                  framework_version='1.10',
     39                                  py_version='py38')
---> 40     predictor = pytorch_model.deploy(initial_instance_count=1, serverless_inference_config=serverless_config, endpoint_name=SERVERLESS_ENDPOINT_NAME)
     41 else:
     42     pytorch_model = PyTorchModel(model_data='s3://detector-sagemaker/model.tar.gz',

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, **kwargs)
   1024             wait=wait,
   1025             data_capture_config_dict=data_capture_config_dict,
-> 1026             async_inference_config_dict=async_inference_config_dict,
   1027         )
   1028 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/session.py in endpoint_from_production_variants(self, name, production_variants, tags, kms_key, wait, data_capture_config_dict, async_inference_config_dict)
   3542         self.sagemaker_client.create_endpoint_config(**config_options)
   3543 
-> 3544         return self.create_endpoint(endpoint_name=name, config_name=name, tags=tags, wait=wait)
   3545 
   3546     def expand_role(self, role):

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/session.py in create_endpoint(self, endpoint_name, config_name, tags, wait)
   3038 
   3039         self.sagemaker_client.create_endpoint(
-> 3040             EndpointName=endpoint_name, EndpointConfigName=config_name, Tags=tags
   3041         )
   3042         if wait:

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    399                     "%s() only accepts keyword arguments." % py_operation_name)
    400             # The "self" in this scope is referring to the BaseClient.
--> 401             return self._make_api_call(operation_name, kwargs)
    402 
    403         _api_call.__name__ = str(py_operation_name)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    729             error_code = parsed_response.get("Error", {}).get("Code")
    730             error_class = self.exceptions.from_code(error_code)
--> 731             raise error_class(parsed_response, operation_name)
    732         else:
    733             return parsed_response

ClientError: An error occurred (ValidationException) when calling the CreateEndpoint operation: One or more endpoint features are not supported using this configuration

System information: not provided.

Additional context: The error also occurs when specifying a CPU instance type (I have read that GPU is not supported for serverless inference, though I am not sure whether that is still the case). The logs are also quite cryptic, so I am not sure how to pinpoint the issue.
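
One way to narrow this down: per the traceback above, create_endpoint_config succeeds before create_endpoint fails, and the SDK reuses the endpoint name as the config name (config_name=name in the traceback), so the generated config can be inspected with boto3. A minimal sketch under those assumptions:

import boto3

sm = boto3.client("sagemaker", region_name="eu-central-1")

# The config should exist even though CreateEndpoint failed; the SDK named
# it after the endpoint (see create_endpoint(..., config_name=name) above).
cfg = sm.describe_endpoint_config(EndpointConfigName=SERVERLESS_ENDPOINT_NAME)
for variant in cfg["ProductionVariants"]:
    # A serverless variant should carry a ServerlessConfig; if it instead
    # shows an InstanceType, the serverless config never made it through.
    print(variant.get("ServerlessConfig"), variant.get("InstanceType"))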

schematical commented 1 year ago

If anyone hits this in the future: when creating the endpoint config manually in the console, I could trigger this more descriptive error: "Models requiring any of the following features are not supported on serverless endpoints: AWS marketplace model packages, private Docker registries, Multi-Model Endpoints, VPC configuration, network isolation."

But you don't get that message when you use Terraform (and, I am assuming, the SDK/CLI). In my case I have a VPC config. So double-check that your model doesn't use any of the features mentioned in that error message.
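
To check for those features programmatically, a minimal sketch; "my-model" is a placeholder for the model name referenced by your endpoint config's ProductionVariants:

import boto3

sm = boto3.client("sagemaker")

# "my-model" is hypothetical; substitute the ModelName from your endpoint
# config's ProductionVariants.
desc = sm.describe_model(ModelName="my-model")

# Either of these settings makes the model incompatible with serverless
# endpoints, per the console error message above.
print("VpcConfig:", desc.get("VpcConfig"))
print("EnableNetworkIsolation:", desc.get("EnableNetworkIsolation"))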