ClientError: An error occurred when calling the CreateEndpointConfig operation: MultiModel mode is not supported for instance type ml.g4dn.xlarge.

dshahrokhian commented 4 years ago

Describe the bug

Apparently, MultiModel mode is not supported on any of the GPU instance types. This is not mentioned anywhere in the documentation.

To reproduce

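import boto3

# endpoint_config_name and model_name are assumed to be defined earlier.
sm_client = boto3.client('sagemaker')

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[{
        'InstanceType': 'ml.g4dn.2xlarge',  # GPU instance type; triggers the ClientError
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName': model_name,
        'VariantName': 'AllTraffic'}])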

System information

SageMaker Python SDK version: latest
Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
Framework version: 1.0
Python version: 3.6
CPU or GPU: GPU
Custom Docker image (Y/N): Y

knakad commented 4 years ago

Thanks for reaching out! This feature is only supported on CPU instance types.

We'll fix the documentation to clarify this. Thanks for bringing it to our attention!

Internal Reference: SIMT-P33776646
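
Given the reply above, one way to fail fast on the client side is to check the instance family before calling CreateEndpointConfig for a multi-model endpoint. The following is a minimal sketch, not part of boto3 or the SageMaker Python SDK: the helper names and the `GPU_FAMILIES` set are hypothetical, and the set only covers the families named in this thread (g4dn, p3) plus p2 as an assumption. It reflects the restriction as described in this thread and should be checked against the current AWS documentation.

```python
# Hypothetical client-side guard; not part of boto3 or the SageMaker SDK.
# Assumption: these families are GPU-backed. The set is illustrative, not exhaustive.
GPU_FAMILIES = {"g4dn", "p2", "p3"}


def instance_family(instance_type: str) -> str:
    """Return the family of a SageMaker instance type, e.g. 'ml.g4dn.xlarge' -> 'g4dn'."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError(f"unexpected instance type format: {instance_type!r}")
    return parts[1]


def check_multi_model_instance(instance_type: str) -> None:
    """Raise early if the instance type is in a family assumed to be GPU-backed."""
    family = instance_family(instance_type)
    if family in GPU_FAMILIES:
        raise ValueError(
            f"{instance_type} is a GPU instance type; per this thread, "
            "MultiModel mode is only supported on CPU instance types"
        )
```

Calling `check_multi_model_instance('ml.g4dn.2xlarge')` before `create_endpoint_config` would raise a local `ValueError` instead of waiting for the service to return the ClientError.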

ParthBarot-BoTreeConsulting commented 4 years ago

@knakad We are trying to create an endpoint on ml.p3.2xlarge but are facing the same issue.

Describe the bug

How can we increase prediction speed with SageMaker + MMS, without a GPU?

Basically, we have a model served with MMS on an ml.c5.2xlarge instance, but a prediction with Detectron2 takes around 30-35s. We want to reduce this time, so we are trying GPU instances, but SageMaker does not allow using a GPU with MMS. When tested on Google Colab, the same prediction takes only 3-4s.

What options could we consider? Can you please shed some light on this?

Move from MMS to a single-model endpoint. This would allow us to use a GPU.
Move to EC2 with Elastic Inference. I think this should be the last resort.
Is there any other way we could use CPU with MMS on SageMaker and improve the speed? We even tried c5.9xlarge but didn't see much improvement.
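
The first option above (a single-model endpoint on a GPU instance) could look roughly like the sketch below. It relies on the `Mode` field of the container definition passed to boto3's `create_model`, which accepts `'SingleModel'` or `'MultiModel'`. The helper names, the image URI, the S3 path, and the role ARN are all hypothetical placeholders, and this is only one possible way to structure the calls, not a recommended or official setup.

```python
def container_definition(image_uri, model_data_url, multi_model):
    """Build the PrimaryContainer dict for create_model.

    In MultiModel mode, model_data_url is an S3 prefix holding several archives;
    in SingleModel mode, it points at a single model archive.
    """
    return {
        "Image": image_uri,
        "ModelDataUrl": model_data_url,
        "Mode": "MultiModel" if multi_model else "SingleModel",
    }


def production_variant(model_name, instance_type):
    """Build one ProductionVariants entry, mirroring the reproduction snippet."""
    return {
        "InstanceType": instance_type,
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1,
        "ModelName": model_name,
        "VariantName": "AllTraffic",
    }


def create_single_model_gpu_config(model_name, config_name, image_uri,
                                   model_data_url, role_arn,
                                   instance_type="ml.p3.2xlarge"):
    """Create a single-model SageMaker model and an endpoint config on a GPU instance.

    All arguments are placeholders supplied by the caller; requires AWS credentials.
    """
    import boto3  # imported here so the helpers above can be used without boto3

    sm_client = boto3.client("sagemaker")
    sm_client.create_model(
        ModelName=model_name,
        ExecutionRoleArn=role_arn,
        PrimaryContainer=container_definition(image_uri, model_data_url,
                                              multi_model=False),
    )
    return sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[production_variant(model_name, instance_type)],
    )
```

The trade-off is that each model then needs its own endpoint, so the cost savings of sharing one instance across many models are lost in exchange for GPU inference.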

System information

SageMaker Python SDK version: latest
Framework name (eg. PyTorch) or algorithm (eg. Faster RCNN): Detectron2
Framework version: detectron2==0.2.1+cu101
Python version: 3.6
CPU or GPU: GPU
Custom Docker image (Y/N): Y

Thanks

aws / sagemaker-python-sdk

ClientError: An error occurred when calling the CreateEndpointConfig operation: MultiModel mode is not supported for instance type ml.g4dn.xlarge. #1323