aws / sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker
https://sagemaker.readthedocs.io/
Apache License 2.0

ClientError: An error occurred when calling the CreateEndpointConfig operation: MultiModel mode is not supported for instance type ml.g4dn.xlarge. #1323

Open dshahrokhian opened 4 years ago

dshahrokhian commented 4 years ago

Describe the bug

Apparently, MultiModel mode is not supported on any of the GPU instance types. This is not mentioned anywhere in the documentation.

To reproduce

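import boto3

# endpoint_config_name and model_name are assumed to be defined earlier;
# model_name refers to a model created in MultiModel mode.
sm_client = boto3.client('sagemaker')

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[{
        'InstanceType': 'ml.g4dn.2xlarge',  # GPU instance type
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName': model_name,
        'VariantName': 'AllTraffic'}])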


knakad commented 4 years ago

Thanks for reaching out! This feature is only supported on CPU instance types.

We'll fix the documentation to clarify this. Thanks for bringing it to our attention!
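Until the documentation is updated, a client-side check can surface the restriction before the API call is made. The following is a minimal sketch, not part of the SDK: the helper names and the list of GPU family prefixes are assumptions made for illustration, and the list is not exhaustive.

```python
# Assumption: GPU-backed instance families are recognised by prefix.
# This tuple is illustrative and not an exhaustive or official list.
GPU_FAMILY_PREFIXES = ("ml.p2.", "ml.p3.", "ml.g4dn.")


def is_gpu_instance_type(instance_type: str) -> bool:
    """Return True if the instance type belongs to one of the listed GPU families."""
    return instance_type.startswith(GPU_FAMILY_PREFIXES)


def check_multi_model_instance_type(instance_type: str) -> None:
    """Raise ValueError locally instead of waiting for the service-side ClientError."""
    if is_gpu_instance_type(instance_type):
        raise ValueError(
            f"MultiModel mode is not supported for instance type {instance_type}; "
            "use a CPU instance type such as ml.c5.2xlarge."
        )
```

Calling check_multi_model_instance_type(instance_type) just before create_endpoint_config would fail fast for the instance types reported in this issue and pass silently for CPU types.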

Internal Reference: SIMT-P33776646

ParthBarot-BoTreeConsulting commented 4 years ago

@knakad We are trying to create an endpoint on ml.p3.2xlarge but are facing the same issue.

Describe the bug

How can we speed up prediction with SageMaker + MMS without a GPU?

Basically, we have a model served with MMS on an ml.c5.2xlarge instance, but a prediction with Detectron2 takes around 30-35 s. We want to reduce this time, so we are trying GPU instances, but SageMaker does not allow using a GPU with MMS. When tested on Google Colab, the same prediction takes only 3-4 s.

What options should we consider? Could you please shed some light on this?

  1. Move from MMS to a single-model endpoint. This would allow us to use a GPU.
  2. Move to EC2 with Elastic Inference. I think this should be the last resort.
  3. Is there any other way to use CPU with MMS on SageMaker and improve the speed? We even tried c5.9xlarge but didn't see much improvement.
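Option 1 can be sketched as follows. This is a rough illustration under stated assumptions, not a verified deployment recipe: the helper names are hypothetical and the image URI and S3 path in the usage note are placeholders. The dictionaries follow the request shapes of boto3's create_model (container definition with Mode set to 'SingleModel') and create_endpoint_config (the same ProductionVariants keys used earlier in this issue).

```python
def single_model_container(image_uri: str, model_data_url: str) -> dict:
    """Container definition for one model artifact (not MultiModel mode)."""
    return {
        "Image": image_uri,
        "ModelDataUrl": model_data_url,  # a single model.tar.gz, not an S3 prefix
        "Mode": "SingleModel",
    }


def gpu_production_variant(model_name: str, instance_type: str = "ml.p3.2xlarge") -> dict:
    """Production variant for a single-model endpoint on a GPU instance."""
    return {
        "InstanceType": instance_type,
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1,
        "ModelName": model_name,
        "VariantName": "AllTraffic",
    }
```

The first dictionary would be passed as PrimaryContainer to sm_client.create_model, and the second inside ProductionVariants to sm_client.create_endpoint_config, for example with a placeholder image URI and a path such as s3://my-bucket/detectron2/model.tar.gz. The trade-off is one endpoint per model instead of many models sharing one endpoint.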

System information

SageMaker Python SDK version: latest
Framework name (e.g. PyTorch) or algorithm (e.g. Faster RCNN): Detectron2
Framework version: detectron2==0.2.1+cu101
Python version: 3.6
CPU or GPU: GPU
Custom Docker image (Y/N): Y

Thanks