awslabs / multi-model-server

Multi Model Server is a tool for serving neural net models for inference
Apache License 2.0

[Q] GPU support #938

Open oonisim opened 4 years ago

oonisim commented 4 years ago

The AWS documentation (https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html) states: "Multi-model endpoints are not supported on GPU instance types."

Kindly explain whether this is technically impossible or simply not yet implemented.

vinayak-shanawad commented 2 years ago

Hi @oonisim

Do you know how we can get inferences from a multi-model endpoint for models that require GPU memory?

Thanks

oonisim commented 2 years ago

Hi @Vinayaks117, as per the AWS documentation (https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html), "Multi-model endpoints are not supported on GPU instance types." I am not sure whether you can run Multi Model Server on GPU instances (see the AWS GitHub repository for the Multi Model Server implementation; I believe it is framework-dependent, e.g. PyTorch or TensorFlow). Please open a case with AWS Support for a definitive answer; I am afraid that is the only way to know.
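For context on what a multi-model endpoint call looks like: the model to serve is selected per request via the `TargetModel` parameter of the SageMaker runtime's `invoke_endpoint` API. Below is a minimal sketch that only builds the request arguments; the endpoint name and model path are placeholders (not from this thread), and the actual boto3 call is shown in comments since it requires a live endpoint and AWS credentials.

```python
import json

def build_invoke_request(endpoint_name, target_model, payload):
    """Build keyword arguments for sagemaker-runtime's invoke_endpoint.

    TargetModel is the relative path of the model artifact under the
    endpoint's S3 model prefix; it selects which model handles this request.
    """
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "TargetModel": target_model,
        "Body": json.dumps(payload),
    }

request = build_invoke_request(
    endpoint_name="my-multi-model-endpoint",  # placeholder name
    target_model="model-a.tar.gz",            # placeholder artifact path
    payload={"inputs": [1.0, 2.0, 3.0]},
)

# With boto3 (not executed here), the call would be:
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   response = runtime.invoke_endpoint(**request)
#   result = json.loads(response["Body"].read())

print(request["TargetModel"])
```

On a CPU-backed multi-model endpoint this per-request routing is what lets many models share one instance; the quoted documentation is saying that this sharing mechanism was not available on GPU instance types at the time.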

vinayak-shanawad commented 2 years ago

Sure, thanks @oonisim