aws / sagemaker-pytorch-inference-toolkit

Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

How to use GPU in SageMaker instance #126

Open haiderasad opened 2 years ago

haiderasad commented 2 years ago

Hi, I am using a custom Docker image with CUDA and cuDNN installed, and I have verified locally that the GPU is utilized. But when I upload it to ECR and create an endpoint, endpoint creation fails with a message asking me to make sure the docker serve command is valid. From debugging I found that the inference toolkit needs to be inside the image so it can detect whether the SageMaker GPU is available, but there is no sample Dockerfile I can learn from. Kindly tell me: 1) how to enable CUDA support in custom-built Docker images for SageMaker, and 2) whether prebuilt images, e.g. accountnum.aws.amazon.com/pytorch:1.10-cuda113-py3, will directly use the CUDA/GPU of the SageMaker instance?
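For question 2, a minimal sketch assuming standard SageMaker Python SDK usage (the bucket path, role ARN, and script name are placeholders, not values from this issue): the prebuilt PyTorch inference images already bundle this toolkit and CUDA, so deploying one on a GPU instance type exposes the GPU to your handler code.

```python
# Sketch: deploy with the prebuilt SageMaker PyTorch GPU image via the
# SageMaker Python SDK. All names below are placeholders.
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data="s3://my-bucket/model.tar.gz",  # placeholder model artifact
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
    entry_point="inference.py",       # your model_fn/predict_fn handlers
    framework_version="1.10",         # selects the matching DLC image
    py_version="py38",
)

# A GPU instance type makes the SDK pick the CUDA ("gpu") variant of the
# image, so torch.cuda.is_available() is True inside the container.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
)
```

For a custom image (question 1), the container has to respond to `docker run <image> serve`; the prebuilt images do this by installing the `sagemaker-pytorch-inference` package and wiring up its serving entrypoint. Inside your `inference.py` handler you can then pick the device in `model_fn` with `torch.device("cuda" if torch.cuda.is_available() else "cpu")`.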

holopekochan commented 1 year ago

For an inference endpoint, you can probably use a GPU instance in SingleModel mode. When I tried MultiModel mode with 763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker, it failed with:

ClientError: An error occurred (ValidationException) when calling the CreateEndpointConfig operation: MultiModel mode is not supported for instance type ml.g4dn.xlarge.

According to https://github.com/aws/sagemaker-python-sdk/issues/1323, GPU instances are not supported for multi-model endpoints.
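For context, a boto3 sketch of the combination that produces that error, assuming the usual multi-model setup (the names, region, and ARN are placeholders); at the time of this thread, CreateEndpointConfig rejected GPU instance types for models registered with Mode="MultiModel":

```python
# Sketch reconstructed from the error message above; all names are placeholders.
import boto3

sm = boto3.client("sagemaker")

sm.create_model(
    ModelName="mme-demo",
    ExecutionRoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    PrimaryContainer={
        "Image": "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
                 "pytorch-inference:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker",
        "Mode": "MultiModel",                      # multi-model endpoint
        "ModelDataUrl": "s3://my-bucket/models/",  # prefix of model.tar.gz files
    },
)

sm.create_endpoint_config(
    EndpointConfigName="mme-demo-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "mme-demo",
        "InstanceType": "ml.g4dn.xlarge",  # GPU type -> ValidationException here
        "InitialInstanceCount": 1,
    }],
)
```

Switching `InstanceType` to a CPU type (e.g. ml.m5.xlarge), or the container `Mode` to "SingleModel", avoids the ValidationException described above.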