aws / sagemaker-pytorch-inference-toolkit

Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

Multi gpu support #127

Open waytrue17 opened 2 years ago

waytrue17 commented 2 years ago

Description of changes: Enable multi-GPU support. The change passes context information to the handler functions so that the model and data can be assigned to multiple GPU devices.
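For illustration, here is a minimal sketch of a user model_fn that takes the new context argument and pins the model to the GPU assigned to its worker. The system_properties / gpu_id accessors follow MMS/TorchServe conventions, and the model.pt artifact name is made up for the example; treat both as assumptions rather than part of this PR.

import os
import torch

def model_fn(model_dir, context=None):
    # Use the GPU that the model server assigned to this worker, if any.
    if context is not None and torch.cuda.is_available():
        gpu_id = context.system_properties.get("gpu_id")
        device = torch.device("cuda:{}".format(gpu_id))
    else:
        device = torch.device("cpu")
    # "model.pt" is a placeholder artifact name for this sketch.
    model = torch.jit.load(os.path.join(model_dir, "model.pt"), map_location=device)
    return model.to(device).eval()

As the process listing below shows, the server workers then end up spread across the available GPUs.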

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    396466      C   /opt/conda/bin/python3.8         1729MiB |
|    1   N/A  N/A    396465      C   /opt/conda/bin/python3.8         1729MiB |
|    1   N/A  N/A    396467      C   /opt/conda/bin/python3.8         1729MiB |
|    2   N/A  N/A    396470      C   /opt/conda/bin/python3.8         1729MiB |
|    3   N/A  N/A    396462      C   /opt/conda/bin/python3.8         1729MiB |
|    4   N/A  N/A    396469      C   /opt/conda/bin/python3.8         1729MiB |
|    5   N/A  N/A    396463      C   /opt/conda/bin/python3.8         1729MiB |
|    6   N/A  N/A    396468      C   /opt/conda/bin/python3.8         1729MiB |
|    7   N/A  N/A    396464      C   /opt/conda/bin/python3.8         1729MiB |
+-----------------------------------------------------------------------------+

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

sagemaker-bot commented 2 years ago

AWS CodeBuild CI Report

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

ashishgupta023 commented 2 years ago

Have you tested the following inference use cases with the DLC container?

1) The customer provides an inference script with the context (new way).
2) The customer provides an inference script without the context (old way).

Could you please attach the test details to the description?
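For reference, hypothetical minimal versions of the two script styles in question; the two predict_fn definitions are alternatives rather than something that would live in one file, and everything apart from the standard handler names is illustrative.

import torch

# 1) New way: the handler accepts the context so it can target the GPU
#    assigned to this worker (the gpu_id lookup follows MMS/TorchServe style).
def predict_fn(data, model, context):
    props = context.system_properties
    device = torch.device("cuda:{}".format(props.get("gpu_id"))
                          if torch.cuda.is_available() else "cpu")
    with torch.no_grad():
        return model(data.to(device))

# 2) Old way: the handler keeps the existing two-argument signature and never
#    sees the context; the toolkit should keep calling it unchanged.
def predict_fn(data, model):
    with torch.no_grad():
        return model(data)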

ashishgupta023 commented 2 years ago

I think this change will also be required for the MXNet DLC containers with MMS. Instead of adding a new transformer and adapting the handler service in the PyTorch toolkit, could we consider adapting the transformer and handler service in the inference toolkit to work with the context, so that the change applies to both? It would also make things less error-prone in the future, since the change would live in a single place.
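One way the shared path could stay backward compatible is for the common transformer to inspect the user handler's signature and pass the context only when the handler declares it. This is just a sketch of that idea with assumed names, not the inference toolkit's actual code.

import inspect

def run_handler(handler_fn, *args, context=None):
    # Pass the context only to handlers that declare it, so legacy scripts
    # with the old signatures keep working unchanged.
    if "context" in inspect.signature(handler_fn).parameters:
        return handler_fn(*args, context=context)
    return handler_fn(*args)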

waytrue17 commented 2 years ago

I think this change will also be required for the MXNet DLC containers with MMS. Instead of adding a new transformer and adapting the handler service in the PyTorch toolkit, could we consider adapting the transformer and handler service in the inference toolkit to work with the context, so that the change applies to both? It would also make things less error-prone in the future, since the change would live in a single place.

Makes sense. I will split the code and re-run some tests, then post the test results afterward.
