Closed waytrue17 closed 1 year ago
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
LGTM
Issue #, if available:
MMS and TorchServe store request information in a context object and pass it to the handler service function. This PR passes the context on to the handler functions so that users can use that information in their custom handler functions.
One particular use case is multi-GPU inference for PyTorch and MXNet. Currently, handler functions statically select the same GPU device and assign all data and models to that single device, even on a multi-GPU host. MMS and TorchServe dynamically select a GPU device and store that information in `context`. By passing `context` to the handler functions, users can assign data and models to different GPUs and use all GPU resources. For example, an `input_fn` can process the request data and assign it to the device recorded in `context`.
After this PR, handler functions both with and without `context` are supported, so existing use cases will not break.
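As an illustration of the multi-GPU use case described above, here is a minimal sketch of an `input_fn` that accepts an optional `context`. It assumes the context exposes the worker's GPU through `system_properties["gpu_id"]`, as the TorchServe context does; the helper `device_from_context` and the stand-in `fake_context` are hypothetical names used only for this example, and the returned dictionary stands in for a tensor placed on the device.

```python
import json
from types import SimpleNamespace


def device_from_context(context=None, default="cpu"):
    """Return a device string such as "cuda:1" for the current worker.

    Assumption: the context exposes system_properties["gpu_id"].
    Falls back to `default` when there is no context or no GPU id.
    """
    if context is None:
        return default
    gpu_id = context.system_properties.get("gpu_id")
    return default if gpu_id is None else f"cuda:{gpu_id}"


def input_fn(input_data, content_type, context=None):
    """Deserialize a JSON request and record the device chosen for this worker."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    data = json.loads(input_data)
    device = device_from_context(context)
    # In a real PyTorch handler this would be: torch.tensor(data).to(device)
    return {"data": data, "device": device}


# Stand-in for the context object that the model server would supply.
fake_context = SimpleNamespace(system_properties={"gpu_id": 1})
result = input_fn("[1.0, 2.0]", "application/json", fake_context)
print(result["device"])
```

Because `context` defaults to `None`, the same function can still be called with only the two original arguments.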
Description of changes:
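The description of changes is not preserved here. As a rough sketch of how handlers with and without `context` could both be supported, and not necessarily how this PR implements it, a dispatcher can inspect the handler's signature and pass `context` only when the handler declares it. The names `call_handler`, `legacy_input_fn`, and `context_input_fn` are hypothetical.

```python
import inspect


def call_handler(func, *args, context=None):
    """Call `func`, passing `context` only if it declares a `context` parameter."""
    if "context" in inspect.signature(func).parameters:
        return func(*args, context=context)
    return func(*args)


def legacy_input_fn(input_data, content_type):
    # Existing handler written before context was available.
    return input_data


def context_input_fn(input_data, content_type, context):
    # Newer handler that also receives the serving context.
    return input_data, context


legacy_result = call_handler(legacy_input_fn, "payload", "text/plain", context={"gpu_id": 0})
context_result = call_handler(context_input_fn, "payload", "text/plain", context={"gpu_id": 0})
```

Matching on the parameter name keeps existing two-argument handlers working unchanged, at the cost of requiring the new parameter to be called `context`.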
Testing done:
With context passed, workers were assigned to different GPUs as expected.
Without context, all workers loaded the model onto GPU 0, nearly exhausting its dedicated memory (14936MiB / 16384MiB).
Merge Checklist
Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.
General
Tests
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.