aws / sagemaker-pytorch-inference-toolkit

Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

add vmargs=-XX:-UseContainerSupport in config #136

Closed lxning closed 1 year ago

lxning commented 1 year ago

Issue #, if available: #99

Description of changes: Apply the same fix in the PyTorch inference toolkit.
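For reference, the change in the PR title amounts to adding a `vmargs` line to the model server configuration. A minimal sketch of such a fragment, assuming TorchServe-style `config.properties` keys, might look like:

```properties
# Sketch only: disable JVM container support so the frontend JVM
# is not constrained by cgroup detection (see discussion below for
# why this can be risky inside memory-limited containers).
vmargs=-XX:-UseContainerSupport
```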

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

davidthomas426 commented 1 year ago

I'll just note that this issue, along with workarounds and fixes, has shown up across different inference toolkits.

Here is a list of links showing that it's been a recurring problem.

Also, this fix may create other problems, as turning off container support means that the JVM does not respect Docker container memory limits.
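To make the risk concrete, here is a small illustrative sketch (hypothetical numbers, not from the source) of how the JVM's default max heap sizing interacts with container support. By default the JVM sizes its max heap to roughly 25% of visible memory; with `-XX:-UseContainerSupport` it sees the host's RAM rather than the container's cgroup limit:

```python
# Sketch: why disabling UseContainerSupport can cause OOM-kills.
# With container support ON, the JVM derives its default max heap from
# the container's memory limit; with it OFF, from the host's total RAM.

def default_max_heap_gb(visible_memory_gb: float, fraction: float = 0.25) -> float:
    """Approximate JVM default: MaxRAMPercentage=25, i.e. ~1/4 of visible memory."""
    return visible_memory_gb * fraction

# Hypothetical container limited to 8 GB running on a 256 GB host:
heap_with_support = default_max_heap_gb(8)       # sized from the 8 GB limit
heap_without_support = default_max_heap_gb(256)  # sized from host RAM, far over the limit
```

With the flag disabled, the JVM may happily try to grow well past the container's limit, at which point the kernel OOM-killer terminates the process.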

We should make sure to address this uniformly across the inference toolkits and deep-learning-containers, while allowing users to customize easily without onerous workarounds such as deriving from the deep-learning-container images or even forking the toolkit.

Links:

I still don't think this is an exhaustive list.

chen3933 commented 1 year ago

The PR will be updated to allow customization of vmargs. Example: https://github.com/aws/sagemaker-inference-toolkit/pull/118/commits/c01dde749687f86e7891ec403eed3f98d4fcfb50
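One way such customization could look, sketched in Python: read the vmargs from an environment variable with the current flag as the default, and emit the corresponding `config.properties` line. The environment variable name here is illustrative, not confirmed from the linked commit:

```python
import os

# Hypothetical sketch of making vmargs user-configurable instead of
# hard-coding -XX:-UseContainerSupport. The env var name is an
# assumption for illustration only.
DEFAULT_VMARGS = "-XX:-UseContainerSupport"

def build_config_lines(env=None):
    """Return config.properties lines, honoring a user override if set."""
    if env is None:
        env = os.environ
    vmargs = env.get("SAGEMAKER_MODEL_SERVER_VMARGS", DEFAULT_VMARGS)
    return ["vmargs={}".format(vmargs)]
```

This keeps the fix as the default behavior while letting users override it per endpoint without deriving a custom container image.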

rohithkrn commented 1 year ago

Python 3.7 tests are failing at the coverage report step (failing to invoke the coverage command); the same step works fine for Python 3.6:

ERROR: InvocationError for command /codebuild/output/src309395522/src/github.com/aws/sagemaker-pytorch-inference-toolkit/.tox/py37/bin/coverage report --fail-under=90 --include '*sagemaker_pytorch_serving_container*' (exited with code 1)