@amaharek We had the same issue and fixed it with the following:
TOOLKIT_PATH=$(python -c "import sagemaker_inference; print(sagemaker_inference.__path__[0])")
Add the following to the Dockerfile and build a new image based on the above:
RUN echo "vmargs=-XX:-UseContainerSupport" >> $TOOLKIT_PATH/etc/default-mms.properties
RUN echo "vmargs=-XX:-UseContainerSupport" >> $TOOLKIT_PATH/etc/mme-mms.properties
References on UseContainerSupport:
https://www.eclipse.org/openj9/docs/xxusecontainersupport/
https://blog.softwaremill.com/docker-support-in-new-java-8-finally-fd595df0ca54
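If the image is rebuilt several times on top of itself, the plain `echo ... >>` approach above would append the same line repeatedly. A minimal sketch of an idempotent variant follows. The function name `append_vmargs` is hypothetical (not part of the toolkit), and the usage lines assume `TOOLKIT_PATH` has been resolved as shown earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of sagemaker_inference): append the JVM
# option line to a properties file only if that exact line is not there yet.
append_vmargs() {
    props_file="$1"
    line="vmargs=-XX:-UseContainerSupport"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -e "$line" "$props_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$props_file"
    fi
}

# Example usage (assumes TOOLKIT_PATH was set as in the comment above):
# append_vmargs "$TOOLKIT_PATH/etc/default-mms.properties"
# append_vmargs "$TOOLKIT_PATH/etc/mme-mms.properties"
```

Like the original `echo`, this assumes the properties file already ends with a newline; otherwise the appended option would be glued onto the last existing line.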
Describe the bug
This issue is related to https://github.com/aws/sagemaker-python-sdk/issues/1275
The JVM detects the CPU count as 1 even when more CPUs are available to the container.
To reproduce
Number of CPUs: 1
Expected behavior
The CPU count reported in CloudWatch should match the CPU count of the instance in use. For example, 4 if the instance is ml.m4.xlarge.
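One way to check this expectation is to pull the value out of the `Number of CPUs: N` log line and compare it with the expected count. The sketch below is only an illustration: the helper names `logged_cpu_count` and `cpu_count_matches` are hypothetical, and it assumes the log text has already been saved or piped from CloudWatch by some other means.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print the number from the
# first "Number of CPUs: N" line found.
logged_cpu_count() {
    sed -n 's/.*Number of CPUs: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: read log text on stdin and succeed only if the logged
# CPU count equals the expected count given as the first argument.
cpu_count_matches() {
    expected="$1"
    actual=$(logged_cpu_count)
    [ -n "$actual" ] && [ "$actual" -eq "$expected" ]
}

# Example usage (mms.log is an assumed file name for the saved log):
# cpu_count_matches 4 < mms.log && echo "CPU count OK" || echo "CPU count mismatch"
```

For an ml.m4.xlarge instance the expected argument would be 4, so the log line shown under "To reproduce" would make this check fail.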
System information
Container: pytorch-inference:1.5-gpu-py3
SageMaker inference v1.1.2